I think this could be a good new kind of donation for the sandbox. Perhaps we could build up a collection of these; with them present and available for easy download, the ones of more general use and interest to the community could gradually evolve.
So I'm +1 for this kind of donation, especially since this particular one has been actively used by several groups already.

-Marshall

Burn Lewis wrote:
> We would like to add a UIMA type system and sample annotators to the Apache
> Incubator project as an example of a rich multimodal application. Our hope
> is that others will find the techniques and types useful, and will find it a
> good starting point for developing other multimodal applications.
>
> GTS is a type system designed for multi-modal applications that combine
> analytics from multiple sources and modalities, such as speech recognition,
> language translation, entity detection, etc. It is currently used by 10
> cooperating groups participating in the DARPA GALE project
> (http://www.darpa.mil/ipto/programs/gale/gale.asp) to transcribe, translate,
> and extract information from foreign-language news broadcasts. This
> application requires that all the data be cross-referenced so that, for
> example, any English sentence can be traced back to the precise region of
> foreign-language audio that generated it.
>
> The CAS organization and type system have been designed to let each
> analytic work easily on data of the appropriate modality. Speech
> recognition engines annotate an audio view with words aligned to a time
> axis; machine translation annotates a text view of foreign sentences with
> their English translations; entity detection annotates a text view of the
> English sentences. Multiple analytics of each type may be employed to
> improve the overall accuracy.
>
> The sample code includes data-reorganization components that are inserted
> between the different analytics to perform the necessary bookkeeping of
> creating views and cross-reference links from one view back to an earlier
> one. For example, after all speech recognition analytics have run, a reorg
> module creates a source-language text view for each STT engine, along with
> cross-reference annotations from each word in the new view back to the
> appropriate time span in the audio view. One reorg component is a CAS
> Multiplier that resegments the initial fixed-length audio segments at likely
> story boundaries, so that later components can treat each CAS as a complete
> story. The STT and MT analytics are simulated analytics that read their
> results from a file, so that a complete pipeline of components can be
> tested.
>
> We welcome any comments, suggestions, or questions!
>
> - Burn.
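For anyone trying to picture the reorg step Burn describes, here is a minimal sketch of the bookkeeping: create a text view from an STT engine's word hypotheses and add cross-reference links from each word's character span back to its time span in the audio view. This is plain Python rather than the actual UIMA Java CAS API, and all names here (SimpleCas, WordHypothesis, reorg_stt_output) are hypothetical illustrations, not the GTS type system itself.

```python
from dataclasses import dataclass, field

@dataclass
class WordHypothesis:
    """A recognized word aligned to a span of the audio time axis."""
    text: str
    begin_time: float  # seconds into the audio
    end_time: float

@dataclass
class CrossRef:
    """Links a character span in a text view back to a time span
    in an earlier audio view."""
    text_view: str
    begin_char: int
    end_char: int
    source_view: str
    begin_time: float
    end_time: float

@dataclass
class SimpleCas:
    """A CAS-like container holding multiple named views."""
    views: dict = field(default_factory=dict)        # view name -> document text
    annotations: dict = field(default_factory=dict)  # view name -> annotation list
    cross_refs: list = field(default_factory=list)

def reorg_stt_output(cas: SimpleCas, engine: str) -> None:
    """After an STT engine has annotated the audio view, create a
    source-language text view for that engine and add cross-reference
    annotations from each word back to its audio time span."""
    words = cas.annotations[f"audio/{engine}"]
    text_view = f"text/{engine}"
    pieces, refs, offset = [], [], 0
    for w in words:
        pieces.append(w.text)
        refs.append(CrossRef(text_view, offset, offset + len(w.text),
                             f"audio/{engine}", w.begin_time, w.end_time))
        offset += len(w.text) + 1  # +1 for the joining space
    cas.views[text_view] = " ".join(pieces)
    cas.annotations[text_view] = []
    cas.cross_refs.extend(refs)

# Usage: one simulated STT result, then trace a word back to its audio span.
cas = SimpleCas()
cas.annotations["audio/stt-A"] = [
    WordHypothesis("hola", 0.0, 0.4),
    WordHypothesis("mundo", 0.5, 1.1),
]
reorg_stt_output(cas, "stt-A")
print(cas.views["text/stt-A"])       # hola mundo
print(cas.cross_refs[1].begin_time)  # 0.5
```

In real UIMA the views would be created with the CAS view API and the cross-references would be feature structures in the type system, but the bookkeeping pattern is the same: every downstream annotation can be walked back, view by view, to its origin.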
