I think this could be a good new kind of donation for the sandbox.
Perhaps we could build a collection of these; with them readily
available for download, the ones of more general use and interest
to the community could gradually evolve.

So I'm +1 for this kind of donation, especially since this particular
one has been actively used by several groups already.

-Marshall

Burn Lewis wrote:
> We would like to add a UIMA type system and sample annotators to the Apache
> Incubator project as an example of a rich multimodal application. Our hope
> is that others will find the techniques and types useful, and will find it a
> good starting point for developing other multimodal applications.
>
> GTS is a type system designed for multimodal applications that combine
> analytics from multiple sources and modalities, such as speech recognition,
> language translation, and entity detection.  It is currently used by 10
> cooperating groups participating in the DARPA GALE project (
> http://www.darpa.mil/ipto/programs/gale/gale.asp) to transcribe, translate,
> and extract information from foreign language news broadcasts.  This
> application requires that all the data be cross-referenced so that, for
> example, any English sentence can be traced back to the precise region of
> foreign language audio that generated it.
>
> The CAS organization and type system have been designed to allow each
> analytic to easily work on data of the appropriate modality.  Speech
> recognition engines annotate an audio view with words aligned to a time
> axis; machine translation annotates a text view of foreign sentences with
> their English translation; entity detection annotates a text view of the
> English sentences.  Multiple analytics of each type may be employed to
> improve the overall accuracy.
>
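To make the view organization above concrete, here is a small
language-neutral sketch of the idea (UIMA itself is Java, and the
class names, view names, and offsets below are my own illustrations,
not the actual GTS types):

```python
# Sketch: one "view" per modality, with annotations whose offsets are
# relative to that view's data (time in ms for audio, chars for text),
# plus cross-reference annotations pointing back to an earlier view.
from dataclasses import dataclass, field

@dataclass
class Annotation:
    begin: int   # offset into this view (ms for audio, char index for text)
    end: int
    label: str

@dataclass
class CrossRef(Annotation):
    # Links a span in this view back to a span in an earlier view.
    source_view: str = ""
    source_begin: int = 0
    source_end: int = 0

@dataclass
class View:
    name: str
    annotations: list = field(default_factory=list)

# Speech recognition annotates the audio view with time-aligned words ...
audio = View("audio")
audio.annotations.append(Annotation(0, 420, "hola"))
audio.annotations.append(Annotation(430, 900, "mundo"))

# ... and a reorg step builds a source-language text view whose words
# carry cross-reference links back to the audio spans that produced them.
text = View("source_text")
offset = 0
for word in audio.annotations:
    text.annotations.append(
        CrossRef(offset, offset + len(word.label), word.label,
                 source_view="audio",
                 source_begin=word.begin, source_end=word.end))
    offset += len(word.label) + 1

# Any text span can now be traced back to its audio region.
first = text.annotations[0]
print(first.label, "->", first.source_view, first.source_begin, first.source_end)
# prints: hola -> audio 0 420
```

Later analytics (MT, entity detection) would annotate their own views
the same way, so each engine only ever sees data of its own modality.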
> The sample code includes data reorganization components that are inserted
> between the different analytics to perform the necessary bookkeeping of
> creating views and cross-reference links from one view back to an earlier
> one.  For example, after all speech recognition analytics have run, a reorg module
> creates a source-language text view for each STT engine, along with
> cross-reference annotations from each word in the new view back to the
> appropriate time span in the audio view.  One reorg component is a CAS
> Multiplier that resegments the initial fixed-length audio segments at likely
> story boundaries so that later components can treat each CAS as a complete
> story.  The STT and MT analytics are simulated, reading their results from
> a file, so that a complete pipeline of components can be tested.
>
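The resegmentation step the CAS Multiplier performs can be sketched as
a simple regrouping loop (again just an illustration with made-up
segment/boundary markers, not the real component's logic):

```python
# Sketch: consume fixed-length input segments and re-emit one group
# (one CAS, in UIMA terms) per story, cutting at likely story boundaries.
def resegment(segments, is_story_boundary):
    """Yield one list of segments per story."""
    story = []
    for seg in segments:
        story.append(seg)
        if is_story_boundary(seg):
            yield story
            story = []
    if story:            # trailing segments with no final boundary
        yield story

segments = ["s1", "s2 <BOUNDARY>", "s3", "s4", "s5 <BOUNDARY>"]
stories = list(resegment(segments, lambda s: "<BOUNDARY>" in s))
# stories == [["s1", "s2 <BOUNDARY>"], ["s3", "s4", "s5 <BOUNDARY>"]]
```

Downstream components can then treat each emitted group as one
complete story, as the proposal describes.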
> We welcome any comments or suggestions or questions!
>
> - Burn.
>
>   
