On 11/24/2010 10:32, Jörn Kottmann wrote:
> On 11/23/10 11:44 PM, Burn Lewis wrote:
>>   Application skeleton for multi-modal NLP analysis
>>
>> There has been renewed interest in the typesystem and annotators developed
>> as part of the DARPA GALE project to demonstrate how to combine analytics
>> from multiple sources and modalities.  The GALE Interoperability
>> Demonstration system (IOD) uses UIMA-AS to interconnect 11 different types
>> of NLP analytics distributed over 7 research facilities in 3 countries to
>> transcribe, translate, and extract information from foreign language news
>> broadcasts.
>>
>> To aid the development of other multi-modal applications I plan to publish a
>> skeleton of this application in the sandbox.  It will eventually include the
>> following:
>>   - the UIMA typesystem that was developed to allow each analytic to operate
>> on an appropriate view of the data with no dependencies on its origin,
>>   - simulated analytics for the NLP engines,
>>   - data reorganization annotators that convert the outputs of one analytic
>> into a form suitable for input to another,
>>   - descriptors and a flow controller that use the features of UIMA-AS to run
>> similar analytics in parallel, and to scale out the slowest components.
>>
>> The goal will be to provide a complete example of a system that converts
>> audio in one language to text in another, segmented into topics.  Although
>> no real NLP analytics will be included, users can use the simulated ones as
>> examples of how to use the typesystem to wrap an NLP analytic as a UIMA
>> annotator.
>>
>> Any comments and suggestions would be welcome ... I hope to get started next
>> week.
> 
> Sounds very interesting. Having such a sample will help people understand
> how UIMA can be used with non-text sofas and how to write AEs for these.
> I think that is one of the reasons why there is no (open source) integration
> for speech recognition or OCR. BTW, both are available under a compatible
> license: CMU Sphinx and Tesseract/Ocropus.

Sounds very interesting, and will be even better when
it integrates those open source libraries to create a
working solution.

--Thilo

> 
> Jörn
> 
