On 11/23/2010 5:44 PM, Burn Lewis wrote:
> Application skeleton for multi-modal NLP analysis
>
> There has been renewed interest in the typesystem and annotators developed
> as part of the DARPA GALE project to demonstrate how to combine analytics
> from multiple sources and modalities. The GALE Interoperability
> Demonstration system (IOD) uses UIMA-AS to interconnect 11 different types
> of NLP analytics distributed over 7 research facilities in 3 countries to
> transcribe, translate, and extract information from foreign language news
> broadcasts.
>
> To aid the development of other multi-modal applications I plan to publish a
> skeleton of this application in the sandbox. It will eventually include the
> following:
> - the UIMA typesystem that was developed to allow each analytic to operate
>   on an appropriate view of the data with no dependencies on its origin,
> - simulated analytics for the NLP engines,
> - data reorganization annotators that convert the outputs of one analytic
>   into a form suitable for input to another,
> - descriptors and a flow controller that use the features of UIMA-AS to run
>   similar analytics in parallel, and to scale out the slowest components.
>
> The goal will be to provide a complete example of a system that converts
> audio in one language to text in another, segmented into topics. Although
> no real NLP analytics will be included, users can use the simulated ones as
> examples of how to use the typesystem to wrap an NLP analytic as a UIMA
> annotator.
>
> Any comments and suggestions would be welcome ... I hope to get started next
> week.

Sounds like it would be a nice pedagogical example for people to study and
build upon.

-Marshall

> Burn
>
