Andrew,

Let me address the question about Python and Perl. The UIMA interfaces
to these languages (and Tcl) were created to support the use of
speech-to-text analytics whose top-level interface used scripting
languages as glue code to drive complex C++ modules. The motivation
here was to be able to easily UIMA-fy existing analytics in these
languages for a research application. The CAS interface code used was
not much more than that given in the scriptator samples, which
demonstrate both creating and accessing annotations.

The overhead of moving the CAS back and forth between Java and the
native environment is virtually negligible except for the most
light-weight C++ analytics. I am not sure this would ever be a factor
for these interpreted languages.

For those applications that cannot or do not want to use the JNI, I
expect that native C++ service wrappers will be available.

Regards,
Eddie

On 6/6/07, Andrew Borthwick <[EMAIL PROTECTED]> wrote:
All,

I'm studying whether UIMA would make sense to add to my company's
architecture and would appreciate it if anyone could point me to
either of the following.  We are considering using UIMA as a framework
on which to build a web-scale NLP pipeline which would involve
components like sentence boundary identification, tokenization, named
entity identification, and phrase chunking.

1.  Substantive examples of Python or Perl annotators, preferably
something that both consumes and produces annotations.
2.  More generally, could anybody point me to the use of UIMA in
fielded industrial applications outside of IBM?  I'd be particularly
interested in talking to someone who had evaluated various
alternatives and decided to go with UIMA.  Anybody who decided to go
with a different framework after evaluating UIMA would be helpful too.

Thanks,
Andrew Borthwick
Principal Scientist
Spock Networks
