Re: OSGi versions of Add-on Annotators

Jörn Kottmann Wed, 20 Jul 2011 11:35:40 -0700

On 7/20/11 7:56 PM, Marshall Schor wrote:

The "normal" way of having annotators together is something that UIMA supports,
as a pipeline.  Part of this is setting up the pipeline at initialization time
by taking all the type systems declared by the annotators in the pipeline, and
merging them into one common type system.


A CAS is generated using this one common type system, and then sent through the
pipeline.

Yes, this of course works, but it is often problematic, because the merged
type system needs to be suitable for all components.

Let's say we have a tokenizer and a POS tagger, where the POS tagger needs the
output of the tokenizer as input. In UIMA you would therefore declare a token
type, and both AEs must use exactly the same token type.

Now both AEs are made by different vendors, and both decide to declare their
own token type. Then this type system merging doesn't work.

As far as I know, the only commonly used workaround for this issue is not to
use JCas and to define type system mappings, where the types the AE needs are
mapped based on some configuration.

I think a solution to this problem is to stop doing this type system merging
and to always map one common type system to every annotator's private type
system. This mapping could give the AEs more flexibility and might even be able
to perform simple type transformations.

That would also make using JCas attractive again.
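
To make the mapping idea a bit more concrete, here is a minimal sketch in plain
Java. It is only an illustration under assumptions: TypeMappingSketch,
Annotation, TypeMapping and the type names common.Token and vendorb.Token are
all hypothetical, and none of this is UIMA API. It shows a per-AE mapping that
renames types and features from the common type system to the names an AE
declared privately.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these classes are part of the UIMA API. */
public class TypeMappingSketch {

    /** Minimal stand-in for an annotation: a type name, a span, and string features. */
    static final class Annotation {
        final String type;
        final int begin;
        final int end;
        final Map<String, String> features;

        Annotation(String type, int begin, int end, Map<String, String> features) {
            this.type = type;
            this.begin = begin;
            this.end = end;
            this.features = new HashMap<>(features);
        }
    }

    /** Maps type and feature names from the common type system to one AE's private names. */
    static final class TypeMapping {
        private final Map<String, String> typeNames = new HashMap<>();
        private final Map<String, String> featureNames = new HashMap<>();

        TypeMapping mapType(String commonName, String privateName) {
            typeNames.put(commonName, privateName);
            return this;
        }

        TypeMapping mapFeature(String commonName, String privateName) {
            featureNames.put(commonName, privateName);
            return this;
        }

        /** Returns the annotation as the AE would see it; unmapped names pass through unchanged. */
        Annotation toPrivate(Annotation common) {
            Map<String, String> mapped = new HashMap<>();
            for (Map.Entry<String, String> e : common.features.entrySet()) {
                mapped.put(featureNames.getOrDefault(e.getKey(), e.getKey()), e.getValue());
            }
            String privateType = typeNames.getOrDefault(common.type, common.type);
            return new Annotation(privateType, common.begin, common.end, mapped);
        }
    }

    public static void main(String[] args) {
        // Assumed configuration for a POS tagger from a hypothetical "vendor B".
        TypeMapping posTaggerMapping = new TypeMapping()
                .mapType("common.Token", "vendorb.Token")
                .mapFeature("pos", "partOfSpeech");

        Annotation commonToken = new Annotation("common.Token", 0, 5, Map.of("pos", "NN"));
        Annotation privateToken = posTaggerMapping.toPrivate(commonToken);
        System.out.println(privateToken.type + " " + privateToken.features);
    }
}
```

A real implementation would have to work on the CAS and map in both directions
(into the AE and back out again); the sketch only covers the renaming step.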

This issue is even amplified by the fact that our users like to define their
own type system, and then the AEs only work properly if the AE implementers do
type system mapping or program against this type system. The latter case only
works if the user and the implementer are the same person/organization.

-----------

In the case where each annotator is "bundled" as an OSGi bundle, that bundle
contains its own private copy of all the UIMA classes, including all of the UIMA
SDK, and any type system, etc.  Any JCAS generated classes are also private to
that bundle.

This might make sense for running one Annotator by itself.


Exactly.

  But for running
multiple annotators together, as separate OSGi components, I don't see how it
would "work" if each annotator were its own bundle.  How would the type systems
be combined at initialization time?  How would you share the JCAS generated
classes?  (I'll admit that this is not *required*, but is sometimes useful.)

Does one of the Clerezza scenarios involve running multiple annotators, each
having its own bundle?  If so, how does that work?   (I'm guessing that there is
some "driver" code that uses UIMA Application APIs to separately initialize each
annotator,  and then maybe does something like getting a type system from all of
them, and merging them, and then creating a CAS from that, etc.  This is just
duplicating what the UIMA framework is doing - if it were "in charge" of the
pipeline and its management.)

Thanks for the clarifications.


These are all points which don't really work out
in the end (with our current release).

Jörn
