Re: OSGi versions of Add-on Annotators

Marshall Schor Wed, 20 Jul 2011 10:56:43 -0700

The "normal" way to run annotators together, which UIMA supports directly, is
as a pipeline.  Part of setting up the pipeline at initialization time is taking
all the type systems declared by the annotators in the pipeline and merging
them into one common type system.

A CAS is generated using this one common type system, and then sent through the
pipeline.
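
To make the merge step concrete, here is a rough sketch in plain Java.  It is
only a simplified model and does not use the UIMA API: a "type system" is
reduced to a map from type name to features, and the class name
TypeSystemMergeSketch, the helper features(), and the example.Token /
example.Sentence types are all made up for illustration.  The real framework
checks more than this (supertypes, for instance); the sketch only shows the
union-with-conflict-check idea.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model, NOT the UIMA API: a type system is a map from type name
// to its features, and each feature maps to the name of its range type.
final class TypeSystemMergeSketch {

    // Hypothetical helper: builds a feature map from (featureName, rangeType) pairs.
    static Map<String, String> features(String... pairs) {
        if (pairs.length % 2 != 0) {
            throw new IllegalArgumentException("expected featureName/rangeType pairs");
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < pairs.length; i += 2) {
            result.put(pairs[i], pairs[i + 1]);
        }
        return result;
    }

    // Merges the per-annotator type systems into one common type system.
    // Types with the same name are combined and their features unioned;
    // a feature declared twice with different range types is an error.
    static Map<String, Map<String, String>> merge(
            List<Map<String, Map<String, String>>> typeSystems) {
        Map<String, Map<String, String>> merged = new LinkedHashMap<>();
        for (Map<String, Map<String, String>> typeSystem : typeSystems) {
            for (Map.Entry<String, Map<String, String>> type : typeSystem.entrySet()) {
                Map<String, String> target =
                        merged.computeIfAbsent(type.getKey(), name -> new LinkedHashMap<>());
                for (Map.Entry<String, String> feature : type.getValue().entrySet()) {
                    String previous = target.putIfAbsent(feature.getKey(), feature.getValue());
                    if (previous != null && !previous.equals(feature.getValue())) {
                        throw new IllegalStateException("Conflicting range for "
                                + type.getKey() + ":" + feature.getKey());
                    }
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two annotators that both declare example.Token, with different features.
        Map<String, Map<String, String>> taggerTypes = new LinkedHashMap<>();
        taggerTypes.put("example.Token", features("pos", "uima.cas.String"));

        Map<String, Map<String, String>> lemmatizerTypes = new LinkedHashMap<>();
        lemmatizerTypes.put("example.Token", features("lemma", "uima.cas.String"));
        lemmatizerTypes.put("example.Sentence", features());

        // The merged result is what a single shared CAS would be built from.
        System.out.println(merge(Arrays.asList(taggerTypes, lemmatizerTypes)));
    }
}
```

The point of the sketch is that the merge needs to see every annotator's
declarations in one place before the CAS is created, which is what gets awkward
when each annotator lives in its own isolated bundle.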

-----------

In the case where each annotator is "bundled" as an OSGi bundle, that bundle
contains its own private copy of all the UIMA classes, including all of the UIMA
SDK, and any type system, etc.  Any JCAS generated classes are also private to
that bundle.

This might make sense for running one Annotator by itself.  But for running
multiple annotators together, as separate OSGi components, I don't see how it
would "work" if each annotator were its own bundle.  How would the type systems
be combined at initialization time?  How would you share the JCAS generated
classes?  (I'll admit that this is not *required*, but is sometimes useful.)

Does one of the Clerezza scenarios involve running multiple annotators, each
having its own bundle?  If so, how does that work?   (I'm guessing that there is
some "driver" code that uses UIMA Application APIs to separately initialize each
annotator,  and then maybe does something like getting a type system from all of
them, and merging them, and then creating a CAS from that, etc.  This is just
duplicating what the UIMA framework is doing - if it were "in charge" of the
pipeline and its management.)

Thanks for the clarifications.

-Marshall 

On 7/20/2011 12:17 PM, Tommaso Teofili wrote:
> 2011/7/20 Marshall Schor <m...@schor.com>
>
>> This may be all wrong-headed - but I wonder if the basic use case is to do
>> something like the following: Take a bunch of annotators (and maybe flow
>> controllers) together with a top-level aggregate XML specifying parameter
>> overrides, etc., and "wrap" them so they become a single OSGi bundle, that
>> can then be embedded in an OSGi container?  If so, then perhaps instead of
>> having a "set" of individually OSGi-i-fied annotators, like we do now,
>> maybe we should have instead a tool that does this for a set of
>> annotators, etc.
>>
> the use case in Clerezza is slightly different as it allows both the
> scenario where one executes an existing pipeline (using OpenCalaisAnnotator
> and AlchemyAPIAnnotator) and the scenario where one runs a custom pipeline,
> possibly using other existing UIMA components, defined in another bundle.
> I still think having individual OSGi versions of each annotator would be
> better.
>
>
> 2011/7/20 Marshall Schor <m...@schor.com>
>
>>
>> On 7/20/2011 11:18 AM, Marshall Schor wrote:
>>> On 7/20/2011 8:13 AM, Jörn Kottmann wrote:
>>>> On 7/20/11 1:55 PM, Marshall Schor wrote:
>>>>> What does it mean to "deploy" inside of an Apache Felix instance?
>>>> I did that once, and simply embedded everything in one bundle, even UIMA
>>>> itself. This way I could use UIMA plus some AEs to do analysis as a
>>>> service for other OSGi bundles inside Felix.
>>> This suggests having a tool to make this "easy"; but also suggests that
>>> having individual addon annotators packaged up as a "complete UIMA
>>> pipeline" may not be very interesting to anyone.
>>>
>>> Is this right?  If so, perhaps we should not release these OSGi versions
>>> in the addons at this time.
> Do you mean not in the binary package or not release them at all (i.e. not
> deploying them on Maven central too)?
>
> Tommaso
>
>>> That also would reduce the size of the distribution
>>> considerably (about 100 MB of 150 MB is for the OSGi versions).
>> oops, I was wrong - delete the following...
>>> In computing this, I also noticed that the tagger osgi packaging was
>>> missing the 19.5 mb of statistical models...
>>>
>>> -Marshall
>>>> Jörn
>>>>
