Re: OSGi versions of Add-on Annotators

Jörn Kottmann Wed, 20 Jul 2011 12:21:07 -0700

On 7/20/11 8:57 PM, Marshall Schor wrote:

>
>
>  Now both AEs are made by different vendors, and both decide to declare their
>  own token type. Then this type system merging doesn't work.
>
>  As far as I know the only common used work around for this issue is, not to use
>  JCas and to define type system mappings, where the types the AE needs are mapped
>  based on some configuration.

I don't understand why JCas cannot be used -- that seems to me to be independent
of the need for having type system mappings.  I'm thinking that one annotator
produces a.b.Token, and a down-stream annotator needs c.d.Token with some
different kinds of meanings assigned to features - in this case you introduce a
custom mapping annotator, that iterates over the a.b.Token(s), and makes the
corresponding c.d.Token feature structures.  JCas can be used for both of these,
as desired.

Ok, that is possible, but this way you start writing code for something the
framework could do. And maintaining all kinds of type system mapping AEs isn't
really fun either.

>
>
>  I think a solution to this problem is, to stop doing this type system merging,
>  and always
>  map one common type system to every Annotators private type system.

The hard part is getting a community to agree to "one common type system", I
think.   But we have seen in large projects, that this often can be done, within
one project.

Other times, groups working collaboratively, have gotten together and defined a
common type system for their work.

My point is that a user defines his own type system, and a mapping which
translates parts of this type system to the annotator type system.

So in the sample above a user defines this type system:

Type: com.foo.Token
Feature: double tokenConfidence
Feature: String posTag
Feature: double posConfidence

The tokenizer also defines its own type system:
Type: opennlp.Token
Feature: float confidence

And one more type system for the pos tagger:
Type: opennlp.POSToken
Feature: float confidence
Feature: String tag

The user defined AAE only knows the user type system and needs to
define "rules" which tell it how to transform opennlp.Token annotations
to com.foo.Token annotations, and then it needs a rule to transform
a com.foo.Token into an opennlp.POSToken, and back.

Sure, this is already possible today by writing these type mapping AEs,
as you would need to do for JCas. But I think having better framework support
for this would make it easier.
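
To make the three rules concrete, here is a minimal sketch in plain Java with
no UIMA dependency. The classes FooToken, OpenNlpToken and OpenNlpPosToken are
hypothetical stand-ins for com.foo.Token, opennlp.Token and opennlp.POSToken;
the begin/end offsets and all method and constant names are assumptions made
for illustration, not part of any existing framework API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class TypeMappingSketch {

    // Hypothetical stand-in for the user type com.foo.Token.
    static final class FooToken {
        final int begin, end;          // offsets are an assumption, not in the type listing
        final double tokenConfidence;
        final String posTag;           // null until a tagger result is mapped back
        final double posConfidence;

        FooToken(int begin, int end, double tokenConfidence, String posTag, double posConfidence) {
            this.begin = begin;
            this.end = end;
            this.tokenConfidence = tokenConfidence;
            this.posTag = posTag;
            this.posConfidence = posConfidence;
        }
    }

    // Hypothetical stand-in for the tokenizer type opennlp.Token.
    static final class OpenNlpToken {
        final int begin, end;
        final float confidence;

        OpenNlpToken(int begin, int end, float confidence) {
            this.begin = begin;
            this.end = end;
            this.confidence = confidence;
        }
    }

    // Hypothetical stand-in for the tagger type opennlp.POSToken.
    static final class OpenNlpPosToken {
        final int begin, end;
        final float confidence;
        final String tag;

        OpenNlpPosToken(int begin, int end, float confidence, String tag) {
            this.begin = begin;
            this.end = end;
            this.confidence = confidence;
            this.tag = tag;
        }
    }

    // Rule 1: opennlp.Token -> com.foo.Token (float confidence widens to double).
    static final Function<OpenNlpToken, FooToken> TOKEN_TO_USER =
            t -> new FooToken(t.begin, t.end, t.confidence, null, 0.0);

    // Rule 2: com.foo.Token -> opennlp.POSToken, as untagged input for the tagger.
    static final Function<FooToken, OpenNlpPosToken> USER_TO_POS =
            t -> new OpenNlpPosToken(t.begin, t.end, 0.0f, null);

    // Rule 3: copy the tagger result back onto the user token.
    static FooToken mergePos(FooToken user, OpenNlpPosToken pos) {
        return new FooToken(user.begin, user.end, user.tokenConfidence, pos.tag, pos.confidence);
    }

    // Applies one rule to every element of a list.
    static <A, B> List<B> mapAll(List<A> in, Function<A, B> rule) {
        List<B> out = new ArrayList<>();
        for (A a : in) {
            out.add(rule.apply(a));
        }
        return out;
    }

    public static void main(String[] args) {
        List<OpenNlpToken> tokens = new ArrayList<>();
        tokens.add(new OpenNlpToken(0, 5, 0.5f));

        List<FooToken> userTokens = mapAll(tokens, TOKEN_TO_USER);
        List<OpenNlpPosToken> taggerInput = mapAll(userTokens, USER_TO_POS);

        // Pretend the tagger filled in a tag and a confidence for the first token.
        OpenNlpPosToken tagged = new OpenNlpPosToken(
                taggerInput.get(0).begin, taggerInput.get(0).end, 0.75f, "NN");
        FooToken merged = mergePos(userTokens.get(0), tagged);
        System.out.println(merged.posTag + " " + merged.posConfidence);
    }
}
```

Each rule here is still hand-written code. With framework support as proposed,
the same rules would presumably be declared in configuration rather than
compiled into a separate mapping AE.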

Jörn
