On 7/22/11 4:33 PM, Marshall Schor wrote:
A couple of thoughts:
In real world instances that I've seen, there are complexities that make simple
renaming of types / features not sufficient to make many completely
independently developed annotators inter-operate. This can be for many
reasons, including things like a confidence expressed in one system as a "float"
between 0 and 1, and in another as an "integer" between 100 and 0 (yes, also a
reversed scale), or in a third as a string set "low", "medium", "high", etc.
So combining independently developed annotators often takes writing some little
"glue" annotators in between that do quite arbitrary things. I've heard stories
about people using the BSFannotator to write little scripts to do this.
Yes, you are right here, integration can take a bit more than just mapping
something. Could writing glue code be made easier if we offered special
support for it?
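
To make the confidence example concrete, a glue step of that kind might look
roughly like the sketch below. It is plain Java with no UIMA dependency. The
class ConfidenceGlue, its method names and the numbers chosen for "low",
"medium" and "high" are all hypothetical, and it assumes that on the reversed
integer scale 0 means most confident and 100 means least confident.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical glue helper: brings three confidence encodings onto one 0..1 scale.
final class ConfidenceGlue {

    // Assumed values for the string labels; a real mapping would be agreed per annotator.
    private static final Map<String, Double> LABELS =
            Map.of("low", 0.25, "medium", 0.5, "high", 0.75);

    private ConfidenceGlue() {}

    // A float that is already on the target scale; only clamp it to [0, 1].
    static double fromFloat(double value) {
        return Math.max(0.0, Math.min(1.0, value));
    }

    // An integer on a reversed 100..0 scale (assumption: 0 = most confident).
    static double fromReversedPercent(int value) {
        int clamped = Math.max(0, Math.min(100, value));
        return (100 - clamped) / 100.0;
    }

    // A string label, matched case-insensitively against the assumed table above.
    static double fromLabel(String label) {
        Double mapped = LABELS.get(label.toLowerCase(Locale.ROOT));
        if (mapped == null) {
            throw new IllegalArgumentException("unknown confidence label: " + label);
        }
        return mapped;
    }
}
```

Each conversion is a few lines, but none of them is a pure rename, which is
why a mapping table alone would not cover this case.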
Currently the merged type system can get very complex, yet it is needed
whenever the CASes are serialized and later opened in the Cas Editor or used
in some other way. In my opinion that makes it difficult to handle CASes.
If we had one user type system and then one per annotator, it would be
easier to understand, and only types from the user type system would occur
in the CAS. In other words, the annotator type system stays encapsulated:
it is not carried over to later uses of the CAS and cannot cause
compatibility issues at a later point in time. In my observation, that is
the way many try to use UIMA.
Using JCas could then also be more attractive to AE implementers if it could
be combined with some type system mappings, instead of always requiring glue
code, even for simple cases.
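
As a rough illustration of what I mean by a mapping, here is a sketch in
plain Java. Nothing in it is UIMA API: TypeMapping and Ann are hypothetical
stand-ins, and the org.example type names are made up. The idea is that each
annotator gets a table from its own type names to user type names, and only
annotations with an entry in that table are exported.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: map annotator-local type names onto a user type system.
final class TypeMapping {

    // Minimal stand-in for an annotation; this is not a UIMA class.
    static final class Ann {
        final String type;
        final int begin;
        final int end;

        Ann(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    // Annotator type name -> user type name.
    private final Map<String, String> annotatorToUser;

    TypeMapping(Map<String, String> annotatorToUser) {
        this.annotatorToUser = new HashMap<>(annotatorToUser);
    }

    // Keeps only annotations whose type has a user-side name and renames them;
    // unmapped types stay private to the annotator and are dropped here.
    List<Ann> export(List<Ann> produced) {
        List<Ann> exported = new ArrayList<>();
        for (Ann a : produced) {
            String userType = annotatorToUser.get(a.type);
            if (userType != null) {
                exported.add(new Ann(userType, a.begin, a.end));
            }
        }
        return exported;
    }
}
```

With a table such as "org.example.ner.PersonMention" ->
"org.example.user.Person", an internal helper type of the annotator never
reaches the exported result, so it cannot show up in a serialized CAS later.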
Another thought: if it turned out there was a substantial use case for very
simple renaming of types/ features, that could be very efficiently supported by
the framework if we added support for alias type specifications; this would be
a special kind of type definition that "mapped" to another one. However, as
I've suggested above, I don't think that this kind of thing would cover enough
of the real use cases to be worth the additional complexity.
I think that could be useful for quite a few AEs. There are many simple
AEs which could be integrated by this approach.
I also do not see the advantage of the current type system merging approach.
Jörn