On 7/25/2011 6:40 AM, Jörn Kottmann wrote:
> On 7/22/11 4:33 PM, Marshall Schor wrote:
>> A couple of thoughts:
>>
>> In real world instances that I've seen, there are complexities that make
>> simple renaming of types / features not sufficient to make many completely
>> independently developed annotators inter-operate. This can be for many
>> reasons, including things like a confidence expressed in one system as a
>> "float" between 0 and 1, and in another as an "integer" between 100 and 0
>> (yes, also a reversed scale), or in a third as a string set "low", "medium",
>> "high", etc.
>>
>> So to combine independently developed annotators often takes writing some
>> little "glue" annotators inbetween, that do quite arbitrary things. I've
>> heard stories about people using the BSFannotator to write little scripts
>> to do this.
> Yes, you are right here, integration can be a bit more than just mapping
> something. Could writing glue code be easier when we offer special support for
> it?
Maybe, but I don't have a good grasp of the frequent use-cases here. In my
limited experience, writing the glue code is very easy for the "easy" cases, and
for the other cases, I'm not sure what special support we could come up with to
make that easier.
>
> Currently the merged type system can get very complex and is somehow needed
> if the CASes are serialized and later maybe opened in the Cas Editor or used
> in some other way. In my opinion that makes it difficult to handle CASes.
>
> If we have one user type system and then one per annotator it would be easier
> to understand it, and only types from the user type system would occur in the
> CAS. In other words the annotator type system stays encapsulated and is not
> carried on to later uses of the CAS or cause compatibility issues at a later
> point in time. In my observation that is the way many try to use UIMA.

I'm not sure what you mean by a type system encapsulated with an annotator. It
seems to me that if you have a primitive annotator, and have types defined for
it, which are *not* used by other annotators, then you are putting things into
the CAS which are not used by others - so why put them in the CAS? Other Java
(or C++, etc.) techniques are probably better for local storage while an
annotator is running on a CAS.

But perhaps I misunderstand what you're getting at? Perhaps a more concrete
example would help me comprehend :-)

-Marshall
>
> Using JCas could then be also more attractive to AE implementers if it could
> be used by some type system mappings, instead of always requiring glue code,
> even for simple cases.
>
>> Another thought: if it turned out there was a substantial use case for very
>> simple renaming of types/ features, that could be very efficiently supported
>> by the framework if we added support for aliases type specifications- this
>> would be a special kind of type definition that "mapped" to another one.
>> However, as I've suggested above, I don't think that this kind of thing
>> would cover enough of the real use cases to be worth the additional
>> complexity.
>>
>
> I think for quite some AEs that could be useful. There are many simple AEs
> which could be integrated by this approach.
>
> I also do not see the advantage of the current type system merging approach.
>
> Jörn
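
To make the confidence example quoted earlier in this message a bit more
concrete, here is a minimal sketch of the conversion logic such a "glue" step
might contain. It is plain Java with no UIMA dependencies: the class name
ConfidenceGlue and its method names are hypothetical, the reading of the
reversed integer scale (0 = most confident, 100 = least) is an assumption, and
the numeric values assigned to "low" / "medium" / "high" are arbitrary
placeholders. A real glue annotator would read and write these values as CAS
features, which is left out here.

```java
import java.util.Locale;

// Hypothetical glue logic; not part of the UIMA API.
// Normalizes three differently expressed confidences to a double in [0, 1].
final class ConfidenceGlue {
    private ConfidenceGlue() {}

    // System A: already a float between 0 and 1; only clamp to the range.
    static double fromUnitFloat(float value) {
        return clamp(value);
    }

    // System B: integer between 100 and 0 on a reversed scale.
    // Assumption: 0 means most confident, 100 means least confident.
    static double fromReversedPercent(int value) {
        return clamp((100 - value) / 100.0);
    }

    // System C: one of the strings "low", "medium", "high".
    // Assumption: the numeric values below are arbitrary placeholders.
    static double fromLabel(String label) {
        switch (label.trim().toLowerCase(Locale.ROOT)) {
            case "low":
                return 0.25;
            case "medium":
                return 0.5;
            case "high":
                return 0.75;
            default:
                throw new IllegalArgumentException(
                        "Unknown confidence label: " + label);
        }
    }

    // Keeps out-of-range inputs inside [0, 1].
    private static double clamp(double v) {
        return Math.max(0.0, Math.min(1.0, v));
    }

    public static void main(String[] args) {
        System.out.println(fromUnitFloat(0.8f));
        System.out.println(fromReversedPercent(20));
        System.out.println(fromLabel("High"));
    }
}
```

Even in this small case the mapping involves choices (scale direction, label
values) that a pure type / feature rename could not express, which is the
point made in the quoted text above.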