On 7/22/2011 4:46 AM, Jörn Kottmann wrote:
> Hi all,
>
> what do you think about these per AE type system mappings?
> Is it something which would improve the current situation?
> Any concerns?
>
A couple of thoughts:

In the real-world cases I've seen, simply renaming types / features is not sufficient to make many completely independently developed annotators interoperate. There can be many reasons for this. For example, a confidence might be expressed in one system as a "float" between 0 and 1, in another as an "integer" between 100 and 0 (yes, also a reversed scale), and in a third as a string from the set "low", "medium", "high", etc.

So combining independently developed annotators often takes writing some little "glue" annotators in between, which do quite arbitrary things. I've heard stories about people using the BSFannotator to write little scripts to do this.

Another thought: if it turned out there was a substantial use case for very simple renaming of types / features, the framework could support that very efficiently if we added support for alias type specifications. This would be a special kind of type definition that "mapped" to another one. However, as I've suggested above, I don't think this kind of thing would cover enough of the real use cases to be worth the additional complexity.

-Marshall

> Jörn
>
> On 7/20/11 9:20 PM, Jörn Kottmann wrote:
>> My point is that a user defines his own type system, and a mapping which
>> translates parts of this type system to the annotator type system.
>>
>> So in the sample above a user defines this type system:
>>
>> Type: com.foo.Token
>> Feature: double tokenConfidence
>> Feature: String posTag
>> Feature: double posConfidence
>>
>> The tokenizer also defined its type system:
>> Type: opennlp.Token
>> Feature: float confidence
>>
>> And one more type system for the pos tagger:
>> Type: opennlp.POSToken
>> Feature: float confidence
>> Feature: String tag
>>
>> The user defined AAE only knows the user type system and needs to
>> define "rules" which tell it how to transform opennlp.Token annotations
>> to com.foo.Token annotations, and then it needs a rule to transform
>> a com.foo.Token into an opennlp.POSToken, and back.
>>
>> Sure this is also already possible today, by writing these type mapping AEs,
>> as you would need to do for JCas. But I think having better framework support
>> for this would make it easier.
>
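
To make the confidence example from the reply concrete, here is a minimal sketch of the value conversion a "glue" step would have to do beyond renaming. It is plain Java with no UIMA dependency. The class name ConfidenceGlue, its method names, the reading of the reversed scale as "0 is most confident", and the numbers assigned to "low" / "medium" / "high" are all assumptions made for illustration; none of them come from any of the annotators discussed in this thread.

```java
import java.util.Locale;

/**
 * Hypothetical helper: brings three different confidence encodings
 * onto one common scale, a double in [0, 1] where higher means more confident.
 */
final class ConfidenceGlue {

    private ConfidenceGlue() {
        // static helpers only
    }

    /** Encoding 1: a float between 0 and 1, higher is more confident. */
    static double fromUnitFloat(float value) {
        return clamp(value);
    }

    /**
     * Encoding 2: an integer between 100 and 0 on a reversed scale.
     * Assumption: 0 means most confident and 100 means least confident.
     */
    static double fromReversedPercent(int value) {
        return clamp((100 - value) / 100.0);
    }

    /**
     * Encoding 3: one of the strings "low", "medium", "high".
     * The numeric values chosen here are arbitrary placeholders.
     */
    static double fromLabel(String label) {
        switch (label.trim().toLowerCase(Locale.ROOT)) {
            case "low":
                return 0.25;
            case "medium":
                return 0.5;
            case "high":
                return 0.75;
            default:
                throw new IllegalArgumentException("Unknown confidence label: " + label);
        }
    }

    /** Keeps out-of-range inputs inside [0, 1]. */
    private static double clamp(double v) {
        return Math.max(0.0, Math.min(1.0, v));
    }
}
```

None of the three conversions is a rename: each needs its own rule, and the string case needs a policy decision about what number a label stands for. A mapping that only renames types and features could not express that.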