On 7/20/11 9:44 PM, Richard Eckart de Castilho wrote:
In DKPro we have a Token type and a POS type from which several types inherit
(V, NP, ADJ, etc.). The Token type has a feature of type POS, on which we set an
instance of one of these subtypes, e.g. V or NP.
Token t = new Token(jcas);
t.setPos(new N(jcas));
We find this quite convenient because it allows us to easily select a particular
type from the CAS, e.g.
for (N noun : select(jcas, N.class)) {
    // ... do something with nouns ...
}
Similarly it's convenient to write rules over POS tags in TextMarker with our
type system.
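To make the idea of selecting by type concrete without a CAS, here is a minimal plain-Java sketch. It is not the DKPro or JCas API: the classes Pos, N and V and the select helper are hypothetical stand-ins that only mimic the "one subtype per tag" design described above.

```java
import java.util.ArrayList;
import java.util.List;

public class TypePerTagSketch {
    // Hypothetical stand-ins for the type system, NOT the real DKPro/JCas classes.
    static class Pos {
        final int begin;
        final int end;
        Pos(int begin, int end) { this.begin = begin; this.end = end; }
    }
    static class N extends Pos { N(int begin, int end) { super(begin, end); } }
    static class V extends Pos { V(int begin, int end) { super(begin, end); } }

    // Simplified stand-in for a type-based select: keeps every indexed
    // annotation that is an instance of the requested class (or a subclass).
    static <T extends Pos> List<T> select(List<Pos> index, Class<T> type) {
        List<T> result = new ArrayList<>();
        for (Pos p : index) {
            if (type.isInstance(p)) {
                result.add(type.cast(p));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Pos> index = new ArrayList<>();
        index.add(new N(0, 4));
        index.add(new V(5, 9));
        index.add(new N(10, 15));
        // Only the two N instances are visited.
        for (N noun : select(index, N.class)) {
            System.out.println("noun at " + noun.begin + "-" + noun.end);
        }
    }
}
```

Selecting with Pos.class instead of N.class would return all three entries, which is the part that a flat mapping of type names does not capture by itself.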
With such type systems, or with type systems using lists, arrays, etc., a simple
rule-based mapping won't work, I think. JCas is a nice convenience API, but I
don't think it's more than that. I'm not sure the effort of implementing a
mapping rule framework is worth the outcome.
Well, in most cases you probably just need to map type names and
features. The only issue I see here is that many annotators need to
access the covered text, but that could be a new (and differently named)
"virtual" feature of an annotation.
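As a rough illustration of what such a name-level mapping could look like, here is a small plain-Java sketch. Everything in it is an assumption made for the example: the Annotation and MappingRule classes, the type names "a.Token" and "b.Word", and the feature names are hypothetical and not part of UIMA. It renames a type and its features and exposes the covered text as a derived "virtual" feature.

```java
import java.util.HashMap;
import java.util.Map;

public class TypeMappingSketch {
    // Hypothetical generic annotation: a type name, offsets, and named feature values.
    static class Annotation {
        final String typeName;
        final int begin;
        final int end;
        final Map<String, Object> features = new HashMap<>();
        Annotation(String typeName, int begin, int end) {
            this.typeName = typeName;
            this.begin = begin;
            this.end = end;
        }
    }

    // Hypothetical rule: renames one type and selected features of that type.
    static class MappingRule {
        final String sourceType;
        final String targetType;
        final Map<String, String> featureNames = new HashMap<>();
        String coveredTextFeature; // null means "no virtual text feature"

        MappingRule(String sourceType, String targetType) {
            this.sourceType = sourceType;
            this.targetType = targetType;
        }

        MappingRule mapFeature(String from, String to) {
            featureNames.put(from, to);
            return this;
        }

        MappingRule withCoveredText(String featureName) {
            this.coveredTextFeature = featureName;
            return this;
        }

        Annotation apply(Annotation source, String documentText) {
            if (!source.typeName.equals(sourceType)) {
                throw new IllegalArgumentException("rule does not apply to " + source.typeName);
            }
            Annotation target = new Annotation(targetType, source.begin, source.end);
            // Copy only the features the rule names, under their new names.
            for (Map.Entry<String, String> e : featureNames.entrySet()) {
                if (source.features.containsKey(e.getKey())) {
                    target.features.put(e.getValue(), source.features.get(e.getKey()));
                }
            }
            // "Virtual" feature: the covered text is derived from the offsets.
            if (coveredTextFeature != null) {
                target.features.put(coveredTextFeature,
                        documentText.substring(source.begin, source.end));
            }
            return target;
        }
    }

    public static void main(String[] args) {
        String text = "Dogs bark";
        Annotation token = new Annotation("a.Token", 0, 4);
        token.features.put("pos", "N");

        MappingRule rule = new MappingRule("a.Token", "b.Word")
                .mapFeature("pos", "partOfSpeech")
                .withCoveredText("text");

        Annotation word = rule.apply(token, text);
        System.out.println(word.typeName + " " + word.features);
    }
}
```

Note that this only covers flat renaming. Features whose values are themselves typed instances, lists or arrays would need more than a name-to-name table.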
I only used types to define what kind of information I need to
store/exchange, and did not abuse the type names themselves to encode
information. I also don't think that works well when you start
generating a lot of types.
Sure, with growing type system complexity the issue of integrating
different components gets worse.
I actually kind of like our solution for the flow controllers: there we
define two standard cases, and if a user needs something more complex,
he can put it in his own implementation.
Jörn