I think the root of all our problems here is that we're trying to
generalise something that was never intended for that purpose. If you
remember, before writing the AggregateNameFinder I had similar code in
the TokenNameFinderEvaluator class. After your suggestion I moved it to
the name-find package, but I had strong objections to doing so because
the merging of results should be an evaluation issue only. Now, however,
we've dug ourselves a hole. We're effectively saying to the user "you can
use the aggregate finder instead of the individual ones, but we will
decide on some things for you!" See? It is no longer an evaluation issue:
it has become a separate name finder that people might use for
annotating their corpus, which can lead to strange behaviour if we don't
resolve nested tags. We tried to improve the evaluation and we've ended
up discussing hard-coded rules for how such a name finder must behave...
Jim
On 17/04/12 16:13, Jörn Kottmann wrote:
On 04/17/2012 05:00 PM, Jim - FooBar(); wrote:
Could you name a few of the "applications" that cannot deal with
overlapping/intersecting spans?
- UIs are easier to build when they can assume names do not overlap
- Inserting overlapping names into a Parse tree is not possible
- Annotation guidelines often say names should not overlap, so you
  don't want to produce overlapping names
- Some data formats cannot represent overlapping annotations
Anyway, the point is that a user might assume that names do not overlap
and then write code which cannot deal with this case. Therefore they
need to resolve overlapping names in some way.
Jörn
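
To make "resolve overlapping names in some way" concrete, here is a
minimal sketch of one possible policy: keep the longest span and drop
anything that intersects it. This is only an illustration, not what the
AggregateNameFinder does or should do. OverlapResolver and NameSpan are
hypothetical names standing in for the real classes, and "longest wins,
earlier start breaks ties" is an assumption, which is exactly the kind of
hard-coded decision being debated above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OverlapResolver {

    /** Hypothetical stand-in for a name span: token offsets [start, end) plus a type. */
    public static final class NameSpan {
        public final int start;
        public final int end;
        public final String type;

        public NameSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        public int length() {
            return end - start;
        }

        /** True if the two half-open ranges share at least one token. */
        public boolean intersects(NameSpan other) {
            return start < other.end && other.start < end;
        }

        @Override
        public String toString() {
            return type + "[" + start + "," + end + ")";
        }
    }

    /**
     * Assumed policy: the longest span wins, ties go to the earlier start.
     * Returns the surviving spans ordered by start offset.
     */
    public static List<NameSpan> resolve(List<NameSpan> spans) {
        List<NameSpan> sorted = new ArrayList<>(spans);
        // Longest first; for equal length, earlier start first.
        sorted.sort((a, b) -> {
            int byLength = Integer.compare(b.length(), a.length());
            return byLength != 0 ? byLength : Integer.compare(a.start, b.start);
        });

        List<NameSpan> kept = new ArrayList<>();
        for (NameSpan candidate : sorted) {
            boolean clashes = false;
            for (NameSpan k : kept) {
                if (candidate.intersects(k)) {
                    clashes = true;
                    break;
                }
            }
            if (!clashes) {
                kept.add(candidate);
            }
        }

        // Restore document order for the caller.
        kept.sort((a, b) -> Integer.compare(a.start, b.start));
        return kept;
    }

    public static void main(String[] args) {
        List<NameSpan> merged = Arrays.asList(
                new NameSpan(0, 2, "person"),        // from one finder
                new NameSpan(1, 4, "organization"),  // from another, overlaps the person
                new NameSpan(6, 7, "location"));
        System.out.println(resolve(merged));
    }
}
```

With this policy the person span is silently dropped in favour of the
longer organization span. A different rule (first finder wins, highest
probability wins, keep the nested span) would give a different answer,
so whichever rule is chosen probably needs to be something the user can
see and override.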