Re: Merging the output of multiple name finders

Jörn Kottmann Tue, 17 Apr 2012 07:08:54 -0700

On 04/17/2012 04:00 PM, Jim - FooBar(); wrote:

On 17/04/12 13:52, Jörn Kottmann wrote:
If you don't want to handle these cases, you can simply copy all names together
into a list, and then do evaluation on this list.
This approach works with our evaluation, but will usually be an issue for applications which expect output
where the ambiguities mentioned earlier are resolved.
That is exactly what my current AggregateNameFinder does... It just gets rid of duplicates...
I propose that we make a simple baseline implementation
which takes all output spans, orders them and then resolves
the ambiguities based on the order. This will prefer longer
names over shorter names, but ignores the type.

There are more sophisticated ways of handling this,
e.g. taking probabilities from the statistical name finders into
account, but these might be a bit more restrictive as well.
I agree on the baseline implementation, but I don't see why the spans need to be ordered and why ambiguities need resolving... The only true ambiguity that can occur is having the exact same span with a different type, in which case we need to make a decision. Taking the probabilities from maxent is also a bit naive because you will not know which model to trust (maybe the weakest model gives you the highest probs)...

You can have overlapping spans, which usually indicate a classification mistake and cannot be handled nicely by applications which expect non-overlapping output, as a single name finder produces.

Therefore they should be resolved by the baseline.
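
A minimal sketch of what such a baseline could look like, under a few assumptions: BaselineSpanMerger and its nested Span class are hypothetical stand-ins written for this example (not existing OpenNLP classes), spans are half-open token ranges [start, end), and "ordering" means sorting by start offset with the longer span first when starts are equal. A span is then dropped whenever it overlaps the previously kept one, and the type is ignored when deciding.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of OpenNLP.
class BaselineSpanMerger {

    // Minimal stand-in for a name span: token range [start, end) plus a type label.
    static final class Span {
        final int start;
        final int end;
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return "[" + start + ".." + end + ") " + type;
        }
    }

    // Merges the spans of several name finders into a non-overlapping list.
    static List<Span> merge(List<Span> allSpans) {
        List<Span> sorted = new ArrayList<>(allSpans);
        // Order by start offset; for equal starts the longer span comes first.
        // List.sort is stable, so exact duplicates keep their input order.
        sorted.sort((a, b) -> a.start != b.start
                ? Integer.compare(a.start, b.start)
                : Integer.compare(b.end, a.end));

        List<Span> merged = new ArrayList<>();
        int lastEnd = Integer.MIN_VALUE;
        for (Span s : sorted) {
            // Keep a span only if it begins at or after the end of the last kept span.
            if (s.start >= lastEnd) {
                merged.add(s);
                lastEnd = s.end;
            }
        }
        return merged;
    }
}
```

With this ordering a span that is contained in another one, or that starts at the same token but is shorter, always loses. For a partial overlap the span that starts first wins regardless of length, and for the exact same span with two different types the one that came first in the input wins, so the order in which the name finders are combined acts as an implicit priority.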

Jörn
