On 03/05/12 12:16, Jörn Kottmann wrote:
On 05/03/2012 10:58 AM, Jim - FooBar(); wrote:
I can also provide the "AggregateNameFinder" class which takes any
number of name-finders and merges their results in order to get
better evaluation statistics. Internally, it uses the
"NameFinderME.dropOverlappingSpans()" method to get rid of nested
spans. That method, however, does the simplistic thing of keeping the
earliest span and ignores the type of the span completely. I think
being able to merge results from several name-finders is a killer
feature that a lot of people will appreciate, even though I don't think
keeping the earliest span is sensible when trying to evaluate several
finders on multiple entity types...
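
To make the "keep the earliest span" behaviour concrete, here is a minimal
sketch of that kind of merge. It is not the actual AggregateNameFinder and
not OpenNLP's code: SketchSpan and EarliestSpanMerge are made-up names for
illustration, and the tie-breaking rule (longer span first) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a name span; NOT OpenNLP's Span class.
final class SketchSpan {
    final int start;   // index of the first token, inclusive
    final int end;     // index after the last token, exclusive
    final String type; // entity type, e.g. "person"

    SketchSpan(int start, int end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }

    // True if the two half-open ranges share at least one token.
    boolean overlaps(SketchSpan other) {
        return start < other.end && other.start < end;
    }
}

// Hypothetical helper: pools the output of several finders and keeps the earliest span.
final class EarliestSpanMerge {
    static List<SketchSpan> merge(List<List<SketchSpan>> perFinder) {
        List<SketchSpan> all = new ArrayList<>();
        for (List<SketchSpan> spans : perFinder) {
            all.addAll(spans);
        }
        // Earliest start first; on equal starts the longer span comes first (assumption).
        all.sort((a, b) -> a.start != b.start
                ? Integer.compare(a.start, b.start)
                : Integer.compare(b.end, a.end));

        List<SketchSpan> kept = new ArrayList<>();
        for (SketchSpan s : all) {
            // Spans are sorted by start, so comparing with the last kept span is enough.
            // Note that the type is never consulted, which is the limitation discussed above.
            if (kept.isEmpty() || !kept.get(kept.size() - 1).overlaps(s)) {
                kept.add(s);
            }
        }
        return kept;
    }
}
```

With one finder returning a "person" span over tokens 0-2 and another
returning a "location" span over tokens 1-3, the "person" span wins simply
because it starts first, whatever the finders' reliability.
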
+1 to implement it based on NameFinderME.dropOverlappingSpans.
In my opinion that is still a good baseline. We can come up with more
specialized and sophisticated approaches, e.g. based on probabilities
and limited to statistical name finders.
Jörn
Yes, I agree it is not a bad baseline, but pretty soon we'll have to
either look at the probabilities (if someone is trying to merge several
models) or at the actual class of the name-finder that gave a particular
prediction, and reason on that. For example, if a prediction came from a
dictionary, there is really no point in doubting it. It must be
correct! Anyway, I'd love to see this feature in 1.5.3, and a couple of
weeks (what William needs) is not that long...
Jim
PS: btw, I've actually been using the aggregate name-finder in my
private build for almost 3 weeks now... I'm passing it 2 dictionary
finders of different types and a maxent model that can also predict 2
types. Everything works just fine! :)
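
For what it's worth, the "look at the probabilities or at the source of the
prediction" idea could be sketched roughly as below. This is only an
illustration under assumptions: ScoredSpan and ConfidenceMerge are
hypothetical names, every prediction is assumed to carry a confidence, and
dictionary hits are assumed to be given a confidence of 1.0 so that they
always beat an overlapping statistical prediction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical span carrying a confidence score; NOT an OpenNLP class.
final class ScoredSpan {
    final int start;         // first token, inclusive
    final int end;           // last token, exclusive
    final String type;       // entity type
    final double confidence; // model probability, or 1.0 for a dictionary hit (assumption)

    ScoredSpan(int start, int end, String type, double confidence) {
        this.start = start;
        this.end = end;
        this.type = type;
        this.confidence = confidence;
    }

    boolean overlaps(ScoredSpan other) {
        return start < other.end && other.start < end;
    }
}

// Hypothetical helper: greedy merge that prefers the most confident span on overlap.
final class ConfidenceMerge {
    static final double DICTIONARY_CONFIDENCE = 1.0; // assumption: dictionary hits are trusted fully

    static List<ScoredSpan> merge(List<ScoredSpan> candidates) {
        List<ScoredSpan> byConfidence = new ArrayList<>(candidates);
        // Highest confidence first; the sort is stable, so ties keep their input order.
        byConfidence.sort((a, b) -> Double.compare(b.confidence, a.confidence));

        List<ScoredSpan> kept = new ArrayList<>();
        for (ScoredSpan candidate : byConfidence) {
            boolean clash = false;
            for (ScoredSpan k : kept) {
                if (k.overlaps(candidate)) {
                    clash = true;
                    break;
                }
            }
            if (!clash) {
                kept.add(candidate);
            }
        }
        // Return the survivors in document order.
        kept.sort((a, b) -> Integer.compare(a.start, b.start));
        return kept;
    }
}
```

Unlike the earliest-span baseline, a dictionary hit over tokens 1-3 would
survive here even when a maxent prediction with confidence 0.6 starts
earlier at tokens 0-2.
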