Yes, I think that would be useful. We could create a command line
reporter which can print these statistics, at least to start with.
Can that be done with the new listener interface we just created for
our evaluators? If not, I suggest that we rename it and either add a
method for correctly classified samples to it or indicate that with a
flag.
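
A rough sketch of the "add a method" option, to make the idea concrete. All
names here (EvaluationListener, CountingReporter, evaluate) are hypothetical
and are not existing OpenNLP classes; the evaluate method is only a stand-in
for a real evaluator that compares reference and predicted samples.

```java
import java.util.Arrays;
import java.util.List;

public class ListenerSketch {

    /** Hypothetical listener; the names are illustrative, not OpenNLP API. */
    public interface EvaluationListener<T> {
        void correctlyClassified(T reference, T prediction);
        void misclassified(T reference, T prediction);
    }

    /** Command line reporter that only counts the two kinds of events. */
    public static final class CountingReporter<T> implements EvaluationListener<T> {
        public int correct;
        public int wrong;

        @Override
        public void correctlyClassified(T reference, T prediction) {
            correct++;
        }

        @Override
        public void misclassified(T reference, T prediction) {
            wrong++;
        }

        public String summary() {
            return "correct: " + correct + "; misclassified: " + wrong;
        }
    }

    /** Stand-in for an evaluator: compares samples pairwise and notifies the listener. */
    public static <T> void evaluate(List<T> references, List<T> predictions,
                                    EvaluationListener<T> listener) {
        if (references.size() != predictions.size()) {
            throw new IllegalArgumentException("reference and prediction lists differ in length");
        }
        for (int i = 0; i < references.size(); i++) {
            T reference = references.get(i);
            T prediction = predictions.get(i);
            if (reference.equals(prediction)) {
                listener.correctlyClassified(reference, prediction);
            } else {
                listener.misclassified(reference, prediction);
            }
        }
    }

    public static void main(String[] args) {
        CountingReporter<String> reporter = new CountingReporter<>();
        // Toy samples, invented purely for illustration.
        evaluate(Arrays.asList("NP", "VP", "PP"), Arrays.asList("NP", "NP", "PP"), reporter);
        System.out.println(reporter.summary());
    }
}
```

With two callbacks the reporter does not need a flag to tell the cases apart;
a single method with a boolean argument would work just as well.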
Jörn
On 8/17/11 6:51 PM, [email protected] wrote:
Hi,
Would it be useful to have detailed output from FMeasure when using spans
with types? For example, we could use it to see the individual precision and
recall for person, organization, and date in a NameFinder model, or for the
Chunker. Something like the output from
CONLL2000 <http://www.cnts.ua.ac.be/conll2000/chunking/output.html>:
processed 961 tokens with 459 phrases; found: 539 phrases; correct: 371.
accuracy:  84.08%; precision:  68.83%; recall:  80.83%; FB1:  74.35
             ADJP: precision:   0.00%; recall:   0.00%; FB1:   0.00
             ADVP: precision:  45.45%; recall:  62.50%; FB1:  52.63
               NP: precision:  64.98%; recall:  78.63%; FB1:  71.16
               PP: precision:  83.18%; recall:  98.89%; FB1:  90.36
             SBAR: precision:  66.67%; recall:  33.33%; FB1:  44.44
               VP: precision:  69.00%; recall:  79.31%; FB1:  73.80
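
A minimal sketch of how such figures could be accumulated per span type. The
class PerTypeFMeasure and its methods are hypothetical and not part of
OpenNLP. The sample output above only gives raw counts for the overall line
(459 reference phrases, 539 found, 371 correct), so the usage below feeds in
just those under an "overall" label; the per-type counts are not stated and
are left out.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not an OpenNLP class: keeps span counts per type. */
public class PerTypeFMeasure {

    // Per type: [0] = target (reference spans), [1] = selected (predicted spans),
    // [2] = correct (predicted spans that match a reference span).
    private final Map<String, int[]> countsByType = new TreeMap<>();

    public void add(String type, int target, int selected, int correct) {
        int[] counts = countsByType.computeIfAbsent(type, key -> new int[3]);
        counts[0] += target;
        counts[1] += selected;
        counts[2] += correct;
    }

    public static double precision(int correct, int selected) {
        return selected == 0 ? 0.0 : (double) correct / selected;
    }

    public static double recall(int correct, int target) {
        return target == 0 ? 0.0 : (double) correct / target;
    }

    /** Harmonic mean of precision and recall; 0 when both are 0. */
    public static double f1(double precision, double recall) {
        double sum = precision + recall;
        return sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
    }

    /** One line per type, in alphabetical order, in a conlleval-like layout. */
    public String report() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, int[]> entry : countsByType.entrySet()) {
            int[] counts = entry.getValue();
            double p = precision(counts[2], counts[1]);
            double r = recall(counts[2], counts[0]);
            out.append(String.format(Locale.ROOT,
                    "%s: precision: %.2f%%; recall: %.2f%%; FB1: %.2f%n",
                    entry.getKey(), 100 * p, 100 * r, 100 * f1(p, r)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        PerTypeFMeasure measure = new PerTypeFMeasure();
        // Overall counts taken from the first line of the sample output.
        measure.add("overall", 459, 539, 371);
        // prints "overall: precision: 68.83%; recall: 80.83%; FB1: 74.35"
        System.out.print(measure.report());
    }
}
```

In a real evaluator, add would be called with the span type (person,
organization, NP, VP, and so on) each time a sample is scored.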
I will need something like that for my master's dissertation. If it is
useful, I could add it to OpenNLP.
Thanks,
William