Hi,
I implemented a new EvaluationMonitor for the POS Tagger. It generates
a confusion
matrix <http://en.wikipedia.org/wiki/Confusion_matrix> for each token that
was not tagged properly.
Example output (Portuguese):
...
Accuracy for [que]: 91,34%
1316 occurrences. Confusion matrix (row: reference; column: predicted):
          | conj-s | pron-indp | adv | pron-det || % Accu ||
conj-s    |>  537 <|    40     |  0  |    0     || 93,07% ||
pron-indp |   59   |>   661   <|  0  |    0     || 91,81% ||
adv       |    2   |    12     |> 4 <|    0     || 22,22% ||
pron-det  |    0   |     1     |  0  |>    0   <||  0%    ||
Accuracy for [o]: 98,48%
3949 occurrences. Confusion matrix (row: reference; column: predicted):
          |  art   | pron-det | pron-pers |  ,  || % Accu ||
art       |> 3857 <|    4     |     0     |  1  || 99,87% ||
pron-det  |   36   |>   24   <|     0     |  0  || 40%    ||
pron-pers |   19   |    0     |>    8    <|  0  || 29,63% ||
,         |    0   |    0     |     0     |> 0 <||  0%    ||
Accuracy for [a]: 96%
4395 occurrences. Confusion matrix (row: reference; column: predicted):
          |  art   |  prp  | pron-pers | pron-det || % Accu ||
art       |> 3291 <|  54   |     0     |    0     || 98,39% ||
prp       |  107   |> 922 <|     0     |    0     || 89,6%  ||
pron-pers |    4   |   0   |>    4    <|    0     || 50%    ||
pron-det  |   11   |   0   |     0     |>    2   <|| 15,38% ||
...
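For reference, the core of the monitor is roughly the following (a
simplified sketch, not the exact code: the class name and the nested-map
counting structure are illustrative, only the EvaluationMonitor callbacks
and the POSSample accessors come from OpenNLP):

import java.util.HashMap;
import java.util.Map;

import opennlp.tools.postag.POSSample;
import opennlp.tools.util.eval.EvaluationMonitor;

public class TokenConfusionMatrixMonitor
    implements EvaluationMonitor<POSSample> {

  // token -> reference tag -> predicted tag -> count
  private final Map<String, Map<String, Map<String, Integer>>> matrices =
      new HashMap<String, Map<String, Map<String, Integer>>>();

  public void correctlyClassified(POSSample reference, POSSample prediction) {
    count(reference, prediction);
  }

  public void missclassified(POSSample reference, POSSample prediction) {
    count(reference, prediction);
  }

  // Every token of every sentence is counted, so the diagonal (correct
  // taggings) accumulates along with the errors. When printing, only
  // tokens whose matrix has off-diagonal counts would be reported.
  private void count(POSSample reference, POSSample prediction) {
    String[] tokens = reference.getSentence();
    String[] refTags = reference.getTags();
    String[] predTags = prediction.getTags();
    for (int i = 0; i < tokens.length; i++) {
      Map<String, Map<String, Integer>> byRef = matrices.get(tokens[i]);
      if (byRef == null) {
        byRef = new HashMap<String, Map<String, Integer>>();
        matrices.put(tokens[i], byRef);
      }
      Map<String, Integer> byPred = byRef.get(refTags[i]);
      if (byPred == null) {
        byPred = new HashMap<String, Integer>();
        byRef.put(refTags[i], byPred);
      }
      Integer c = byPred.get(predTags[i]);
      byPred.put(predTags[i], c == null ? 1 : c + 1);
    }
  }
}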
Do you think it would be interesting to make this report available?
I would add it to the CLI, activated by a new argument that takes the
output file for the report.
Thank you,
William