+1 Fine-grained error analysis FTW!

On Sat, Feb 25, 2012 at 4:57 PM, [email protected] <[email protected]> wrote:
> Hi,
>
> I implemented a new EvaluationMonitor for the POS Tagger. It generates a
> confusion matrix <http://en.wikipedia.org/wiki/Confusion_matrix> for each
> token that was not tagged properly.
>
> Example output (Portuguese):
>
> ...
> Accuracy for [que]: 91,34%
> 1316 ocurrencies. Confusion matrix (line: reference; column: predicted):
>           |  conj-s  |  pron-indp  |   adv  |  pron-det  || % Accu ||
> conj-s    |>    537 <|         40  |     0  |         0  || 93,07% ||
> pron-indp |      59  |>       661 <|     0  |         0  || 91,81% ||
> adv       |       2  |         12  |>    4 <|         0  || 22,22% ||
> pron-det  |       0  |          1  |     0  |>        0 <||     0% ||
>
> Accuracy for [o]: 98,48%
> 3949 ocurrencies. Confusion matrix (line: reference; column: predicted):
>           |   art  |  pron-det  |  pron-pers  |  ,  || % Accu ||
> art       |> 3857 <|         4  |          0  |  1  || 99,87% ||
> pron-det  |    36  |>       24 <|          0  |  0  ||    40% ||
> pron-pers |    19  |         0  |>         8 <|  0  || 29,63% ||
> ,         |     0  |         0  |          0  |> 0 <||     0% ||
>
> Accuracy for [a]: 96%
> 4395 ocurrencies. Confusion matrix (line: reference; column: predicted):
>           |   art  |  prp  |  pron-pers  |  pron-det  || % Accu ||
> art       |> 3291 <|   54  |          0  |         0  || 98,39% ||
> prp       |   107  |> 922 <|          0  |         0  ||  89,6% ||
> pron-pers |     4  |    0  |>         4 <|         0  ||    50% ||
> pron-det  |    11  |    0  |          0  |>        2 <|| 15,38% ||
> ...
>
> Do you think it is interesting to make this report available?
> I would add it to the CLI, and it would be activated by a new argument that
> passes in an output file for the report.
>
> Thank you,
> William

-- 
Jason Baldridge
Associate Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge
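
For anyone who wants to see the idea behind the per-token report in the quoted message, here is a minimal Python sketch. It is not the OpenNLP EvaluationMonitor code, which is not shown in this thread; the function names `per_token_confusion` and `accuracy` and the sample data are hypothetical, made up for illustration. It assumes the evaluation yields (token, reference tag, predicted tag) triples and keeps a matrix only for tokens that were mistagged at least once.

```python
from collections import Counter, defaultdict


def per_token_confusion(samples):
    """Group (token, reference_tag, predicted_tag) triples by token.

    Returns {token: Counter({(reference, predicted): count})}, keeping
    only tokens that were mistagged at least once.
    """
    matrices = defaultdict(Counter)
    for token, reference, predicted in samples:
        matrices[token][(reference, predicted)] += 1
    return {
        token: cells
        for token, cells in matrices.items()
        if any(ref != pred for ref, pred in cells)
    }


def accuracy(cells):
    """Fraction of occurrences on the diagonal (reference == predicted)."""
    total = sum(cells.values())
    correct = sum(n for (ref, pred), n in cells.items() if ref == pred)
    return correct / total


# Hypothetical toy data, not taken from the report quoted above.
samples = [
    ("que", "conj-s", "conj-s"),
    ("que", "conj-s", "pron-indp"),
    ("que", "pron-indp", "pron-indp"),
    ("que", "pron-indp", "pron-indp"),
    ("casa", "n", "n"),
]

report = per_token_confusion(samples)
for token, cells in report.items():
    print(f"Accuracy for [{token}]: {accuracy(cells):.2%}")
    for (ref, pred), n in sorted(cells.items()):
        print(f"  reference={ref} predicted={pred}: {n}")
```

Storing each cell under a (reference, predicted) key keeps the matrix sparse, so only tag pairs that actually occur for a token take up space; a dense table like the ones in the report can be laid out from these counts at print time.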
