*Results of Chunker evaluation with public data*

*Component:* Chunker

*Data:* CoNLL 2000

*Tester:* colen

*Tagging Perf 1.5.0:*

*Tagging Perf 1.5.1:*
Precision: 0.9255923572240226
Recall: 0.9220610430991112
F-Measure: 0.9238233255623465

*Comment:*
The ChunkerEvaluator tool was not available in 1.5.0. To check whether anything
changed, I compared the output of 1.5.0 and 1.5.1 in a way similar to the
"Compatibility Test with OpenNLP 1.5.0 SourceForge Models". The output
changed slightly because of a bug fixed in 1.5.1 (a missing trailing closing
bracket).
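
A comparison of this kind could be sketched roughly as follows. This is only an illustration, not the procedure actually used: the helper name `compare_outputs`, the assumption that an affected 1.5.0 line differs from its 1.5.1 counterpart only by a missing final `]`, and the sample lines are all hypothetical.

```python
def compare_outputs(old_lines, new_lines):
    """Classify line pairs from two chunker runs.

    Returns (identical, bracket_only, other), where bracket_only counts
    pairs whose only difference is a trailing "]" present in the new line.
    """
    same = bracket_only = other = 0
    for old, new in zip(old_lines, new_lines):
        old, new = old.rstrip(), new.rstrip()
        if old == new:
            same += 1
        elif new.endswith("]") and new[:-1].rstrip() == old:
            # Assumed shape of the 1.5.0 bug: final closing bracket missing.
            bracket_only += 1
        else:
            other += 1
    # Lines present in only one of the two outputs count as other differences.
    other += abs(len(old_lines) - len(new_lines))
    return same, bracket_only, other


# Invented sample lines, for illustration only.
old_output = ["[NP He_PRP ] [VP runs_VBZ", "[NP It_PRP ] ._."]
new_output = ["[NP He_PRP ] [VP runs_VBZ ]", "[NP It_PRP ] ._."]
print(compare_outputs(old_output, new_output))
```

In practice the two lists would be read from the output.txt files produced by each version; any pair counted under "other" would need a closer look.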

------------------------

*Component:* Chunker

*Data:* Arvores Deitadas

*Tester:* colen

*Tagging Perf 1.5.0:*

*Tagging Perf 1.5.1:*
Precision: 0.9406086044071353
Recall: 0.9364814040952779
F-Measure: 0.9385404669668097

*Comment:*
The AD format for the Chunker was not available in 1.5.0.
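
As a sanity check on the numbers above, the F-Measure can be recomputed from the reported precision and recall. This is a minimal sketch, assuming the evaluator reports the balanced F-measure (the harmonic mean of the two); the function name `f_measure` is mine, not part of the tool.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for 1.5.1 in the two sections above.
conll = f_measure(0.9255923572240226, 0.9220610430991112)
ad = f_measure(0.9406086044071353, 0.9364814040952779)

print(f"CoNLL 2000:       {conll:.6f}")
print(f"Arvores Deitadas: {ad:.6f}")
```

Under that assumption, both recomputed values agree with the reported F-Measure figures to at least six decimal places.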

=========
Test details
=========

CoNLL 2000
================================================================================
1.5.1
--------------------------------------------------------------------------------
$ time ./bin/opennlp ChunkerTrainerME -lang en -encoding UTF8 -iterations 100 -cutoff 5 -data train.txt -model en-chunker.bin
real    4m39.469s

--------
$ time ./bin/opennlp ChunkerEvaluator -encoding UTF8 -data test.txt -model en-chunker.bin
Average: 161,7 sent/s
Total: 2013 sent
Runtime: 12.446s

Precision: 0.9255923572240226
Recall: 0.9220610430991112
F-Measure: 0.9238233255623465

real    0m13.356s

--------
$ time ./bin/opennlp ChunkerME en-chunker.bin < test_pos.txt > output.txt
Loading Chunker model ... done (0,650s)

Average: 167,3 sent/s
Total: 2012 sent
Runtime: 12.024s

real    0m12.906s


1.5.0
--------------------------------------------------------------------------------
$ time ./bin/opennlp ChunkerTrainerME -lang en -encoding UTF8 -iterations 100 -cutoff 5 -data ../apache-opennlp/train.txt -model en-chunker.bin
real    5m12.107s

--------
$ time ./bin/opennlp ChunkerME en-chunker.bin < ../apache-opennlp/test_pos.txt > output.txt
Loading Chunker model ... done (0,649s)

Average: 169,5 sent/s
Total: 2012 sent
Runtime: 11.869s

real    0m12.752s
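
The "Average" figures in the CoNLL 2000 logs above can be cross-checked against "Total" and "Runtime". A small sketch, assuming that Average is simply Total divided by Runtime and that the tool printed it with a locale-dependent decimal comma; the helper names are hypothetical.

```python
def throughput(sentences, runtime_seconds):
    """Sentences processed per second."""
    return sentences / runtime_seconds


def format_decimal_comma(value):
    """Format with one decimal place and a decimal comma, as in the logs."""
    return f"{value:.1f}".replace(".", ",")


# (Total, Runtime in seconds) copied from the logs above.
runs = {
    "1.5.1 ChunkerEvaluator": (2013, 12.446),
    "1.5.1 ChunkerME": (2012, 12.024),
    "1.5.0 ChunkerME": (2012, 11.869),
}

for name, (total, runtime) in runs.items():
    print(f"{name}: {format_decimal_comma(throughput(total, runtime))} sent/s")
```

Under these assumptions the recomputed rates match the logged averages (161,7, 167,3 and 169,5 sent/s), so tagging speed is essentially the same in both versions.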

Arvores Deitadas
================================================================================

1.5.1
--------------------------------------------------------------------------------
$ bin/opennlp ChunkerConverter ad -encoding ISO-8859-1 -data ../wrk/corpus/Bosque_CF_8.0.ad.txt > bosque-chunk
$ time ./bin/opennlp ChunkerTrainerME -lang pt -encoding UTF8 -iterations 100 -cutoff 5 -data bosque-chunk_train.txt -model pt-chunker.bin

real    0m56.778s

--------
$ time ./bin/opennlp ChunkerEvaluator -encoding UTF8 -data bosque-chunk_test.txt -model pt-chunker.bin
Loading Chunker model ... done (0,245s)
Average: 145,5 sent/s
Total: 411 sent
Runtime: 2.825s

Precision: 0.9406086044071353
Recall: 0.9364814040952779
F-Measure: 0.9385404669668097

real    0m3.332s
