As part of my GSoC project for apertium-eng-cat, I have trained the English
tagger using supervised perceptron training. This has improved tagger
accuracy from 74% to around 90%. Before GSoC ends in three weeks, I will
update the wiki with details of the entire training process so that the
tagger can be retrained easily.
Two important things:
1. If you are the developer of a pair relying on apertium-eng, you should
call the tagger in the pipe as "apertium-tagger -gx eng.prob" (instead of
"apertium-tagger -g eng.prob").
2. The new tagger is not perfect, and may have introduced errors that did
not occur before. Ideally, new restrictions should be added using an MTX
file, but in the meantime we can fix them using CG. Feel free to contribute!
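
For pairs with several modes, the change in point 1 can be applied mechanically. Below is a minimal sketch in Python, assuming the pipeline is available as a plain command string. The helper name use_gx_flag and the sample pipeline (including the file name eng.automorf.bin) are only illustrative and are not taken from any particular pair.

```python
import re

# Matches "apertium-tagger -g" only when "-g" is followed by whitespace,
# so a call that already uses "-gx" is left untouched.
_OLD_CALL = re.compile(r"(apertium-tagger\s+)-g(?=\s)")


def use_gx_flag(pipeline: str) -> str:
    """Return the pipeline with 'apertium-tagger -g' changed to 'apertium-tagger -gx'."""
    return _OLD_CALL.sub(r"\1-gx", pipeline)


if __name__ == "__main__":
    # Illustrative pipeline; the real one depends on the pair's modes.
    old = "lt-proc eng.automorf.bin | apertium-tagger -g eng.prob | apertium-pretransfer"
    print(use_gx_flag(old))
```

Running the function a second time on its own output changes nothing, so it is safe to apply to pipelines that were already updated.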
Thanks!
Marc
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff