Hi,

I made a change to the BaseTagger. From the change log:

-The part-of-speech tagger for most languages can now be extended by adding
  entries to the file org/languagetool/resource/XX/added.txt (XX being the
  language code). The format is "fullform baseform postags", three columns
  separated by tabs. This makes it easier for users (and developers) to extend
  the POS tagger, as they don't need to export, modify, and re-create the
  binary dictionary for every change.
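
To illustrate the three-column format, here is a minimal sketch of how such a file could be read. It is not the actual loader: the class and method names are made up for this mail, the two sample entries (with English-style tags) are invented, and skipping blank lines and '#' comments is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: NOT LanguageTool's actual loader for added.txt.
class AddedTxtSketch {

  /** One reading of a full form: its base form plus a POS tag. */
  static final class Reading {
    final String baseform;
    final String postag;

    Reading(String baseform, String postag) {
      this.baseform = baseform;
      this.postag = postag;
    }

    @Override
    public String toString() {
      return baseform + "/" + postag;
    }
  }

  /** Parses "fullform<TAB>baseform<TAB>postags" lines into a lookup map. */
  static Map<String, List<Reading>> parse(List<String> lines) {
    Map<String, List<Reading>> result = new LinkedHashMap<>();
    for (String line : lines) {
      // Assumption of this sketch: blank lines and '#' comments are skipped.
      if (line.trim().isEmpty() || line.startsWith("#")) {
        continue;
      }
      String[] cols = line.split("\t");
      if (cols.length != 3) {
        throw new IllegalArgumentException("Expected 3 tab-separated columns: " + line);
      }
      // A full form may appear on several lines, one line per reading.
      result.computeIfAbsent(cols[0], k -> new ArrayList<>())
            .add(new Reading(cols[1], cols[2]));
    }
    return result;
  }

  public static void main(String[] args) {
    // Invented sample entries, shown only to make the format concrete.
    List<String> lines = Arrays.asList(
        "googled\tgoogle\tVBD",
        "googled\tgoogle\tVBN");
    System.out.println(parse(lines));
  }
}
```

The point is only that each line carries one full form, its base form, and its tag(s), so adding a word means adding a line of plain text.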

The languages that cannot yet use added.txt are:
- Chinese, Japanese, Esperanto: their taggers don't extend BaseTagger.
- Romanian, Catalan: these have their own way of using a Manual tagger.

It would be great if the Romanian and Catalan maintainers could check whether
using BaseTagger for that is feasible (GermanTagger already does this). Call
getWordTagger() and you'll get a tagger that merges results from the binary
dictionary and added.txt, assuming you return a path to added.txt from
getManualAdditionsFileName().
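
To make "merges results" concrete, here is a conceptual sketch. The Lookup interface and the merge() helper are hypothetical stand-ins, not LanguageTool classes, and the sketch assumes that dictionary readings come first and duplicates are dropped; the real tagger returned by getWordTagger() may behave differently in those details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Conceptual sketch only: these types are hypothetical, not LanguageTool classes.
class MergedLookupSketch {

  /** Hypothetical stand-in for a source that maps a word to "baseform/postag" readings. */
  interface Lookup {
    List<String> readings(String word);
  }

  /** Combines both sources: dictionary readings first, then additions, without duplicates. */
  static Lookup merge(Lookup dictionary, Lookup additions) {
    return word -> {
      Set<String> merged = new LinkedHashSet<>(dictionary.readings(word));
      merged.addAll(additions.readings(word));
      return new ArrayList<>(merged);
    };
  }

  public static void main(String[] args) {
    // Stand-in for the binary dictionary (invented content).
    Lookup dictionary = word -> word.equals("walked")
        ? Arrays.asList("walk/VBD", "walk/VBN")
        : Collections.<String>emptyList();
    // Stand-in for entries loaded from added.txt (invented content).
    Lookup additions = word -> word.equals("googled")
        ? Arrays.asList("google/VBD")
        : Collections.<String>emptyList();

    Lookup merged = merge(dictionary, additions);
    System.out.println(merged.readings("walked"));
    System.out.println(merged.readings("googled"));
  }
}
```

With this shape, a word missing from the binary dictionary is still tagged as soon as it has a line in added.txt.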

Regards
  Daniel


_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
