Hi, I made a change to the BaseTagger. From the change log:
-The part-of-speech tagger for most languages can now be extended by adding entries to the file org/languagetool/resource/XX/added.txt (XX being the language code). The format is "fullform baseform postags", three columns separated by tabs. This makes it easier for users (and developers) to extend the POS tagger, as they don't need to export, modify, and re-create the binary dictionary for every change.
 The languages that cannot yet use added.txt are:
 -Chinese, Japanese, Esperanto - these don't extend BaseTagger
 -Romanian, Catalan - these have their own way of using a Manual tagger. It would be great if the maintainers could check whether using BaseTagger for that is feasible (GermanTagger already does that): call getWordTagger() and you'll get a tagger that merges results from the binary dictionary and added.txt, assuming you return a path to added.txt in getManualAdditionsFileName().

Regards
 Daniel
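
For illustration, here is a minimal, self-contained sketch of the idea described in the change log entry: reading "fullform baseform postags" lines and merging them with the readings from a main dictionary. It is not LanguageTool's actual code. The class AddedTxtSketch, its methods, the "baseform/postag" string representation, the sample words and tags, and the skipping of empty and "#" lines are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not part of LanguageTool.
public class AddedTxtSketch {

  // Parses lines of the form "fullform<TAB>baseform<TAB>postags".
  // Each reading is stored as the string "baseform/postags".
  public static Map<String, List<String>> parseAdditions(List<String> lines) {
    Map<String, List<String>> additions = new HashMap<>();
    for (String line : lines) {
      // Assumption of this sketch: empty lines and "#" comments are skipped.
      if (line.trim().isEmpty() || line.startsWith("#")) {
        continue;
      }
      String[] columns = line.split("\t");
      if (columns.length != 3) {
        throw new IllegalArgumentException("Expected 3 tab-separated columns: " + line);
      }
      additions.computeIfAbsent(columns[0], k -> new ArrayList<>())
          .add(columns[1] + "/" + columns[2]);
    }
    return additions;
  }

  // Merges readings from the main dictionary with those from the additions,
  // keeping dictionary readings first and dropping exact duplicates.
  public static List<String> tag(String word,
                                 Map<String, List<String>> dictionary,
                                 Map<String, List<String>> additions) {
    List<String> merged =
        new ArrayList<>(dictionary.getOrDefault(word, Collections.<String>emptyList()));
    for (String reading : additions.getOrDefault(word, Collections.<String>emptyList())) {
      if (!merged.contains(reading)) {
        merged.add(reading);
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    // Stand-in for the binary dictionary (example data only).
    Map<String, List<String>> dictionary = new HashMap<>();
    dictionary.put("house", Arrays.asList("house/NN"));

    // Stand-in for the contents of added.txt (example data only).
    Map<String, List<String>> additions = parseAdditions(Arrays.asList(
        "# manual additions",
        "blogs\tblog\tNNS",
        "house\thouse\tVB"));

    System.out.println(tag("blogs", dictionary, additions)); // prints [blog/NNS]
    System.out.println(tag("house", dictionary, additions)); // prints [house/NN, house/VB]
  }
}
```

The sketch keeps the two sources separate and merges them at lookup time, which mirrors the point of the change: added.txt can be edited without re-creating the binary dictionary.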