Hi, there are subtle differences between the generic word tokenizer and the English one. For example, the English one doesn't use these characters as delimiters:
« » — < > \r

Does anybody know a good reason for these differences? Judging by the svn changelog, the changes made to EnglishWordTokenizer.java do not look specific to English.

Regards
Daniel

-- 
http://www.danielnaber.de

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel