Hi all, I noticed that the OpenNLP German POS Tagger maxent model available from SourceForge was trained with the wrong encoding setting. Apparently the input data was UTF-8, but it was read as ISO-8859-1. The perceptron model is not affected. I only examined the NER and POS models, not the tokenizer or sentence splitter models.
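For anyone unfamiliar with this kind of mismatch, here is a minimal sketch of what happens when UTF-8 bytes are decoded as ISO-8859-1. The function names and the sample word are made up for illustration; this is not taken from the OpenNLP training code or from the actual training data.

```python
# Illustrative only: the helper names and the sample word are assumptions,
# not part of OpenNLP or its training corpus.

def misread_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def undo_misread(text: str) -> str:
    """Reverse the mistake: re-encode as ISO-8859-1, decode as UTF-8."""
    return text.encode("iso-8859-1").decode("utf-8")


word = "Mädchen"
garbled = misread_utf8_as_latin1(word)

# The single character "ä" is two bytes in UTF-8 (0xC3 0xA4), and each byte
# becomes its own character when read as ISO-8859-1.
print(garbled)                # MÃ¤dchen
print(undo_misread(garbled))  # Mädchen
```

A model trained on text garbled this way would presumably have learned the two-character sequences instead of the umlauts, so correctly decoded input would not match what it saw during training.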
Best, -- Richard
