OpenNLP maxent model trained with wrong encoding

Richard Eckart de Castilho Tue, 01 Mar 2016 14:19:00 -0800

Hi all,

I noticed that the German POS tagger maxent model for OpenNLP, available from 
SourceForge, was trained with the wrong encoding setting. Apparently the 
input data was UTF-8, but it was read as ISO-8859-1. The perceptron model is not 
affected. I only examined the NER and POS models, not the tokenizer or sentence 
splitter models.
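
To illustrate the kind of damage this causes, here is a minimal sketch in Python (chosen for brevity; OpenNLP itself is Java). It only shows what happens to a German word when UTF-8 bytes are decoded as ISO-8859-1. The function names are hypothetical, and this is not necessarily how the models were examined.

```python
# Illustration only: UTF-8 text decoded with the wrong charset (ISO-8859-1).
# All function names here are hypothetical helpers, not part of OpenNLP.

def misdecode(text: str) -> str:
    """Encode as UTF-8, then decode the bytes as ISO-8859-1 (the suspected mistake)."""
    return text.encode("utf-8").decode("iso-8859-1")


def looks_misdecoded(text: str) -> bool:
    """Heuristic: True if reversing the mistake yields different, valid UTF-8 text."""
    try:
        repaired = text.encode("iso-8859-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in ISO-8859-1, or the bytes are not valid UTF-8:
        # the text was probably decoded correctly in the first place.
        return False
    return repaired != text


if __name__ == "__main__":
    word = "Mädchen"
    broken = misdecode(word)
    print(broken)                    # MÃ¤dchen
    print(looks_misdecoded(broken))  # True
    print(looks_misdecoded(word))    # False
```

A model trained on such input would have learned forms like "MÃ¤dchen" instead of "Mädchen", so correctly decoded text with umlauts or ß would not match what the model saw during training.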


Best,

-- Richard
