Hi again,

The Spanish and Dutch NER models are also affected. This was just a bit more
difficult to figure out because the models internally lower-case the features.
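
To illustrate why lower-casing hides the problem, here is a minimal Python
sketch. It is not how the models were inspected or trained; the helper names
and the sample word are made up for illustration. It reproduces the suspected
mistake (UTF-8 bytes read as ISO-8859-1) and shows that a simple round-trip
check no longer flags the string once it has been lower-cased.

```python
# Illustrative sketch only; helper names and the sample word are hypothetical.

def misread_utf8_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode the bytes as ISO-8859-1 (the suspected mistake)."""
    return text.encode("utf-8").decode("iso-8859-1")


def looks_misdecoded(text: str) -> bool:
    """Heuristic: True if re-encoding as ISO-8859-1 yields valid UTF-8 that differs."""
    try:
        repaired = text.encode("iso-8859-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False
    return repaired != text


word = "España"
garbled = misread_utf8_as_latin1(word)  # "ñ" becomes the two characters "Ã±"
lowered = garbled.lower()               # "Ã" becomes "ã", so the byte pattern changes

print(garbled, looks_misdecoded(garbled))
print(lowered, looks_misdecoded(lowered))
```

The check flags the garbled string but not its lower-cased form, because
lower-casing changes the first character of the pair so that the bytes no
longer form valid UTF-8.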

Cheers,

-- Richard

> On 01.03.2016, at 23:13, Richard Eckart de Castilho <[email protected]> wrote:
> 
> Hi all,
> 
> I noticed that the OpenNLP German POS Tagger maxent model available from 
> Sourceforge has been trained using the wrong encoding setting. Apparently the 
> input data was UTF-8, but it was read as ISO8859-1. The perceptron model is 
> not affected. I only examined NER and POS models, not tokenizer or sentence 
> splitter models.
> 
> Best,
> 
> -- Richard
