Hello Jason,

I do not have the training data in the correct format, and I never took the time to convert it. Another way to solve this would be to wrap the old models in our new model package format.
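For reference, the wrapping could look roughly like the sketch below. This is only a sketch against the 1.5.x API: the file names are hypothetical, the exact SentenceModel constructor may differ between releases, and it assumes the old maxent model's outcomes and features are still compatible with the 1.5 sentence detector code.

    import java.io.File;
    import java.io.FileOutputStream;

    import opennlp.maxent.io.SuffixSensitiveGISModelReader;
    import opennlp.model.AbstractModel;
    import opennlp.tools.sentdetect.SentenceModel;

    public class WrapOldSentenceModel {

      public static void main(String[] args) throws Exception {
        // Read the old 1.3-style GIS maxent model (file name hypothetical).
        AbstractModel oldModel = new SuffixSensitiveGISModelReader(
            new File("SpanishSent.bin.gz")).getModel();

        // Repackage it as a 1.5 model; no abbreviation dictionary here,
        // hence the null argument.
        SentenceModel model = new SentenceModel("es", oldModel, true, null);

        FileOutputStream out = new FileOutputStream("es-sent.bin");
        model.serialize(out);
        out.close();
      }
    }

The same idea would apply to the old tokenizer model with TokenizerModel instead of SentenceModel.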
The sentence detector and tokenizer can now also be trained on the CoNLL data. Should we do that instead? Note that to train the tokenizer on already-tokenized data we need a detokenizer dictionary, so the original surface text can be reconstructed from the token sequence.
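Roughly, training the tokenizer from the tokenized CoNLL sentences could look like this. It is only a sketch: the file names and the preprocessing into one whitespace-tokenized sentence per line are assumptions, and the exact TokenizerME.train signature may vary slightly between 1.5.x releases.

    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;

    import opennlp.tools.tokenize.DetokenizationDictionary;
    import opennlp.tools.tokenize.DictionaryDetokenizer;
    import opennlp.tools.tokenize.TokenSample;
    import opennlp.tools.tokenize.TokenizerME;
    import opennlp.tools.tokenize.TokenizerModel;
    import opennlp.tools.util.ObjectStream;

    public class TrainSpanishTokenizer {

      public static void main(String[] args) throws Exception {
        // The detokenizer dictionary says, for tokens like '.', ',' or '(',
        // whether they attach to the left, the right, or both neighbors.
        final DictionaryDetokenizer detokenizer = new DictionaryDetokenizer(
            new DetokenizationDictionary(
                new FileInputStream("es-detokenizer.xml")));

        // One tokenized sentence per line, pre-extracted from the CoNLL
        // data (preprocessing assumed, file name hypothetical).
        final BufferedReader in = new BufferedReader(new InputStreamReader(
            new FileInputStream("es-conll-tokens.txt"), "UTF-8"));

        ObjectStream<TokenSample> samples = new ObjectStream<TokenSample>() {
          public TokenSample read() throws IOException {
            String line = in.readLine();
            while (line != null && line.trim().length() == 0)
              line = in.readLine(); // skip blank lines
            if (line == null)
              return null;
            // The detokenizer reconstructs the surface text and the gold
            // token spans from the token sequence.
            return new TokenSample(detokenizer, line.trim().split("\\s+"));
          }
          public void reset() {
            throw new UnsupportedOperationException();
          }
          public void close() throws IOException {
            in.close();
          }
        };

        TokenizerModel model = TokenizerME.train("es", samples, true, 5, 100);

        FileOutputStream out = new FileOutputStream("es-token.bin");
        model.serialize(out);
        out.close();
      }
    }

A similar stream of SentenceSample objects built from the detokenized sentences could then feed SentenceDetectorME.train for the sentence model.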
Jörn

On 5/13/11 10:33 PM, Jason Baldridge wrote:

It seems as though the Spanish models for tokenization and sentence splitting are no longer around; e.g., the models download page only has NER models:

http://opennlp.sourceforge.net/models-1.5/

But there were models before:

http://opennlp.sourceforge.net/models-1.3/spanish/

Anyone know what happened to them? Sorry if I'm forgetting something...

Jason
