Re: abbreviation diccionary format

Jörn Kottmann Thu, 19 Apr 2012 10:12:18 -0700

On 04/19/2012 06:20 PM, Joan Codina wrote:

Then, with the sentences with all tokens separated by spaces, I need to merge the words, adding <space>, but I don't know how to do it with the DictionaryDetokenizer:

./opennlp DictionaryDetokenizer ../models/en-detokenizer.xml < ../models/CoNLL2009-ST-English-train.sent

It merges the sentences but does not add the <space>.


It should insert <SPLIT> tags for certain spaces, so the tokenizer can learn
that there is something to split. Input should be one sentence per line.
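
To illustrate the format I mean, here is a minimal Python sketch. It is not the OpenNLP implementation: the function name mark_splits and the two rule sets are made up for the example, and the real merge rules come from en-detokenizer.xml. It takes one whitespace-tokenized sentence and writes <SPLIT> wherever two tokens are joined without a space.

```python
SPLIT = "<SPLIT>"

# Hypothetical rule sets for illustration only; the actual rules
# are defined in the detokenizer dictionary (en-detokenizer.xml).
MERGE_TO_LEFT = {",", ".", "!", "?", ";", ":", ")"}   # attach to previous token
MERGE_TO_RIGHT = {"("}                                  # attach to next token


def mark_splits(line):
    """Join whitespace-separated tokens, marking merged boundaries with <SPLIT>."""
    tokens = line.split()
    out = []
    for i, tok in enumerate(tokens):
        if i > 0:
            prev = tokens[i - 1]
            if tok in MERGE_TO_LEFT or prev in MERGE_TO_RIGHT:
                # The tokens are merged, so record the hidden token boundary.
                out.append(SPLIT)
            else:
                # An ordinary space already marks the token boundary.
                out.append(" ")
        out.append(tok)
    return "".join(out)


if __name__ == "__main__":
    print(mark_splits("Hello , world ."))
```

With these assumed rules, the tokenized line "Hello , world ." becomes "Hello<SPLIT>, world<SPLIT>.", which is the kind of line the tokenizer trainer can learn from.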

What output do you get?

Jörn
