On 04/19/2012 06:20 PM, Joan Codina wrote:
then with the sentences with all tokens separated by spaces, I need to
merge the words, adding <space>, but I don't know how to do it with
the DictionaryDetokenizer
./opennlp DictionaryDetokenizer ../models/en-detokenizer.xml < ../models/CoNLL2009-ST-English-train.sent
as it merges the sentences but does not add the <space>
It should insert <SPLIT> tags where it removes a space, so the tokenizer can
learn that there is something to split. Input should be one sentence per line.
What output do you get?
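
For comparison, here is a rough sketch of the kind of output to expect. It is not
the OpenNLP code and does not read en-detokenizer.xml; the two rule sets and the
function name to_training_line are made up for illustration, and quote handling
is left out. It only shows where <SPLIT> should appear when a token is glued to
its neighbour.

```python
# Hypothetical merge rules for illustration only; the real rules are
# defined in the detokenizer dictionary (en-detokenizer.xml).
MERGE_TO_LEFT = {".", ",", ";", ":", "!", "?", ")"}   # attach to previous token
MERGE_TO_RIGHT = {"("}                                 # attach to next token


def to_training_line(tokenized_sentence):
    """Turn one whitespace-tokenized sentence into a line with <SPLIT> tags.

    A <SPLIT> tag replaces the space between two tokens that are written
    together in normal text; all other tokens stay separated by a space.
    """
    tokens = tokenized_sentence.split()
    parts = []
    for i, token in enumerate(tokens):
        if i > 0:
            glued = token in MERGE_TO_LEFT or tokens[i - 1] in MERGE_TO_RIGHT
            parts.append("<SPLIT>" if glued else " ")
        parts.append(token)
    return "".join(parts)


if __name__ == "__main__":
    print(to_training_line("He left ( reluctantly ) , then returned ."))
    # prints: He left (<SPLIT>reluctantly<SPLIT>)<SPLIT>, then returned<SPLIT>.
```

If the detokenizer output has the tokens merged but no <SPLIT> between them, that
is the difference to look for.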
Jörn