Hello,

try a newer version of OpenNLP. If I remember correctly, this was added in 1.5.2; otherwise have a look at the trunk version.
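If upgrading is not an option, the data can probably also be prepared by hand. Below is a minimal sketch in Python, assuming the Leipzig sentences.txt has one "<id><TAB><sentence>" entry per line and that the sentence detector trainer expects one sentence per line. The function name strip_leipzig_ids and the output file name are made up for illustration and are not part of OpenNLP.

```python
import sys


def strip_leipzig_ids(lines):
    """Yield the sentence text from lines assumed to look like '<id>\\t<sentence>'."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # Skip blank lines
            continue
        # Split on the first tab only, so tabs inside a sentence survive
        _, sep, sentence = line.partition("\t")
        # If there is no tab (unexpected format), keep the whole line
        yield sentence.strip() if sep else line.strip()


# Hypothetical usage: python strip_ids.py sentences.txt tr-sent.train
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], encoding="utf-8") as src, \
            open(sys.argv[2], "w", encoding="utf-8") as dst:
        for sentence in strip_leipzig_ids(src):
            dst.write(sentence + "\n")
```

The resulting file should then be usable as input for the SentenceDetectorTrainer tool; running bin/opennlp without arguments lists the tools and arguments available in your version.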
Jörn

On 08/22/2012 03:34 PM, Duygu aralıoğlu wrote:
I need to obtain training data for Turkish to use in Sentence Detector training, to get tr-sent.bin, which will later be used in both OpenNLP and wikipediaminer.

I have downloaded corpora for Turkish from http://corpora.uni-leipzig.de/download.html and then ran the command:

    $ bin/opennlp DoccatConverter leipzig -lang tr -data Leipzig/tr100k/sentences.txt >> lang.train

However, there is no DoccatConverter tool. How can I obtain the training data from sentences.txt?

By the way, I am working with opennlp-1.5.0.

Thanks in advance...
Duygu
