Hi, Jairo, I think you will have to perform two conversions:
1) From CONLL02 to the NameFinder format:

    bin/opennlp TokenNameFinderConverter conll02 -data esp.train -lang es -types per > esp_nf.train

2) From NameFinder format to SentenceDetector format:

    bin/opennlp SentenceDetectorConverter -data esp_nf.train -encoding <your sys encoding> -detokenizer es-detokenizer.xml

You will have to create a detokenizer dictionary. Maybe the English one will work for you:
http://svn.apache.org/viewvc/incubator/opennlp/trunk/opennlp-tools/lang/en/tokenizer/en-detokenizer.xml?view=markup

*NOTE:* When I tried the first command with OpenNLP 1.5.2, I got the following error:

    $ bin/opennlp TokenNameFinderConverter conll02 -data esp.train -lang es -types per
    IO error while reading training data or indexing data: Expected three fields per line in training data!

Is it a bug, or am I doing something wrong?

On Wed, Feb 8, 2012 at 2:18 PM, Jairo Sarabia <[email protected]> wrote:

> Hello,
>
> Forgive my ignorance but, how is it done?
>
> Thank you!,
>
> Jairo
>
> 2012/2/7 Joern Kottmann <[email protected]>
>
> > Hello,
> >
> > sorry we don't offer a model currently. But with the new tooling
> > it should be fairly easy to train one on the CONLL02 data.
> >
> > Hope that helps,
> > Jörn
> >
> > On Mon, Feb 6, 2012 at 5:55 PM, Jairo Sarabia
> > <[email protected]> wrote:
> >
> > > Hello all!,
> > >
> > > I'm interested in the extraction of data from the Spanish DBpedia dumps and
> > > need a Spanish sentence detector. I have seen that there is no model for
> > > opennlp 1.5. How I can obtain a model for this?
> > > It is important that is for the version 1.5
> > >
> > > Thanks in advance!,
> > >
> > > Jairo Sarabia
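
P.S. One way to narrow down the "Expected three fields per line" error is to count how many columns each line of esp.train actually has before blaming the converter. Below is a minimal sketch in Python, assuming the columns are whitespace-separated and blank lines separate sentences. The function name is my own and is not part of OpenNLP.

```python
from collections import Counter


def field_count_histogram(lines):
    """Map 'number of whitespace-separated fields' -> 'number of lines'.

    Blank lines are skipped, on the assumption that they only
    separate sentences in the training file.
    """
    counts = Counter()
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # sentence boundary, not a data line
        counts[len(stripped.split())] += 1
    return counts


# Example use (the path and the encoding are assumptions, adjust to your copy):
#
#     with open("esp.train", encoding="latin-1") as f:
#         print(field_count_histogram(f))
```

If every data line reports the same field count and it is not three, the mismatch is between the file layout and what the converter expects, not a stray malformed line.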
