I am new to OpenNLP. I have the basic Java code below running.

I want to create a training set for Twitter topics. I have the Training API page with the sample code, but I cannot work out from it how to create and modify the training set itself.

Can anyone help?

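import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSSample;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.postag.WordTagSampleStream;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.model.ModelType;

POSModel model = null;

InputStream dataIn = null;
try {
  // Training data: one sentence per line, tokens written as word_tag
  dataIn = new FileInputStream("en-pos.train");
  ObjectStream<String> lineStream =
      new PlainTextByLineStream(dataIn, "UTF-8");
  ObjectStream<POSSample> sampleStream = new WordTagSampleStream(lineStream);

  model = POSTaggerME.train("en", sampleStream, ModelType.MAXENT,
      null, null, 100, 5);
}
catch (IOException e) {
  // Failed to read or parse training data, training failed
  e.printStackTrace();
}
finally {
  if (dataIn != null) {
    try {
      dataIn.close();
    }
    catch (IOException e) {
      // Not an issue, training already finished.
      // The exception should be logged and investigated
      // if part of a production system.
      e.printStackTrace();
    }
  }
}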
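
For context on what the code above reads: as far as I understand, WordTagSampleStream expects en-pos.train to be plain text with one sentence per line, each token written as word_tag and tokens separated by spaces. Below is a minimal sketch of producing such a file by hand. The class name PosTrainingFileSketch, the helper toLine, and the example sentences and tags are my own assumptions for illustration; none of them come from OpenNLP.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: writes a tiny training file
// in the word_tag format (one sentence per line).
final class PosTrainingFileSketch {

    // Joins parallel token and tag arrays into one training line.
    static String toLine(String[] tokens, String[] tags) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException(
                "tokens and tags must have the same length");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(tokens[i]).append('_').append(tags[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Illustrative sentences with Penn Treebank style tags (assumed).
        List<String> lines = new ArrayList<>();
        lines.add(toLine(
            new String[] {"I", "love", "this", "phone", "."},
            new String[] {"PRP", "VBP", "DT", "NN", "."}));
        lines.add(toLine(
            new String[] {"The", "match", "starts", "tonight", "."},
            new String[] {"DT", "NN", "VBZ", "NN", "."}));

        // Output path defaults to the file name used in the training code.
        String out = args.length > 0 ? args[0] : "en-pos.train";
        Files.write(Paths.get(out), lines, StandardCharsets.UTF_8);
    }
}
```

Modifying the training set would then just mean adding or editing lines in that file and retraining. A real training set needs far more sentences than this.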



----- Original Message ----- From: "Massimo Tarantelli" <[email protected]>
To: <[email protected]>
Sent: Friday, October 25, 2013 11:47 AM
Subject: Document categorizer model


Dear all,
has anyone trained a document categorizer model for English?
Thanks
--

*Massimo Tarantelli*

_Innovation Engineering_
Via Napoleone Colajanni 4 (00191 Roma)
T +39 06 45 425 111
E [email protected]



