I am new to OpenNLP. I have the basic Java code running.
I want to create a training set for Twitter topics. I have the Training API
page with the sample code, but I cannot work out from it how to create
and modify the training set.
Can anyone help? This is the sample I am working from:
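import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSSample;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.postag.WordTagSampleStream;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.model.ModelType;

POSModel model = null;
InputStream dataIn = null;
try {
    // Training data: one sentence per line
    dataIn = new FileInputStream("en-pos.train");
    ObjectStream<String> lineStream =
        new PlainTextByLineStream(dataIn, "UTF-8");
    ObjectStream<POSSample> sampleStream =
        new WordTagSampleStream(lineStream);
    model = POSTaggerME.train("en", sampleStream, ModelType.MAXENT,
        null, null, 100, 5);
}
catch (IOException e) {
    // Failed to read or parse training data, training failed
    e.printStackTrace();
}
finally {
    if (dataIn != null) {
        try {
            dataIn.close();
        }
        catch (IOException e) {
            // Not an issue, training already finished.
            // The exception should be logged and investigated
            // if part of a production system.
            e.printStackTrace();
        }
    }
}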
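Note that the sample above trains a part-of-speech tagger, so en-pos.train holds tagged sentences rather than topics. For topic labels, the Document Categorizer mentioned in the quoted message below is probably the closer fit. As far as I understand the OpenNLP manual, its training file is plain text with one document per line: the category first, then whitespace, then the text. Here is a minimal sketch that writes such a file using only the Java standard library. The class name, the file name en-topics.train, the category labels and the example tweets are all made up for illustration, and replacing spaces in a label with underscores is my own assumption to keep the label a single token.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class TopicTrainingFile {

    // Builds one training line: category label, a single space, then the text.
    // Whitespace runs (including newlines) in the text are collapsed so that
    // each document stays on one line. Spaces inside the label are replaced
    // with underscores (an assumption, so the label remains one token).
    static String toTrainingLine(String category, String text) {
        String label = category.trim().replaceAll("\\s+", "_");
        String body = text.trim().replaceAll("\\s+", " ");
        return label + " " + body;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical hand-labelled tweets; replace with your own data.
        String[][] labelled = {
            {"Sports", "What a match tonight!"},
            {"Politics", "The vote on the new bill\nis scheduled for Monday."}
        };

        List<String> lines = new ArrayList<>();
        for (String[] row : labelled) {
            lines.add(toTrainingLine(row[0], row[1]));
        }

        // Hypothetical output file name.
        Path out = Paths.get("en-topics.train");
        Files.write(out, lines, StandardCharsets.UTF_8);
    }
}
```

To modify the training set you would then edit this file: add, remove or relabel lines, and retrain. A model generally needs many examples per category, so the two lines here only illustrate the layout.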
----- Original Message -----
From: "Massimo Tarantelli" <[email protected]>
To: <[email protected]>
Sent: Friday, October 25, 2013 11:47 AM
Subject: Document categorizer model
Dear all,
has anyone trained a Document Categorizer model for English?
Thanks
--
*Massimo Tarantelli*
_Innovation Engineering_
Via Napoleone Colajanni 4 (00191 Roma)
T +39 06 45 425 111
E [email protected]