Re: POSTagger Perceptron API

Jörn Kottmann Thu, 12 Jan 2012 06:57:34 -0800

On 1/12/12 3:42 PM, Svetoslav Marinov wrote:

Hi all,


There is a Perceptron model for the Swedish POS tagger. How does one call it with 
the API? I checked the API pages as well as the documentation, but there 
is only a reference to the MaxEnt model:

POSTaggerME tagger  = new POSTaggerME(model);

So what is the method for using the Perceptron model?


The decision is made at training time: depending on the settings, either
maxent or perceptron is used to train a model. The resulting model can
be loaded with the code above, and OpenNLP takes care of setting
everything up correctly behind the scenes.

We distribute a perceptron model for English.

For information about how to set the training algorithm please consult
our documentation:
http://incubator.apache.org/opennlp/documentation/1.5.2-incubating/manual/opennlp.html#tools.postagger.training
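
As a rough sketch of what "setting the training algorithm" can look like, a
training parameters file that selects the perceptron might read as follows.
The key names and the PERCEPTRON value are assumed here to match the 1.5.2
tools, and the iteration and cutoff numbers are only placeholders, so please
check all of them against the manual section linked above:

```properties
# Hypothetical training parameters file; verify key names against the manual.
# Selects the trainer: MAXENT for maximum entropy, PERCEPTRON for the perceptron.
Algorithm=PERCEPTRON
# Number of training iterations (placeholder value).
Iterations=300
# Minimum number of times a feature must occur to be kept (placeholder value).
Cutoff=0
```

A file like this would be handed to the trainer at training time. The tagging
code does not change, because the model produced is loaded the same way as a
maxent one.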

I am also curious about the performance of the trained models. Is there any 
reference to precision/recall? Can one get in touch with the people who have 
trained the models available?

If one creates a new model (say, for sentence detection or POS tagging with 
a different set of POS tags), can one upload it?

We currently don't have a way to share models or take care of their distribution, mostly because of copyright/legal issues.

The way we think it should be fixed is to share open source training data.

Anyway, we have some instructions on how to train the POS tagger on various public corpora in our documentation.

I suggest that you take a look there:
http://incubator.apache.org/opennlp/documentation/1.5.2-incubating/manual/opennlp.html#tools.corpora

Hope that helps,
Jörn
