Re: Hybrid POS tagger

Jörn Kottmann Tue, 15 Mar 2011 03:45:01 -0700

On 3/13/11 12:54 AM, Radu Simionescu wrote:

Hello


I am making a POS tagger for Romanian for my dissertation. I want to be
able to restrict the outcomes even more than just using a dictionary. I want to
use some rules for disambiguation, based on the context. This would allow me to
use a smaller corpus, and also to fix consistent output mistakes.

So I want to be able to give the POS tagger the possible set of outcomes for
each word of the input, separately. Since the training of a model doesn't
really use the POS dictionary, I figured I could make this parser by making
small modifications to the API, because the dictionary can change from one
sentence/word to the other. Please let me know if I am wrong.

There is no out-of-the-box support for this, but I believe it should be easy to implement. All you need to do is write a custom sequence validator which does what you described above.

Just have a look at the POSTaggerME class; you need to modify the constructor to give it a custom feature generator. We should open a jira issue and extend our API to pass in a sequence validator object.
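
To make the idea concrete, here is a rough sketch of a validator that restricts each token to its own set of allowed tags. It is only an illustration under assumptions: TagSequenceValidator is a hypothetical stand-in declared here so the example compiles on its own (its method is modelled on the shape of OpenNLP's SequenceValidator), PerTokenTagValidator is an invented name, and the tags in the demo are made up. How such an object gets wired into POSTaggerME is not shown, since that is the API change discussed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in interface so this sketch is self-contained.
// It is NOT the OpenNLP type; it only mirrors the general shape of a
// sequence validator: "is this tag acceptable at position i?"
interface TagSequenceValidator {
    boolean validSequence(int i, String[] tokens, String[] previousTags, String candidateTag);
}

// Restricts each token position to its own set of allowed tags.
// A null or empty set at a position means "no restriction there".
final class PerTokenTagValidator implements TagSequenceValidator {
    private final List<Set<String>> allowedTags;

    PerTokenTagValidator(List<Set<String>> allowedTags) {
        this.allowedTags = allowedTags;
    }

    @Override
    public boolean validSequence(int i, String[] tokens, String[] previousTags, String candidateTag) {
        // Positions without an entry are left unrestricted.
        if (i < 0 || i >= allowedTags.size()) {
            return true;
        }
        Set<String> allowed = allowedTags.get(i);
        // Context-based rules could also inspect previousTags here.
        return allowed == null || allowed.isEmpty() || allowed.contains(candidateTag);
    }
}

final class PerTokenTagValidatorDemo {
    public static void main(String[] args) {
        String[] tokens = {"un", "exemplu"};

        // Made-up tag sets: the first token is restricted, the second is not.
        List<Set<String>> allowed = new ArrayList<>();
        allowed.add(new HashSet<>(Arrays.asList("DET", "NUM")));
        allowed.add(null);

        TagSequenceValidator validator = new PerTokenTagValidator(allowed);
        System.out.println(validator.validSequence(0, tokens, new String[0], "DET"));
        System.out.println(validator.validSequence(0, tokens, new String[0], "VERB"));
    }
}
```

The allowed sets would be rebuilt for every sentence, for example from the dictionary plus the disambiguation rules, which is what lets the restriction change from one sentence or word to the next.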

Jörn
