I once trained a POS tagger model for Indonesian. I defined my own tags
for Indonesian words, some of which differ from the English POS tags.

I also used the 'token_tag' pair format in a sentence list, and I didn't
provide any tag dictionary.

It worked well without any problems. I was able to create an Indonesian
POS tagger model and use it to evaluate some Indonesian text as well.
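
In case it helps, here is a minimal Python sketch of the sentence-list
format described above, using the Italian example line from the question.
It is not OpenNLP code. The function names are hypothetical, and splitting
each item on its last underscore is my assumption about how a 'word_TAG'
pair should be read.

```python
def parse_tagged_sentence(line):
    """Split one 'word_TAG word_TAG ...' line into (word, tag) pairs."""
    pairs = []
    for item in line.split():
        # Assumption: split on the LAST underscore, so that a token
        # may itself contain an underscore.
        word, sep, tag = item.rpartition("_")
        if not sep or not word or not tag:
            raise ValueError(f"not a word_TAG pair: {item!r}")
        pairs.append((word, tag))
    return pairs


def format_tagged_sentence(pairs):
    """Join (word, tag) pairs back into one training line."""
    return " ".join(f"{word}_{tag}" for word, tag in pairs)


# One whole sentence per line, taken from the example in the question.
sentence = "la_ART casa_NOUN e_CON la_ART strada_NOUN"
pairs = parse_tagged_sentence(sentence)
print(pairs)
print(format_tagged_sentence(pairs))
```

A check like this can catch items with a missing tag before the file is
used for training.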

Hope this can help.

--
Dhito

On Fri, Jul 27, 2012 at 2:27 PM, Alessandra Donnini <[email protected]> wrote:

> OK, I know I'm new to OpenNLP, and my question may be wrong, but I would
> like to understand. Can anyone answer?
> Thanks
> Alessandra
>
> Begin forwarded message:
>
> > From: Alessandra Donnini <[email protected]>
> > Date: 20 July 2012 17:04:27 GMT+02:00
> > To: [email protected]
> > Subject: Training a POS tagger model
> >
> > I would like to train a POS tagger model for the Italian language.
> > I have some questions:
> > - May I use a list of token_tag pairs in place of a sentence list?
> > Something like:
> > casa_NOUN
> > e_CON (conjunction)
> > ...
> > in place of
> >
> > la_ART casa_NOUN e_CON la_ART strada_NOUN
> > ...
> > because I have found an Italian word list.
> >
> > - Do I need to provide a tag dictionary? Is there a default tag
> > dictionary?
> >
> > thanks
> > Alessandra
> >
> >
>
>
