Hi,

You have to run these 4 phases in order, because lemmatization needs the tokens and their part-of-speech tags as input:
1. sentence detection
2. tokenization
3. POS tagging
4. lemmatization

Theoretically it is possible to do the lemmatization with an OpenNLP model and the other phases in a different way (some hard-coded algorithm?), but I think the simplest way is to use the 4 OpenNLP models if they are already precomputed.

Regards
Leszek

From: "T. Kuro Kurosaka" <k...@bhlab.com>
To: users@opennlp.apache.org; lesze...@interia.eu;
Sent: Tuesday 2023-01-10, 20:53
Subject: Re: Portuguese lemmatization model?

> Thank you, Leszek!
> It looks promising. It did lemmatize "azuis" -> "azul".
> Are these 4 char filters absolutely required to run the lemmatizers correctly?
>
> Kuro
>
> On 1/10/23 12:45 AM, lesze...@interia.eu wrote:
> > Hi
> >
> > As far as I know there is no Portuguese lemmatizer on the official OpenNLP site.
> > In general such models are not easily available, at least for less popular languages.
> >
> > I developed an application to automatically compute sentence-detector, tokenizer, POS-tagger and lemmatizer models from Universal Dependencies language files.
> > For now models are generated for 19 languages (including Portuguese).
> >
> > Main app: https://github.com/abzif/babzel
> > Pre-trained models: https://abzif.github.io/babzel/models.html
> >
> > Enjoy!
> > Leszek Piotrowicz
> >
> > From: "T. Kuro Kurosaka"
> > To: users@opennlp.apache.org;
> > Sent: Tuesday 2023-01-10, 2:29
> > Subject: Portuguese lemmatization model?
> >
> >> Is there a pre-trained lemmatization model for Portuguese and other popular languages?
> >>
> >> --
> >> T. "Kuro" Kurosaka, Orinda, California, USA
> >
> --
> T. "Kuro" Kurosaka, Orinda, California, USA
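
As an illustration of the four-phase order described at the top of this message, here is a minimal Java sketch of how the stages feed each other. It uses only the standard library so it runs without any model files. The class name LemmaPipeline, the toy() factory and its tiny lexicon are hypothetical stand-ins invented for this example, not part of OpenNLP or babzel. With real models, the four stages would presumably be method references on OpenNLP's SentenceDetectorME, TokenizerME, POSTaggerME and LemmatizerME, which is an assumption and is not wired up here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

/** Chains the four phases in the required order; each stage is pluggable. */
final class LemmaPipeline {
    private final Function<String, String[]> sentenceDetector;          // text -> sentences
    private final Function<String, String[]> tokenizer;                 // sentence -> tokens
    private final Function<String[], String[]> posTagger;               // tokens -> POS tags
    private final BiFunction<String[], String[], String[]> lemmatizer;  // tokens + tags -> lemmas

    LemmaPipeline(Function<String, String[]> sentenceDetector,
                  Function<String, String[]> tokenizer,
                  Function<String[], String[]> posTagger,
                  BiFunction<String[], String[], String[]> lemmatizer) {
        this.sentenceDetector = sentenceDetector;
        this.tokenizer = tokenizer;
        this.posTagger = posTagger;
        this.lemmatizer = lemmatizer;
    }

    /** Returns one array of lemmas per detected sentence. */
    List<String[]> lemmatize(String text) {
        List<String[]> result = new ArrayList<>();
        for (String sentence : sentenceDetector.apply(text)) {   // phase 1
            String[] tokens = tokenizer.apply(sentence);          // phase 2
            String[] tags = posTagger.apply(tokens);              // phase 3
            result.add(lemmatizer.apply(tokens, tags));           // phase 4 needs both inputs
        }
        return result;
    }

    /**
     * Hypothetical toy stages so the sketch runs without models. NOT real NLP.
     * Assumption: with OpenNLP one would pass something like
     * sentenceDetector::sentDetect, tokenizer::tokenize, tagger::tag,
     * lemmatizer::lemmatize on the corresponding *ME objects instead.
     */
    static LemmaPipeline toy() {
        // Lemma lookup keyed by "token|tag", mirroring that lemmas depend on the tag.
        Map<String, String> lexicon = Map.of("azuis|ADJ", "azul", "carros|NOUN", "carro");
        return new LemmaPipeline(
            text -> text.split("(?<=[.!?])\\s+"),
            sentence -> sentence.replaceAll("[.!?]", "").trim().split("\\s+"),
            tokens -> Arrays.stream(tokens)
                            .map(t -> t.equals("azuis") ? "ADJ" : "NOUN")
                            .toArray(String[]::new),
            (tokens, tags) -> {
                String[] lemmas = new String[tokens.length];
                for (int i = 0; i < tokens.length; i++) {
                    // Unknown token/tag pairs fall back to the surface form.
                    lemmas[i] = lexicon.getOrDefault(tokens[i] + "|" + tags[i], tokens[i]);
                }
                return lemmas;
            });
    }

    public static void main(String[] args) {
        for (String[] lemmas : toy().lemmatize("carros azuis. carros.")) {
            System.out.println(String.join(" ", lemmas));
        }
    }
}
```

The point of the design is that the lemmatizer stage takes two inputs, tokens and tags, so it cannot run before the tokenizer and the POS tagger have produced them.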