OpenNLP is mainly machine-learning based, but it also provides the DictionaryLemmatizer, which accepts a dictionary of word forms. See https://opennlp.apache.org/docs/2.1.0/manual/opennlp.html#tools.lemmatizer.tagging.api. So you can use MorphoBr (http://github.com/LR-POR/MorphoBr), which I mentioned before, to prepare the input file for the DictionaryLemmatizer.
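To make the preparation step concrete, here is a rough sketch of the conversion, not a tested recipe. It assumes each MorphoBr line looks like `form<TAB>lemma+TAG+FEATURES` (please check against the actual files) and writes the `word<TAB>postag<TAB>lemma` rows that the manual describes for the dictionary file. The class and method names are hypothetical, and the tag emitted is simply the first MorphoBr tag, which would still have to be mapped to whatever tagset your POS tagger produces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP or MorphoBr.
class MorphoBrToLemmaDict {

    // Converts one assumed MorphoBr-style entry, "form<TAB>lemma+TAG+FEATURES",
    // into one "word<TAB>postag<TAB>lemma" row.
    // Returns null when the line does not match the assumed layout.
    static String convertLine(String line) {
        String[] cols = line.trim().split("\t");
        if (cols.length != 2) {
            return null;
        }
        String[] parts = cols[1].split("\\+");
        if (parts.length < 2 || parts[0].isEmpty()) {
            return null;
        }
        String form = cols[0];
        String lemma = parts[0];
        // Assumption: the first tag after the lemma is the part of speech.
        // It must match the tags your POS tagger emits, so a mapping
        // step may be needed here.
        String tag = parts[1];
        return form + "\t" + tag + "\t" + lemma;
    }

    // Converts a list of entries, silently skipping lines that do not parse.
    static List<String> convert(List<String> lines) {
        List<String> rows = new ArrayList<>();
        for (String line : lines) {
            String row = convertLine(line);
            if (row != null) {
                rows.add(row);
            }
        }
        return rows;
    }
}
```

The resulting rows, written to a text file, are what you would hand to the DictionaryLemmatizer constructor as an InputStream, together with tokens and POS tags at lookup time.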
The statistical lemmatizer is also available, but it requires a model to run. You can train one yourself or use one already available from the link provided by Leszek. Rodrigo Agerri made a strong claim that supervised lemmatizers work better. I don't want to go into that discussion, but I believe the choice between an ML-based approach (supervised or not) and a rule-based one should rest on many more criteria than performance on a single dataset.

Best,
Alexandre

> On 13 Jan 2023, at 01:48, T. Kuro Kurosaka <k...@bhlab.com> wrote:
>
> I wrote "model" just because I did not know openNLP supports a rule based
> approach.
> Are there rule files that I can try for Portuguese and other major languages?
>
> Kuro
>