Stemming, Stoplists and Language Models?

Martin Wunderlich Sun, 23 Feb 2014 05:36:07 -0800

Hi all, 

I recently started working with OpenNLP for a project in the area of text 
classification with neural networks. So far, OpenNLP has proven to be a great 
and very useful library. 
There are just three things that I haven't been able to find, but maybe they do 
exist: 
- language models: e.g. to create a bigram language model with relative and 
absolute frequencies from several texts 
- stemming: to reduce different word forms in inflected languages to a 
canonical root form
- stoplist: to remove certain words (e.g. from the language model) that are 
deemed irrelevant
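
To make the first and third items concrete, here is a minimal sketch in plain Java of the kind of functionality meant: bigram counts (absolute frequencies) collected over several texts, relative frequencies derived from them, and a stoplist applied during tokenization. It uses no OpenNLP calls. The class name BigramSketch, the stoplist contents and the simple letter-based tokenizer are assumptions made up for illustration only. Stemming is not shown; it would slot into tokenize() as one more per-token step.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class BigramSketch {

    // Hypothetical stoplist; a real one would normally be loaded from a file.
    static final Set<String> STOPLIST = Set.of("the", "a", "of");

    // Lowercase the text, split on runs of non-letters, drop stoplisted tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty() && !STOPLIST.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Absolute frequency of each adjacent token pair, summed over all texts.
    // Bigrams never span two texts.
    static Map<String, Integer> bigramCounts(List<String> texts) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String text : texts) {
            List<String> tokens = tokenize(text);
            for (int i = 0; i + 1 < tokens.size(); i++) {
                String bigram = tokens.get(i) + " " + tokens.get(i + 1);
                counts.merge(bigram, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Relative frequency: each count divided by the total number of bigrams.
    static Map<String, Double> relativeFrequencies(Map<String, Integer> counts) {
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        Map<String, Double> rel = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            rel.put(e.getKey(), (double) e.getValue() / total);
        }
        return rel;
    }

    public static void main(String[] args) {
        List<String> texts = List.of("The cat sat", "the cat sat down");
        Map<String, Integer> counts = bigramCounts(texts);
        System.out.println(counts);
        System.out.println(relativeFrequencies(counts));
    }
}
```

Note that removing stoplisted words before counting makes words adjacent that were not adjacent in the original text; whether that is wanted depends on the use of the model.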


Do these functions exist in OpenNLP? If not, can you recommend another library 
that provides them and could be used alongside it? 

Kind regards, 

Martin
