Re: de pluralization

Andrew Boyd Fri, 05 Aug 2005 06:45:16 -0700

You might want to look at stemming for "de pluralization". It boils words down 
to their "root" form.

So "bombs" and "bombing" both get stemmed to "bomb".

I'm using the Snowball stemmer, which handles other languages as well as 
English. It is in the sandbox:
org.apache.lucene.analysis.snowball.SnowballFilter
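
To make the idea concrete, here is a rough sketch of what "stemming to a root" 
means. This is NOT the Snowball algorithm and does not use the Lucene API; 
ToyStemmer and its two suffix rules are made up purely for illustration, and a 
real stemmer handles far more cases.

```java
import java.util.Locale;

// Hypothetical toy stemmer for illustration only. It is not Snowball and
// not part of Lucene; it just strips two common English suffixes.
final class ToyStemmer {
    private ToyStemmer() {}

    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        // Strip "-ing" only when a reasonably long stem remains ("bombing" -> "bomb").
        if (w.length() > 5 && w.endsWith("ing")) {
            return w.substring(0, w.length() - 3);
        }
        // Strip a plural "-s", but leave words ending in "ss" alone ("class").
        if (w.length() > 3 && w.endsWith("s") && !w.endsWith("ss")) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"bombs", "bombing", "bomb", "class"}) {
            System.out.println(word + " -> " + stem(word));
        }
    }
}
```

With these rules, "bombs", "bombing" and "bomb" all map to "bomb", so a query 
for any one of them would match the others if the same stemming is applied at 
both index and search time.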

Hope this helps,

Andrew

-----Original Message-----
From: Dan Armbrust <[EMAIL PROTECTED]>
Sent: Aug 5, 2005 8:25 AM
To: java-user@lucene.apache.org
Subject: Re: de pluralization

Mufaddal Khumri wrote:

>Are there analyzers that do this already?

It's not an analyzer, but the "norm" feature of this tool does a good job 
at getting to the normalized form of the words:

http://umlslex.nlm.nih.gov/lvg/current/

http://umlslex.nlm.nih.gov/lvg/current/docs/userDoc/norm.html

Creating an analyzer from it is fairly straightforward.


-- 
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


