You might want to look at stemming for "de pluralization": it boils words down to their "root".
So "bombs" and "bombing" both get stemmed to "bomb". I'm using the Snowball stemmer, which handles other languages as well as English. It is in the sandbox:

org.apache.lucene.analysis.snowball.SnowballFilter

Hope this helps,
Andrew

-----Original Message-----
From: Dan Armbrust <[EMAIL PROTECTED]>
Sent: Aug 5, 2005 8:25 AM
To: java-user@lucene.apache.org
Subject: Re: de pluralization

Mufaddal Khumri wrote:

>Are there
>analyzers that do this already?

It's not an analyzer, but the "norm" feature of this tool does a good job of getting to the normalized form of the words:

http://umlslex.nlm.nih.gov/lvg/current/
http://umlslex.nlm.nih.gov/lvg/current/docs/userDoc/norm.html

Creating an analyzer from it is fairly straightforward.

--
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
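
For anyone who wants to see roughly what stemming does to "bombs" and "bombing", here is a toy suffix stripper in plain Java. It is only a sketch: it is not the Snowball algorithm and does not use Lucene. The class name ToyStemmer and its three rules are made up for illustration, and a real stemmer covers many more cases (and gets many words right that this one would mangle).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical toy stemmer for illustration only; NOT the Snowball algorithm.
class ToyStemmer {

    // Strip at most one common English suffix from a single word.
    public static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        if (w.endsWith("ing") && w.length() > 5) {
            // "bombing" -> "bomb"
            return w.substring(0, w.length() - 3);
        }
        if (w.endsWith("ies") && w.length() > 4) {
            // "queries" -> "query"
            return w.substring(0, w.length() - 3) + "y";
        }
        if (w.endsWith("s") && !w.endsWith("ss") && w.length() > 3) {
            // "bombs" -> "bomb"; words ending in "ss" such as "glass" are left alone
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    // Split text on whitespace and stem every token.
    public static List<String> stemAll(String text) {
        List<String> stems = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                stems.add(stem(token));
            }
        }
        return stems;
    }

    public static void main(String[] args) {
        System.out.println(stemAll("bombs and bombing"));
    }
}
```

The point is that the plural and the "-ing" form end up as the same term, so a search for one matches the other, provided the same stemming is applied both when indexing and when querying.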