The general consensus among people who run into this problem is to use a plurals-only stemmer, a synonyms file, or a combination of both (the synonyms file covers irregular nouns etc.).
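To make the idea concrete, here is a rough sketch in Python of what a plurals-only stemmer combined with an irregular-noun lookup might do. It is not any particular Solr filter: the suffix rules, the `plural_stem` name, and the `IRREGULAR_PLURALS` table are all assumptions made up for illustration, and in Solr the irregular forms would live in a synonyms file rather than in code.

```python
# Hypothetical table of irregular plurals (stand-in for a synonyms file).
IRREGULAR_PLURALS = {
    "children": "child",
    "men": "man",
    "women": "woman",
    "mice": "mouse",
    "geese": "goose",
}


def plural_stem(word: str) -> str:
    """Strip plural endings only; never reduce a word past its singular form."""
    w = word.lower()

    # Irregular nouns are handled by lookup, not by suffix rules.
    if w in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[w]

    # Leave very short words alone ("is", "gas", ...).
    if len(w) <= 3:
        return w

    # Illustrative suffix rules, tried in order.
    if w.endswith("ies") and not w.endswith(("eies", "aies")):
        return w[:-3] + "y"   # ponies -> pony
    if w.endswith("es") and not w.endswith(("aes", "ees", "oes")):
        return w[:-1]         # presides -> preside
    if w.endswith("s") and not w.endswith(("us", "ss")):
        return w[:-1]         # presidents -> president
    return w


if __name__ == "__main__":
    for term in ("president", "presidents", "preside", "children"):
        print(term, "->", plural_stem(term))
```

With rules like these, "presidents" and "president" index to the same term, while "preside" stays a separate term, so a search for "preside" no longer matches a document that only contains "president".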
If you search the archives you can find info on a plurals stemmer.

On Mon, Jun 28, 2010 at 6:49 AM, <dar...@ontrenet.com> wrote:
> Thanks for the tip. Yeah, I think the stemming confounds search results as
> it stands (porter stemmer).
>
> I was also thinking of using my dictionary of 500,000 words with their
> complete morphologies and conjugations and create a synonyms.txt to
> provide english accurate morphology.
>
> Is this a good idea?
>
> Darren
>
>> Hi Darren,
>>
>> You might want to look at the KStemmer
>> (http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem)
>> instead of the standard PorterStemmer. It essentially has a 'dictionary'
>> of exception words where stemming stops if found, so in your case
>> president won't be stemmed any further than president (but presidents will
>> be stemmed to president). You will have to integrate it into solr
>> yourself, but that's straightforward.
>>
>> HTH
>> Brendan
>>
>>
>> On Jun 28, 2010, at 8:04 AM, Darren Govoni wrote:
>>
>>> Hi,
>>> It seems to me that because the stemming does not produce
>>> grammatically correct stems in many of the cases,
>>> search anomalies can occur like the one I am seeing where I have a
>>> document with "president" in it and it is returned
>>> when I search for "preside", a different word entirely.
>>>
>>> Is this correct or acceptable behavior? Previous discussions here on
>>> stemming, I was told its ok as long as all the words reduce
>>> to the same stem, but when different words reduce to the same stem it
>>> seems to affect search results in a "bad way".
>>>
>>> Darren