Thanks Robert, exactly what I was looking for.

-----Original Message-----
From: Robert Muir [mailto:rcm...@gmail.com] 
Sent: Wednesday, October 19, 2011 1:15 PM
To: solr-user@lucene.apache.org
Subject: Re: stemEnglishPossessive and contractions

The word delimiter filter also does other things: it treats ' as punctuation by 
default. So it normally splits on ', except when the apostrophe is followed by s 
(in that case it removes the 's completely if you enable stemEnglishPossessive).

There are a couple approaches you can use:
1. you can keep worddelimiterfilter with this option on, but disable 
splitting on ' by customizing its type table. in this case specify 
types=mycustomtypes.txt, and in that file specify ' to be treated as ALPHANUM 
or similar. see
https://issues.apache.org/jira/browse/SOLR-2059 for some examples of this. i 
would only do this if you want worddelimiterfilter for other purposes. if you 
just want to remove possessives and don't need worddelimiterfilter's other 
features, see option 2.
2. you can instead use EnglishPossessiveFilterFactory, which only does this 
exact thing (remove 's) and nothing else.
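
As a rough sketch of how the two options might look in schema.xml (the 
fieldType names, the tokenizer choices, and the other filter attributes are 
assumptions for illustration, not taken from the original setup):

```xml
<!-- Option 1 (sketch): keep WordDelimiterFilter with stemEnglishPossessive on,
     and point it at a custom type table so it no longer splits on '.
     "text_wdf_possessive" is a hypothetical field type name. -->
<fieldType name="text_wdf_possessive" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            stemEnglishPossessive="1"
            types="mycustomtypes.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Option 2 (sketch): skip WordDelimiterFilter and only strip 's.
     "text_possessive_only" is a hypothetical field type name; the tokenizer
     is assumed to keep words like "John's" as a single token. -->
<fieldType name="text_possessive_only" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

For option 1, mycustomtypes.txt would then contain a single mapping, written 
here in the style of the SOLR-2059 examples (the escaped form of ' is used 
as an assumption, to sidestep any quoting questions):

```
# treat the apostrophe as part of a word instead of a delimiter
\u0027 => ALPHANUM
```

It is worth checking either variant on the Solr analysis page with inputs 
such as "John's" and "isn't" before reindexing.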

On Wed, Oct 19, 2011 at 5:30 PM, Herman Kiefus <herm...@angieslist.com> wrote:
> We utilize a comprehensive dictionary of English words, place names, 
> surnames, male and female first names, ... you get the point.  As such, the 
> possessive plural forms of these words are recognized as 'misspelled'.
>
> I simply thought that 'turning on' this option for the WordDelimiterFilterFactory 
> would address my concerns; however, I also got an unintended consequence: 
> Contractions (isn't, wouldn't, shouldn't, he'll, we'll...) also seem to be 
> affected.  Is this intended behavior?  When I read 'English possessive' I 
> hear 'apostrophe s' and not 'apostrophe anything'.  Is there something I'm 
> missing here?
>



--
lucidimagination.com
