Re: "runs" vs. "running" - Query time vs Index Time stemming

2008-01-24 Thread Ryan McKinley


protected="protwords.txt"/>


Isn't that what protwords.txt does? Words listed in that file are not stemmed.
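
As a rough sketch of what a protected-words list does: the snippet below is a toy suffix stripper in Python, not Solr's Porter stemmer, and the PROTECTED set stands in for a hypothetical protwords.txt that contains only "running". The function names are made up for illustration.

```python
# Toy illustration only: a crude suffix stripper standing in for the real
# stemmer. PROTECTED is a hypothetical protwords.txt with a single entry.
PROTECTED = frozenset({"running"})


def toy_stem(token, protected=frozenset()):
    """Lowercase the token and strip a simple suffix unless it is protected."""
    token = token.lower()
    if token in protected:
        return token  # protected words pass through unstemmed
    for suffix in ("ning", "ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text, protected=frozenset()):
    """Split on whitespace and stem each token, returning the set of terms."""
    return {toy_stem(tok, protected) for tok in text.split()}


query = analyze("running shoes", PROTECTED)
indexed = analyze("Runs 1/2 a size large", PROTECTED)
print(sorted(query & indexed))  # []
```

With "running" protected, the query term stays "running" while "Runs" is still reduced to "run", so the two no longer match. The trade-off is that "running" would also stop matching "run". Since protected="protwords.txt" appears in both analyzers in the config quoted in this thread, the documents would presumably need to be reindexed after the file changes.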



"runs" vs. "running" - Query time vs Index Time stemming

2008-01-24 Thread Matthew Runo

Hello folks,

I'm seeing something that makes total sense to me, but the pointy-haired
bosses don't like it, so I've gotta come up with a solution. We search a
pretty standard product catalog, and due to stemming, a search for
"running shoes" matches things with "Runs 1/2 a size large" in the product
description. I've tried tweaking the query-time and index-time settings
below, but I still get the stemming. Any ideas on how I can make "running"
not match "runs" in product descriptions, while still keeping the words
"run", "runs", "running", and so on searchable in the product descriptions
(just without stemming them)?
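
To make the behaviour concrete, here is a minimal sketch of why the match happens. It uses a toy suffix stripper in Python as a stand-in for the real stemmer (it is not the Porter algorithm, and the function names are made up for illustration). Both the query and the description end up containing the same stemmed term.

```python
# Toy illustration only: a crude suffix stripper, not Solr's actual stemmer.
def toy_stem(token):
    """Lowercase the token and strip one simple suffix if enough remains."""
    token = token.lower()
    for suffix in ("ning", "ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Split on whitespace and stem each token, returning the set of terms."""
    return {toy_stem(tok) for tok in text.split()}


indexed = analyze("Runs 1/2 a size large")  # what ends up in the index
query = analyze("running shoes")            # what the search is turned into
print(sorted(query & indexed))  # ['run']
```

Because the same stemming is applied at index time and at query time, "Runs" and "running" both collapse to "run", which is the shared term that produces the hit.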


Here's my field config:

positionIncrementGap="100">



synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
ignoreCase="true" words="stopwords.txt"/>
generateWordParts="1" generateNumberParts="1"
catenateWords="1" catenateNumbers="1"  
catenateAll="0" splitOnCaseChange="1"/>
protected="protwords.txt"/>
class="solr.RemoveDuplicatesTokenFilterFactory"/>





ignoreCase="true" words="stopwords.txt"/>
generateWordParts="0" generateNumberParts="1"
catenateWords="0" catenateNumbers="0"  
catenateAll="0" splitOnCaseChange="1"/>
protected="protwords.txt"/>
class="solr.RemoveDuplicatesTokenFilterFactory"/>





I don't think I can use stopwords, because I need to be able to search on
all of these words, just without matching "runs" when someone searches for
"running". In most cases the other stemming is fine, and if possible I'd
rather not turn it off completely, although that is an option. It seems
like a solvable problem, though, so any ideas would be greatly appreciated.


Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833