Looks like PorterStemFilter converts "Lowe's" to low. Not very surprising.
Options include . Drop the stemming . Index stemmed and non-stemmed variants and search both, maybe boosting the non-stemmed variant. If you really want exact matches only, you may also/instead want untokenized fields. Apostrophes etc can be a problem. Look into what analyzers do and use Luke to see what is indexed. -- Ian. On Fri, Jan 8, 2010 at 8:01 PM, Jamie <ja...@stimulussoft.com> wrote: > Hi There > > We are trying to search for the exact word "Lowe's" across a large set of > indexed data. Our results include everything with "low" in it. Thus, we are > receiving a much larger data set that we expected. The data is indexing > using the analyzer: > TokenStream result = new StandardTokenizer(reader); > result = new StandardFilter(result); > result = new LowerCaseFilter(result); > result = new StopFilter(result, stopTable); > result = new PorterStemFilter(result); > return result; > > Thanks > > Jamie > > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org