On Wed, Sep 15, 2010 at 11:29 AM, Nguyen, Vincent (CDC/OSELS/PHITPO) (CTR) <v...@cdc.gov> wrote:
> I was running a query on the word "mining" and got results from
> documents that have nothing to do with mining. I got results with a
> score of 0.2997284 and less. It looks like Solr was querying the
> dsm.fulltext field for "mine" as well, which is ok except there were no
> "mine" words in the document. However, I did find words like
> "exaMINEd".

Was the "MINE" in "exaMINEd" actually uppercase, or did you do that for
emphasis? If it was actually uppercased, one could argue it is a relevant
document since someone was trying to get "MINE" to stand out for some reason.

Anyway, if you don't want that behavior then turn off splitting on case change.

splitOnCaseChange="0" in WordDelimiterFilterFactory
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory

-Yonik
http://lucenerevolution.org Lucene/Solr Conference, Boston Oct 7-8
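
For reference, a minimal sketch of what that change might look like in schema.xml. Only splitOnCaseChange="0" comes from the reply above. The field type name "text_nocasesplit", the tokenizer, and the rest of the filter chain are assumptions for illustration and are not taken from the poster's actual schema. A stemmer is assumed to be present because the query "mining" was matching "mine".

```xml
<!-- Hypothetical field type: the name and the filter chain are illustrative assumptions -->
<fieldType name="text_nocasesplit" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- splitOnCaseChange="0": a token such as "exaMINEd" is no longer split
         at the lowercase-to-uppercase boundary, so it stays a single token -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Stemmer assumed here; the original schema may use a different one -->
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>
```

With case splitting on, "exaMINEd" is likely broken into sub-tokens at the case change, and after lowercasing and stemming one of them can end up as "mine". With the setting above the whole word is kept together. Since this changes index-time analysis, existing documents would need to be reindexed before the results change.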