On Wed, Sep 15, 2010 at 11:29 AM, Nguyen, Vincent (CDC/OSELS/PHITPO)
(CTR) <v...@cdc.gov> wrote:
> I was running a query on the word "mining" and got results from
> documents that have nothing to do with mining.  I got results with a
> score of 0.2997284 and less.  It looks like Solr was querying the
> dsm.fulltext field for "mine" as well, which is ok except there were no
> "mine" words in the document.  However, I did find words like
> "exaMINEd".

Was the "MINE" in "exaMINEd" actually uppercase, or did you do that
for emphasis?

If it was actually uppercased, one could argue it is a relevant
document since someone was trying to get "MINE" to stand out for some
reason.

Anyway, if you don't want that behavior, turn off splitting on case change:
set splitOnCaseChange="0" on the WordDelimiterFilterFactory in the field
type's analyzer, then reindex so the change applies to existing documents.
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory
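
To illustrate why "exaMINEd" could match, here is a rough Python sketch of
case-change splitting. It is a simplified stand-in, not the actual
WordDelimiterFilter code: the function name is made up for this example, and
it assumes a break happens only where a lowercase letter is followed by an
uppercase one.

```python
import re


def split_on_case_change(token, enabled=True):
    """Hypothetical helper: break a token at each lowercase-to-uppercase
    boundary, roughly what splitOnCaseChange="1" is assumed to do here."""
    if not enabled:
        # With splitting turned off, the token passes through whole.
        return [token]
    # Put a space at every lower->UPPER boundary, then split on whitespace.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", token).split()


parts = split_on_case_change("exaMINEd")
print(parts)                       # sub-words produced by the split
print([p.lower() for p in parts])  # what a later lowercase filter would see

# With splitting disabled the word stays in one piece.
print(split_on_case_change("exaMINEd", enabled=False))
```

If the field's analyzer also lowercases and stems (an assumption about your
schema), a sub-word like "MINEd" could end up as the same term as "mining",
which would explain the hit. An all-lowercase "examined" has no case change
and would not be split this way.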

-Yonik
http://lucenerevolution.org  Lucene/Solr Conference, Boston Oct 7-8
