Fuzzy search result ranking
---------------------------

                 Key: LUCENE-2256
                 URL: https://issues.apache.org/jira/browse/LUCENE-2256
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Search
    Affects Versions: 3.0
         Environment: all
            Reporter: Mike Cawson


When a search term is expanded into a set of alternatives (using Fuzzy, Range, 
Prefix or Wildcard queries), the user really wants documents that have any one 
of the alternatives (ideally the exact one typed). She is not asking for the 
document that contains the maximum number of different alternatives, but that 
is how the scoring works.

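The problem is that SHOULD clauses don't implement a pure OR between 
alternatives but an AND/OR: matching any one alternative is enough to qualify, 
yet every additional alternative a document matches adds to its score.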

frederick~ alderwood~ expands to something like:
(frederick frederich^0.9 fredereck^0.9) (alderwood elderwood^0.9 underwood^0.8)

A document containing frederick, frederich and fredereck would score more 
highly than one with the exact search terms, frederick and alderwood, yet it 
only satisfies one of the user's two query terms.

The problem is not the same as issue 329 but is caused by the scores for all of 
the expanded terms being summed. What is required is the maximum score for any 
of the alternatives for each term, summed across all terms.
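
To make the difference concrete, here is a minimal, self-contained sketch in 
plain Java that compares the two scoring rules on the example above. It is not 
Lucene code. The class and method names are hypothetical, and a document's 
score is simplified to the boost of each matched alternative, ignoring tf, idf 
and norms. The unboosted exact terms are assumed to carry a boost of 1.0.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: none of these names exist in Lucene.
class ExpansionScoring {

    // frederick~ alderwood~ expanded as in the example; exact terms assumed to have boost 1.0.
    static final List<Map<String, Double>> QUERY = Arrays.asList(
        expansion(new String[] {"frederick", "frederich", "fredereck"}, new double[] {1.0, 0.9, 0.9}),
        expansion(new String[] {"alderwood", "elderwood", "underwood"}, new double[] {1.0, 0.9, 0.8}));

    // A document containing three variants of the first term and nothing for the second.
    static final Set<String> VARIANTS_DOC =
        new HashSet<>(Arrays.asList("frederick", "frederich", "fredereck"));

    // A document containing exactly the two terms the user typed.
    static final Set<String> EXACT_DOC =
        new HashSet<>(Arrays.asList("frederick", "alderwood"));

    // Builds one expanded term: a map from each alternative to its boost.
    static Map<String, Double> expansion(String[] terms, double[] boosts) {
        Map<String, Double> alternatives = new LinkedHashMap<>();
        for (int i = 0; i < terms.length; i++) {
            alternatives.put(terms[i], boosts[i]);
        }
        return alternatives;
    }

    // Behaviour described in the report: every matched alternative adds its boost.
    static double sumOfAll(List<Map<String, Double>> query, Set<String> docTerms) {
        double score = 0.0;
        for (Map<String, Double> alternatives : query) {
            for (Map.Entry<String, Double> alt : alternatives.entrySet()) {
                if (docTerms.contains(alt.getKey())) {
                    score += alt.getValue();
                }
            }
        }
        return score;
    }

    // Requested behaviour: best matched alternative per original term, summed across terms.
    static double maxPerTerm(List<Map<String, Double>> query, Set<String> docTerms) {
        double score = 0.0;
        for (Map<String, Double> alternatives : query) {
            double best = 0.0;
            for (Map.Entry<String, Double> alt : alternatives.entrySet()) {
                if (docTerms.contains(alt.getKey())) {
                    best = Math.max(best, alt.getValue());
                }
            }
            score += best;
        }
        return score;
    }

    public static void main(String[] args) {
        System.out.printf("sum of all:   variants=%.1f exact=%.1f%n",
            sumOfAll(QUERY, VARIANTS_DOC), sumOfAll(QUERY, EXACT_DOC));
        System.out.printf("max per term: variants=%.1f exact=%.1f%n",
            maxPerTerm(QUERY, VARIANTS_DOC), maxPerTerm(QUERY, EXACT_DOC));
    }
}
```

Under these assumed boosts, summing every alternative ranks the three-variant 
document (2.8) above the exact match (2.0), while taking the maximum per term 
ranks the exact match (2.0) above it (1.0).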
