Fuzzy search result ranking
---------------------------
Key: LUCENE-2256
URL: https://issues.apache.org/jira/browse/LUCENE-2256
Project: Lucene - Java
Issue Type: Improvement
Components: Search
Affects Versions: 3.0
Environment: all
Reporter: Mike Cawson
When a search term is expanded into a set of alternatives (using Fuzzy, Range,
Prefix or Wildcard queries), the user really wants documents that contain any
one of the alternatives (ideally the exact one typed). She is not asking for
the document that contains the largest number of different alternatives, but
that is what the current scoring rewards.
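The problem is that SHOULD clauses don't implement a pure OR between
alternatives but an AND/OR: one matching alternative is enough for a hit, yet
every additional alternative that also matches adds to the score.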
For example, the query

    frederick~ alderwood~

expands to something like:

    (frederick frederich^0.9 fredereck^0.9) (alderwood elderwood^0.9 underwood^0.8)
A document containing frederick, frederich and fredereck would score more
highly than one with the exact search terms, frederick and alderwood, yet it
only satisfies one of the user's two query terms.
This is not the same problem as issue 329; it is caused by the scores for all
of the expanded terms being summed. What is required is the maximum score over
the alternatives of each term, with those maxima summed across all terms.
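
The difference between the two scoring rules can be sketched with a toy model.
The class below is not Lucene code: FuzzyScoreSketch, expansion, sumScore and
maxPerTermScore are hypothetical names, and each matching alternative is
assumed to contribute only its boost (tf, idf, norms and coord are ignored).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: a matching alternative scores exactly its boost.
class FuzzyScoreSketch {

    // Builds one expanded query term from (alternative, boost) pairs.
    static Map<String, Double> expansion(Object... pairs) {
        Map<String, Double> alternatives = new LinkedHashMap<>();
        for (int i = 0; i < pairs.length; i += 2) {
            alternatives.put((String) pairs[i], (Double) pairs[i + 1]);
        }
        return alternatives;
    }

    // Behaviour described in the report: every matching alternative adds to the score.
    static double sumScore(List<Map<String, Double>> query, Set<String> doc) {
        double score = 0.0;
        for (Map<String, Double> term : query) {
            for (Map.Entry<String, Double> alt : term.entrySet()) {
                if (doc.contains(alt.getKey())) {
                    score += alt.getValue();
                }
            }
        }
        return score;
    }

    // Requested behaviour: best alternative per query term, summed across terms.
    static double maxPerTermScore(List<Map<String, Double>> query, Set<String> doc) {
        double score = 0.0;
        for (Map<String, Double> term : query) {
            double best = 0.0;
            for (Map.Entry<String, Double> alt : term.entrySet()) {
                if (doc.contains(alt.getKey())) {
                    best = Math.max(best, alt.getValue());
                }
            }
            score += best;
        }
        return score;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> query = Arrays.asList(
            expansion("frederick", 1.0, "frederich", 0.9, "fredereck", 0.9),
            expansion("alderwood", 1.0, "elderwood", 0.9, "underwood", 0.8));

        // Document with three spellings of the first term only.
        Set<String> variantsOnly = new HashSet<>(
            Arrays.asList("frederick", "frederich", "fredereck"));
        // Document with both exact search terms.
        Set<String> exactTerms = new HashSet<>(
            Arrays.asList("frederick", "alderwood"));

        System.out.println("sum, variants only: " + sumScore(query, variantsOnly));
        System.out.println("sum, exact terms:   " + sumScore(query, exactTerms));
        System.out.println("max, variants only: " + maxPerTermScore(query, variantsOnly));
        System.out.println("max, exact terms:   " + maxPerTermScore(query, exactTerms));
    }
}
```

Under these toy scores the document with three spellings of frederick ranks
first when the alternatives are summed, while the document with both exact
terms ranks first when only the best alternative per term is counted.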
--
This message is automatically generated by JIRA.