[ 
https://issues.apache.org/jira/browse/LUCENE-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642615#action_12642615
 ] 

Mark Harwood commented on LUCENE-1424:
--------------------------------------

>> Are the score differences caused by the rewrite-to-BooleanQuery 
>> implementations ever "useful"?

So we need to consider what we are losing: TF, IDF, the coordination factor, 
length norms and doc boosts.

I can only think of one use case, and it relates to the coordination factor.

Suppose each product has a "category" field, e.g. Lucene docs for these 
books:

Title:     Lucene in Action
Category:  /Books/Computing/Languages/Java
           /Books/Computing/InformationRetrieval

Title:     The Long Tail
Category:  /Books/Business/Internet
           /Books/Computing

You might then run a wildcard search for /Books/Computing/* and "Lucene in 
Action" would rank higher than "The Long Tail", because the BooleanQuery would 
give it a higher coordination factor, reflecting that LIA matched more terms 
under the "/Books/Computing.." category. IDF could still skew the results, but 
the coordination factor is potentially useful here.
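
A minimal sketch of that coordination effect, in plain Java with no Lucene 
dependency. It assumes each category is indexed as a single untokenized term 
and that the pattern behaves as a prefix on "/Books/Computing" so that both 
books match. The coord() formula is the one DefaultSimilarity uses 
(overlap / maxOverlap); expand() and coordFor() are hypothetical helpers that 
stand in for the rewrite-to-BooleanQuery step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

class CoordSketch {
    // Same formula as DefaultSimilarity.coord: matched clauses / total clauses.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Hypothetical stand-in for the rewrite step: one clause per index term
    // that starts with the prefix.
    static List<String> expand(String prefix, Collection<String> indexTerms) {
        List<String> clauses = new ArrayList<>();
        for (String term : indexTerms) {
            if (term.startsWith(prefix)) {
                clauses.add(term);
            }
        }
        return clauses;
    }

    // Coordination factor a doc would get against the expanded clauses.
    static float coordFor(Collection<String> docTerms, List<String> clauses) {
        int overlap = 0;
        for (String clause : clauses) {
            if (docTerms.contains(clause)) {
                overlap++;
            }
        }
        return coord(overlap, clauses.size());
    }

    public static void main(String[] args) {
        List<String> lia = Arrays.asList(
                "/Books/Computing/Languages/Java",
                "/Books/Computing/InformationRetrieval");
        List<String> longTail = Arrays.asList(
                "/Books/Business/Internet",
                "/Books/Computing");

        // The "index" here is just the union of both docs' category terms.
        TreeSet<String> indexTerms = new TreeSet<>();
        indexTerms.addAll(lia);
        indexTerms.addAll(longTail);

        List<String> clauses = expand("/Books/Computing", indexTerms);

        // LIA matches 2 of the 3 expanded clauses, The Long Tail only 1 of 3.
        System.out.println("LIA coord:       " + coordFor(lia, clauses));
        System.out.println("Long Tail coord: " + coordFor(longTail, clauses));
    }
}
```

A constant score rewrite would give both books the same score, which is 
exactly the ranking signal this use case would lose.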

I think in general IDF tends to be useless for "auto-expanded" terms, e.g. 
wildcard, fuzzy etc. Incidentally, we still see that IDF issue in fuzzy 
queries, which rank rare misspellings higher, but that's another issue (one I 
resolved in contrib's FuzzyLikeThisQuery).
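
To illustrate why IDF favours rare misspellings, here is a small sketch using 
the same idf formula as DefaultSimilarity, ln(numDocs / (docFreq + 1)) + 1. 
The terms and document counts are made up for illustration only.

```java
class IdfSketch {
    // Same formula as DefaultSimilarity.idf.
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        int numDocs = 1000;   // hypothetical index size
        int dfCorrect = 100;  // hypothetical: docs containing "lucene"
        int dfTypo = 1;       // hypothetical: docs containing "lucenne"

        // The rarer term gets the larger idf, so a fuzzy expansion that
        // includes the misspelling weights it above the correct spelling.
        System.out.println("idf(lucene)  = " + idf(dfCorrect, numDocs));
        System.out.println("idf(lucenne) = " + idf(dfTypo, numDocs));
    }
}
```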

I suppose one other consideration is people who have set doc boosts, e.g. to 
boost by date.

I don't think any of these cases necessarily outweighs the benefit of 
switching wildcard/prefix queries to constant score queries.


Cheers,
Mark

> Add ConstantScorePrefixQuery and ConstantScoreWildcardQuery
> -----------------------------------------------------------
>
>                 Key: LUCENE-1424
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1424
>             Project: Lucene - Java
>          Issue Type: New Feature
>            Reporter: Mark Miller
>            Assignee: Michael McCandless
>            Priority: Minor
>         Attachments: LUCENE-1424.patch
>
>
> If we want to be able to highlight these queries, they need to be added to 
> Lucene core or contrib (solr's WildCardFilter can be used to create the 
> ConstantScoreWildcardQuery). They are very useful anyway.
