[ 
https://issues.apache.org/jira/browse/LUCENE-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-4386:
----------------------------------

    Fix Version/s:     (was: 4.3)
                   4.4
    
> Query parser should generate FieldValueFilter for pure wildcard terms to 
> boost query performance
> ------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-4386
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4386
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/queryparser
>    Affects Versions: 4.0-BETA
>            Reporter: Jack Krupansky
>             Fix For: 4.4
>
>
> In theory, a simple pure wildcard query (a single asterisk) is an inefficient 
> way to select all documents that have any value in a field. Rather than users 
> having to work around this issue by adding a separate boolean "has" field, it 
> would be better to have the query parser directly generate the most efficient 
> Lucene query for detecting all documents that have any value for a specified 
> field. According to the discussion over on LUCENE-4376, the FieldValueFilter 
> is the proper solution.
> Proposed solution:
> QueryParserBase.getPrefixQuery could detect when the query is a pure wildcard 
> (a single asterisk) and then generate a FieldValueFilter instead of a 
> PrefixQuery. My understanding from LUCENE-4376 is that the following would 
> work:
> {code}
> new ConstantScoreQuery(new FieldValueFilter(fieldname, false))
> {code}
> Oh, and the check for whether "leading wildcard" is enabled would need to be 
> bypassed for this case.
> I still think it would be better to have PrefixQuery perform this 
> optimization internally so that all apps would benefit, but this should be 
> sufficient to address the main concern.
> This change would benefit the classic Lucene query parser and other query 
> parsers based on it, including edismax. There might be other query parsers 
> that won't see the effect of this change, but they can be updated 
> separately.
> How much performance benefit? Unknown, but supposedly significant. The goal 
> is simply to make a pure wildcard the obvious tool for selecting documents 
> that have a value in a field.
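
The dispatch described in the proposed solution can be sketched as follows. This is only an illustration of the decision order, not Lucene code: the class PureWildcardSketch, the Plan enum, and the plan method are hypothetical stand-ins, and the sketch assumes it sees the raw term text as typed (including the asterisk), which may differ from what the real parser method receives. The one point it tries to show is that the pure-wildcard check runs before the leading-wildcard check, so that check is bypassed as the description requests.

```java
// Hypothetical stand-in types; nothing here is part of the Lucene API.
public class PureWildcardSketch {

    /** Outcomes the parser could choose for a term containing a wildcard. */
    public enum Plan {
        // Would correspond to: new ConstantScoreQuery(new FieldValueFilter(field, false))
        FIELD_VALUE_FILTER,
        // Leading wildcard used while the "leading wildcard" option is disabled.
        REJECT_LEADING_WILDCARD,
        // Whatever the parser builds today, for example a PrefixQuery.
        DEFAULT_QUERY
    }

    /**
     * Decide what to generate for the given term text.
     * Assumption: rawTerm is the text as typed, e.g. "*" or "lucen*".
     */
    public static Plan plan(String rawTerm, boolean allowLeadingWildcard) {
        // Pure wildcard comes first, so the leading-wildcard setting
        // never gets a chance to reject it.
        if ("*".equals(rawTerm)) {
            return Plan.FIELD_VALUE_FILTER;
        }
        // Any other term that starts with '*' is still subject to the setting.
        if (rawTerm.startsWith("*") && !allowLeadingWildcard) {
            return Plan.REJECT_LEADING_WILDCARD;
        }
        return Plan.DEFAULT_QUERY;
    }

    public static void main(String[] args) {
        // Leading wildcards disabled in both calls.
        System.out.println("*      -> " + plan("*", false));
        System.out.println("lucen* -> " + plan("lucen*", false));
    }
}
```

In a real patch the FIELD_VALUE_FILTER branch would presumably return the ConstantScoreQuery expression quoted in the description; the enum only keeps this sketch free of library dependencies.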

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira