[ 
https://issues.apache.org/jira/browse/LUCENE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453570#comment-13453570
 ] 

Jack Krupansky commented on LUCENE-4376:
----------------------------------------

Okay, so would it be straightforward and super-efficient for PrefixQuery to do 
exactly that if the prefix term is zero-length?

I think that would transparently provide the desired benefit for parsed queries.
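
A rough sketch of the idea, using a toy in-memory index rather than Lucene's actual classes. `ToyFieldIndex`, `docsWithField`, and both method names are hypothetical, and the sketch assumes the set of documents that have the field can be obtained without walking the terms. It only illustrates why a zero-length prefix could be special-cased, not how PrefixQuery is or should be implemented.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model only: none of these types or methods are Lucene classes. */
final class ToyFieldIndex {
    // term -> ids of documents containing that term (a stand-in for postings)
    private final TreeMap<String, TreeSet<Integer>> postings = new TreeMap<>();
    // ids of documents with at least one value for the field (assumed to be
    // maintained at index time, which is the assumption this sketch rests on)
    private final TreeSet<Integer> docsWithField = new TreeSet<>();

    void add(int docId, String... terms) {
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            docsWithField.add(docId);
        }
    }

    /** Generic prefix match: visits every matching term and unions its postings. */
    Set<Integer> prefixByEnumeration(String prefix) {
        Set<Integer> hits = new TreeSet<>();
        for (Map.Entry<String, TreeSet<Integer>> e : postings.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // terms are sorted, so no later term can match
            }
            hits.addAll(e.getValue());
        }
        return hits;
    }

    /** Hypothetical special case: a zero-length prefix skips term enumeration. */
    Set<Integer> prefix(String prefix) {
        if (prefix.isEmpty()) {
            return Collections.unmodifiableSet(docsWithField);
        }
        return prefixByEnumeration(prefix);
    }
}
```

Both paths return the same documents for an empty prefix; the difference is that the special case does not touch the term dictionary at all, which is where the cost would be for text fields with many terms.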

                
> Add Query subclasses for selecting documents where a field is empty or not
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-4376
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4376
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/query/scoring
>            Reporter: Jack Krupansky
>             Fix For: 5.0
>
>
> Users frequently wish to select documents based on whether a specified 
> sparsely-populated field has a value or not. Lucene should provide specific 
> Query subclasses that optimize for these two cases, rather than force users 
> to guess what workaround might be most efficient. It is simplest for users to 
> use a simple pure wildcard term to check for non-empty fields or a negated 
> pure wildcard term to check for empty fields, but it has been suggested that 
> this can be rather inefficient, especially for text fields with many terms.
> 1. Add NonEmptyFieldQuery - selects all documents that have a value for the 
> specified field.
> 2. Add EmptyFieldQuery - selects all documents that do not have a value for 
> the specified field.
> The query parsers could turn a pure wildcard query (asterisk only) into a 
> NonEmptyFieldQuery, and a negated pure wildcard query into an EmptyFieldQuery.
> Alternatively, maybe PrefixQuery could detect pure wildcard and automatically 
> "rewrite" it into NonEmptyFieldQuery.
> My assumption is that if the actual values of the field are not needed, 
> Lucene can much more efficiently simply detect whether values are present, 
> rather than, for example, the user having to create a separate boolean "has 
> value" field that they would query for true or false.
