Jack Krupansky created LUCENE-4376:
--------------------------------------

             Summary: Add Query subclasses for selecting documents where a field is empty or not
                 Key: LUCENE-4376
                 URL: https://issues.apache.org/jira/browse/LUCENE-4376
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/query/scoring
            Reporter: Jack Krupansky
             Fix For: 5.0


Users frequently want to select documents based on whether a specified 
sparsely populated field has a value. Lucene should provide dedicated 
Query subclasses optimized for these two cases, rather than forcing users to 
guess which workaround is most efficient. The simplest approach today is a 
pure wildcard term to find non-empty fields, or a negated pure wildcard term 
to find empty fields, but it has been suggested that this can be rather 
inefficient, especially for text fields with many terms.
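
To illustrate the suspected cost, here is a toy model in plain Java. It is 
not Lucene code, and the class and method names are made up for this sketch. 
It assumes a pure wildcard is answered by unioning the postings of every term 
in the field, so the work grows with the number of distinct terms:

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

// Toy model, not Lucene code: one field's term dictionary maps each term to
// the set of document ids that contain it. All names here are hypothetical.
final class WildcardCostSketch {
    private final Map<String, BitSet> postings = new TreeMap<>();
    int termsVisited = 0; // terms touched by the last matchAllTerms() call

    void add(int docId, String term) {
        postings.computeIfAbsent(term, t -> new BitSet()).set(docId);
    }

    // Emulates a pure wildcard ("*") match: union the postings of every term.
    BitSet matchAllTerms() {
        termsVisited = 0;
        BitSet result = new BitSet();
        for (BitSet docs : postings.values()) {
            result.or(docs);
            termsVisited++;
        }
        return result;
    }

    public static void main(String[] args) {
        WildcardCostSketch field = new WildcardCostSketch();
        field.add(0, "apache");
        field.add(0, "lucene");
        field.add(2, "lucene");
        BitSet hits = field.matchAllTerms();
        System.out.println("docs with a value: " + hits
                + ", terms visited: " + field.termsVisited);
    }
}
```

In this model a text field with a large vocabulary pays for every term even 
though the caller only wants to know which documents have any value at all.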

1. Add NonEmptyFieldQuery - selects all documents that have a value for the 
specified field.
2. Add EmptyFieldQuery - selects all documents that do not have a value for the 
specified field.
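
As a rough sketch of what the two proposed queries would compute, the toy 
model below (plain Java, not Lucene code, all names hypothetical) assumes the 
index records per field which documents have a value. The non-empty case is 
then a single lookup and the empty case is its complement over all documents:

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Toy model, not Lucene code: tracks, per field, which documents have a value.
// Class and method names are hypothetical.
final class FieldPresenceSketch {
    private final Map<String, BitSet> docsWithField = new HashMap<>();
    private int maxDoc = 0; // one past the highest document id seen

    // Registers a document and the names of the fields that have a value in it.
    void addDocument(int docId, String... fieldsWithValue) {
        maxDoc = Math.max(maxDoc, docId + 1);
        for (String field : fieldsWithValue) {
            docsWithField.computeIfAbsent(field, f -> new BitSet()).set(docId);
        }
    }

    // What a NonEmptyFieldQuery would select: documents that have the field.
    BitSet nonEmpty(String field) {
        BitSet bits = docsWithField.get(field);
        return bits == null ? new BitSet() : (BitSet) bits.clone();
    }

    // What an EmptyFieldQuery would select: all other documents.
    BitSet empty(String field) {
        BitSet result = nonEmpty(field);
        result.flip(0, maxDoc);
        return result;
    }

    public static void main(String[] args) {
        FieldPresenceSketch index = new FieldPresenceSketch();
        index.addDocument(0, "title", "body");
        index.addDocument(1, "body");
        System.out.println("has title: " + index.nonEmpty("title"));
        System.out.println("no title:  " + index.empty("title"));
    }
}
```

Neither lookup touches the field's terms, which is the efficiency this issue 
is hoping for. Whether Lucene can maintain such a structure cheaply is left 
open here.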

The query parsers could turn a pure wildcard query (asterisk only) into a 
NonEmptyFieldQuery, and a negated pure wildcard query into an EmptyFieldQuery.
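
The parser-side mapping could look roughly like the following. This is a 
standalone sketch in plain Java, not code from any Lucene query parser: the 
classify helper and the Kind enum are hypothetical, and it assumes clauses of 
the form field:* and -field:* with no whitespace or escaping inside them:

```java
// Toy model, not a Lucene query parser: classifies a single clause string.
// All names are hypothetical.
final class PureWildcardRewriteSketch {
    enum Kind { NON_EMPTY_FIELD, EMPTY_FIELD, OTHER }

    static Kind classify(String clause) {
        String c = clause.trim();
        boolean negated = c.startsWith("-");
        if (negated) {
            c = c.substring(1);
        }
        int colon = c.indexOf(':');
        if (colon <= 0) {
            return Kind.OTHER; // no field name before the colon
        }
        String text = c.substring(colon + 1);
        if (!text.equals("*")) {
            return Kind.OTHER; // not a pure (asterisk-only) wildcard
        }
        return negated ? Kind.EMPTY_FIELD : Kind.NON_EMPTY_FIELD;
    }

    public static void main(String[] args) {
        for (String clause : new String[] {"title:*", "-title:*", "title:lu*"}) {
            System.out.println(clause + " -> " + classify(clause));
        }
    }
}
```

Anything other than an asterisk-only term falls through as OTHER and would 
keep its existing handling.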

Alternatively, maybe PrefixQuery could detect a pure wildcard and automatically 
"rewrite" itself into a NonEmptyFieldQuery.

My assumption is that if the actual values of the field are not needed, Lucene 
can detect whether values are present far more efficiently on its own, so that 
users do not have to, for example, create a separate boolean "has value" field 
that they would query for true or false.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
