[ 
https://issues.apache.org/jira/browse/LUCENE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453558#comment-13453558
 ] 

Jack Krupansky commented on LUCENE-4376:
----------------------------------------

I don't see how the filter from LUCENE-3593 would be used within a Query.

As an example, derived from a recent email inquiry, the user may enter a query 
such as:

{code}
foo bar AND imageUrl:*
{code}

That is, find documents containing both keywords where the imageUrl field is 
not empty.

Sure, anyone coding directly against the Lucene API can apply the filter 
manually, but I want that filtering applied within the Query itself.

Further, perhaps only a much smaller subset of documents meets the non-empty 
requirement than matches the keywords. I would want the Query to be as fast 
as possible in that case.
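
To make the performance concern concrete, here is a minimal sketch using a toy in-memory index. It is not Lucene's API: the class FieldPresenceSketch, its methods, and the "text" default field name are all hypothetical, chosen only for illustration. It contrasts a pure-wildcard check, which has to visit every term of the field, with a per-field presence set maintained at index time, and then intersects that set with the keyword matches as the example query intends.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory index (NOT Lucene's API) contrasting two non-empty-field strategies. */
final class FieldPresenceSketch {
    // field -> term -> documents containing that term in that field
    private final Map<String, TreeMap<String, BitSet>> postings = new TreeMap<>();
    // field -> documents that have at least one value for that field (hypothetical side structure)
    private final Map<String, BitSet> hasValue = new TreeMap<>();

    /** Index one term of one field for a document, updating both structures. */
    void add(int docId, String field, String term) {
        postings.computeIfAbsent(field, f -> new TreeMap<>())
                .computeIfAbsent(term, t -> new BitSet())
                .set(docId);
        hasValue.computeIfAbsent(field, f -> new BitSet()).set(docId);
    }

    /** Documents containing the given term in the given field (returns a copy). */
    BitSet term(String field, String term) {
        TreeMap<String, BitSet> terms = postings.get(field);
        BitSet docs = (terms == null) ? null : terms.get(term);
        return (docs == null) ? new BitSet() : (BitSet) docs.clone();
    }

    /** Pure-wildcard strategy: union the postings of every term in the field. */
    BitSet nonEmptyByWildcard(String field) {
        BitSet result = new BitSet();
        TreeMap<String, BitSet> terms = postings.get(field);
        if (terms != null) {
            for (BitSet docs : terms.values()) {
                result.or(docs); // work grows with the number of distinct terms
            }
        }
        return result;
    }

    /** Dedicated strategy: a single lookup, independent of the number of terms. */
    BitSet nonEmptyByPresence(String field) {
        BitSet docs = hasValue.get(field);
        return (docs == null) ? new BitSet() : (BitSet) docs.clone();
    }

    /** Both keywords must match, and requiredField must be non-empty. */
    BitSet query(String textField, String kw1, String kw2, String requiredField) {
        BitSet result = nonEmptyByPresence(requiredField);
        result.and(term(textField, kw1));
        result.and(term(textField, kw2));
        return result;
    }

    /** Four sample documents; only 0 and 3 have both keywords and an imageUrl. */
    static FieldPresenceSketch demoIndex() {
        FieldPresenceSketch idx = new FieldPresenceSketch();
        idx.add(0, "text", "foo"); idx.add(0, "text", "bar");
        idx.add(0, "imageUrl", "http://example.com/a.png");
        idx.add(1, "text", "foo"); idx.add(1, "text", "bar"); // no imageUrl
        idx.add(2, "text", "foo");                            // missing "bar"
        idx.add(2, "imageUrl", "http://example.com/b.png");
        idx.add(3, "text", "foo"); idx.add(3, "text", "bar");
        idx.add(3, "imageUrl", "http://example.com/c.png");
        return idx;
    }

    public static void main(String[] args) {
        FieldPresenceSketch idx = demoIndex();
        System.out.println(idx.query("text", "foo", "bar", "imageUrl")); // prints {0, 3}
    }
}
```

Both strategies return the same document set; the difference is only in how much work it takes to build it. Whether a real implementation would keep a separate presence structure or derive it some other way is an open design question, not something this sketch settles.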

                
> Add Query subclasses for selecting documents where a field is empty or not
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-4376
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4376
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/query/scoring
>            Reporter: Jack Krupansky
>             Fix For: 5.0
>
>
> Users frequently wish to select documents based on whether a specified 
> sparsely-populated field has a value or not. Lucene should provide specific 
> Query subclasses that optimize for these two cases, rather than force users 
> to guess what workaround might be most efficient. The simplest approach for 
> users is a pure wildcard term to check for non-empty fields, or a negated 
> pure wildcard term to check for empty fields, but it has been suggested that 
> this can be rather inefficient, especially for text fields with many terms.
> 1. Add NonEmptyFieldQuery - selects all documents that have a value for the 
> specified field.
> 2. Add EmptyFieldQuery - selects all documents that do not have a value for 
> the specified field.
> The query parsers could turn a pure wildcard query (asterisk only) into a 
> NonEmptyFieldQuery, and a negated pure wildcard query into an EmptyFieldQuery.
> Alternatively, maybe PrefixQuery could detect pure wildcard and automatically 
> "rewrite" it into NonEmptyFieldQuery.
> My assumption is that if the actual values of the field are not needed, 
> Lucene can much more efficiently simply detect whether values are present, 
> rather than, for example, the user having to create a separate boolean "has 
> value" field that they would query for true or false.

