[ https://issues.apache.org/jira/browse/LUCENE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453811#comment-13453811 ]
Uwe Schindler commented on LUCENE-4376:
---------------------------------------
The filter is already there; QueryParser just does not support it. To make
this work for your use case, you can override Lucene's/Solr's QueryParser to
return a ConstantScoreQuery wrapping the LUCENE-3593 filter as the replacement
for a "field:*"-only query. The positive and negative variants are both covered
by the filter's boolean argument.
To conclude: the query is already there, so there is no need for the two new
classes. The wanted functionality is:
{code:java}
new ConstantScoreQuery(new FieldValueFilter(field, negate)) // field: String, negate: boolean
{code}
To find all documents with any term in the field, use negate=false; to find
all documents without one, use negate=true. There is absolutely no need for a
new Query class.
bq. Okay, so would it be straightforward and super-efficient for PrefixQuery to
do exactly that if the prefix term is zero-length?
That's super-slow, as it will search for all terms in the field. This is what
Solr, for example, currently does for "field:*" queries. Solr should use the
filter, too; that would make those queries much more efficient.
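The cost difference can be illustrated with a toy model in plain Java. This is a sketch only, not Lucene code: the class and method names are hypothetical, and it assumes the index can supply a precomputed per-field "documents with a value" bit set, which is roughly the kind of structure a field-value filter can consult. The slow path mimics a zero-length prefix by visiting every term's posting list; the fast path only copies (and optionally flips) the bit set.

```java
import java.util.BitSet;
import java.util.Map;

// Toy model only: NOT Lucene code. All names here are hypothetical.
final class FieldPresenceSketch {

    // Slow path (what a zero-length prefix amounts to): union the posting
    // lists of every term in the field. Work grows with the number of
    // terms and postings in the field.
    static BitSet viaTermEnumeration(Map<String, int[]> postingsByTerm, int maxDoc) {
        BitSet result = new BitSet(maxDoc);
        for (int[] postings : postingsByTerm.values()) {
            for (int doc : postings) {
                result.set(doc);
            }
        }
        return result;
    }

    // Fast path (the filter idea): start from an assumed precomputed
    // "docs with a value" bit set and flip it when negate is true.
    static BitSet viaDocsWithField(BitSet docsWithField, int maxDoc, boolean negate) {
        BitSet result = (BitSet) docsWithField.clone();
        if (negate) {
            result.flip(0, maxDoc); // documents WITHOUT a value
        }
        return result;
    }
}
```

With negate=false both paths select the same documents, but the fast path never touches the terms; with negate=true the same bit set also answers the "field is empty" case, which is the role the boolean plays in the filter.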
> Add Query subclasses for selecting documents where a field is empty or not
> --------------------------------------------------------------------------
>
> Key: LUCENE-4376
> URL: https://issues.apache.org/jira/browse/LUCENE-4376
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/query/scoring
> Reporter: Jack Krupansky
> Fix For: 5.0
>
>
> Users frequently wish to select documents based on whether a specified
> sparsely-populated field has a value or not. Lucene should provide specific
> Query subclasses that optimize for these two cases, rather than force users
> to guess what workaround might be most efficient. It is simplest for users to
> use a simple pure wildcard term to check for non-empty fields or a negated
> pure wildcard term to check for empty fields, but it has been suggested that
> this can be rather inefficient, especially for text fields with many terms.
> 1. Add NonEmptyFieldQuery - selects all documents that have a value for the
> specified field.
> 2. Add EmptyFieldQuery - selects all documents that do not have a value for
> the specified field.
> The query parsers could turn a pure wildcard query (asterisk only) into a
> NonEmptyFieldQuery, and a negated pure wildcard query into an EmptyFieldQuery.
> Alternatively, maybe PrefixQuery could detect pure wildcard and automatically
> "rewrite" it into NonEmptyFieldQuery.
> My assumption is that if the actual values of the field are not needed,
> Lucene can much more efficiently simply detect whether values are present,
> rather than, for example, the user having to create a separate boolean "has
> value" field that they would query for true or false.