[jira] [Commented] (LUCENE-6301) Deprecate Filter

Adrien Grand (JIRA) Mon, 12 Oct 2015 06:35:17 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953124#comment-14953124
 ]


Adrien Grand commented on LUCENE-6301:
--------------------------------------

bq. wasn't Filter supposed to be a big performance win over a Query since it 
eliminates the performance impact of scoring? If that was the case, is Lucene 
providing some alternate method of achieving a similar performance improvement?

Over the past few releases, we progressively improved the Query/Collector API 
so that queries can detect whether scores are needed and optimize when they 
are not, e.g. by not reading frequencies or, for phrase queries, by stopping 
after the first occurrence is found (LUCENE-6218). This is all detected 
automatically now: if you wrap a query in a ConstantScoreQuery, it will notice 
that scores are not needed. Likewise, if you sort by the value of a field and 
don't request scores, it will notice that scores are not needed and optimize 
query execution.
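
To illustrate the idea, here is a minimal sketch in plain Java. It is a toy 
model and not Lucene code: the Postings class, the needsScores flag on 
search() and the freqReads counter are all hypothetical names made up for this 
example. It only shows that a consumer which declares it does not need scores 
lets the matching loop skip frequency reads entirely.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene's API) of skipping frequency reads when scores are unused. */
public class NeedsScoresSketch {

    /** Hypothetical postings for one term: parallel arrays of doc IDs and frequencies. */
    static final class Postings {
        final int[] docs;
        final int[] freqs;
        int freqReads = 0; // how many times a frequency was decoded

        Postings(int[] docs, int[] freqs) {
            this.docs = docs;
            this.freqs = freqs;
        }

        /** Stands in for the comparatively expensive decoding of a term frequency. */
        int freq(int i) {
            freqReads++;
            return freqs[i];
        }
    }

    /** Returns "doc:score" pairs; frequencies are only read if scores are needed. */
    static List<String> search(Postings p, boolean needsScores) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < p.docs.length; i++) {
            if (needsScores) {
                // The raw term frequency stands in for a real similarity score.
                hits.add(p.docs[i] + ":" + p.freq(i));
            } else {
                // Constant score: the frequency is never touched.
                hits.add(p.docs[i] + ":1");
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Postings p = new Postings(new int[] {1, 4, 7}, new int[] {2, 1, 5});
        System.out.println(search(p, false) + " freq reads: " + p.freqReads);
        System.out.println(search(p, true) + " freq reads: " + p.freqReads);
    }
}
```

In Lucene itself the caller does not pass such a flag by hand. As described 
above, it is derived from how the query is used, for instance from a 
ConstantScoreQuery wrapper or from a sort that does not request scores.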

Something else that Filters provided and queries did not was random-access 
support. But it was incomplete: Filters had no way to tell FilteredQuery 
whether they should be consumed using iteration or random access, and the 
wrong decision could result in very slow queries, for instance by calling 
advance() on a DocValuesRangeQuery, which doesn't use an index and needs to 
perform a linear scan to locate the next match. So we added two-phase 
iteration support to queries (LUCENE-6198), which allows us to dissect queries 
into a fast approximation and a slow verification phase. For instance, a phrase 
query "A B" would return the conjunction (+A +B) as an approximation and, as 
its verification phase, check whether the two terms occur at consecutive 
positions.
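
The phrase example can be sketched in plain Java as follows. This is a toy 
model under stated assumptions, not Lucene's implementation: documents are 
plain token lists, and the names approximate(), verify(), phraseMatches() and 
the verifications counter are invented for illustration. In Lucene the 
approximation would be driven by the inverted index rather than by scanning 
each document. The point of the sketch is the ordering: a cheap filter is 
applied to the approximation first, so the costly position check only runs on 
the documents that survive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

/** Toy model (not Lucene's API) of two-phase iteration for the phrase "A B". */
public class TwoPhaseSketch {

    static int verifications = 0; // how many times the slow phase ran

    /** Fast approximation: the doc contains both terms, like the conjunction (+A +B). */
    static boolean approximate(List<String> doc, String a, String b) {
        return doc.contains(a) && doc.contains(b);
    }

    /** Slow verification: b occurs at the position immediately after a. */
    static boolean verify(List<String> doc, String a, String b) {
        verifications++;
        for (int i = 0; i + 1 < doc.size(); i++) {
            if (doc.get(i).equals(a) && doc.get(i + 1).equals(b)) {
                return true;
            }
        }
        return false;
    }

    /** Intersects the approximation with a cheap filter before verifying positions. */
    static List<Integer> phraseMatches(List<List<String>> docs, String a, String b,
                                       IntPredicate filter) {
        List<Integer> hits = new ArrayList<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            List<String> doc = docs.get(docId);
            if (approximate(doc, a, b) && filter.test(docId) && verify(doc, a, b)) {
                hits.add(docId);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("a", "b", "c"),  // phrase match
            Arrays.asList("b", "a"),       // both terms, wrong order
            Arrays.asList("a", "c"),       // fails the approximation
            Arrays.asList("x", "a", "b")); // phrase match
        System.out.println(phraseMatches(docs, "a", "b", d -> d != 3));
        System.out.println("verifications: " + verifications);
    }
}
```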

bq. that would have exactly the same effect (and performance gain) as the old 
Filter class. Is that statement 100% accurate?

If you use a query that provides an efficient approximation (such as a phrase 
query) as a filter, things could be considerably faster. Otherwise, things 
will mostly work the same way as before, and you could see slight speedups or 
slowdowns given that we use different code paths that HotSpot might optimize 
differently.

I will look into the deprecation comments for Filter.

> Deprecate Filter
> ----------------
>
>                 Key: LUCENE-6301
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6301
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>             Fix For: 5.2, Trunk
>
>         Attachments: LUCENE-6301.patch, LUCENE-6301.patch
>
>
> It will still take time to completely remove Filter, but I think we should 
> start deprecating it now to state our intention and encourage users to move 
> to queries as soon as possible?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

