[ 
https://issues.apache.org/jira/browse/LUCENE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262817#comment-15262817
 ] 

Robert Muir commented on LUCENE-7262:
-------------------------------------

So this means the postings case still calls cardinality()? Why wouldn't it
use the same estimate? I'm a bit concerned about each query tracking its own
estimate (and having the formula, stats pulling, etc. duplicated everywhere).

This is why, if you look at MatchingPoints, it pulls the stats it needs. But
alternatively, DocIDSetBuilder could take sumDocFreq, maxDoc and docCount as
parameters and do this itself. Points would pass size() for sumDocFreq; it's
the equivalent there.

In other words, I see providing a good cost() as the responsibility of
DocIDSetBuilder. The only thing impl-specific is how to get sumDocFreq and
docCount (e.g. Terms.sumDocFreq/docCount vs PointValues.size/docCount).
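
To make the idea concrete, here is a rough sketch of what a builder-owned
estimate could look like. It is not the code from the attached patch and does
not use Lucene classes: CostEstimateSketch, add() and the exact formula
(values added divided by the average number of values per document, capped at
maxDoc) are assumptions for illustration only. The caller would pass
Terms.sumDocFreq/docCount or PointValues.size/docCount, as described above.

```java
// Hypothetical helper, NOT part of Lucene: illustrates a builder that owns
// the cost() estimate instead of each query computing its own.
final class CostEstimateSketch {
    private final double valuesPerDoc; // average number of values per matching doc
    private final int maxDoc;          // upper bound on the number of documents
    private long valuesAdded;          // how many (doc, value) entries were collected

    // sumDocFreq: total number of values (for points, pass size()).
    // docCount:   number of documents that have at least one value.
    CostEstimateSketch(long sumDocFreq, int maxDoc, int docCount) {
        this.maxDoc = maxDoc;
        if (sumDocFreq <= 0 || docCount <= 0) {
            // Assumption: stats unavailable, so treat the field as single-valued.
            this.valuesPerDoc = 1.0;
        } else {
            this.valuesPerDoc = (double) sumDocFreq / docCount;
        }
    }

    // Record that `count` more values were added to the set being built.
    void add(long count) {
        valuesAdded += count;
    }

    // Estimated number of matching docs, without scanning a bit set.
    long cost() {
        long estimate = Math.round(valuesAdded / valuesPerDoc);
        return Math.min(maxDoc, estimate);
    }
}
```

With something along these lines, the postings and points cases would share
one formula and differ only in where the two statistics come from.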

> Add back the "estimate match count" optimization
> ------------------------------------------------
>
>                 Key: LUCENE-7262
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7262
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7262.patch
>
>
> Follow-up to my last message on LUCENE-7051: I removed this optimization a 
> while ago because it made things a bit more complicated but did not seem to 
> help with point queries. However the reason why it did not seem to help was 
> that the benchmark only runs queries that match 25% of the dataset. This 
> makes the run time completely dominated by calls to FixedBitSet.set so the 
> call to FixedBitSet.cardinality() looks free. However with slightly sparser 
> queries like the geo benchmark generates (dense enough to trigger the 
> creation of a FixedBitSet but sparse enough so that FixedBitSet.set does not 
> dominate the run time), one can notice speed-ups when this call is skipped.



