[
https://issues.apache.org/jira/browse/LUCENE-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252929#comment-16252929
]
David Smiley commented on LUCENE-8040:
--------------------------------------
bq. no for 7.x you need to handle -1 case for stats, just like MultiTerms
currently does.
Oh yeah, thanks for the tip. So adding support for -1 stats would be pretty
annoying here... like this, but for all 3 stats (docCount,
sumTotalTermFreq, sumDocFreq):
Instead of
{code}
docCount += terms.getDocCount();
{code}
We'd have:
{code}
int tmpDC = terms.getDocCount();
docCount = tmpDC == -1 ? -1 : docCount + tmpDC;
{code}
But even then it's not completely equivalent if the stats are -1 in some
segments but not all: a -1 from an earlier segment gets added to by a later
segment instead of staying -1. Do you think that matters [~rcmuir]? I'm
tempted to just not backport to 7.x.
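For illustration only, a minimal sketch of a "sticky" variant, where the
total stays -1 once any segment reports -1. The names StatsMergeSketch and
sumOrUnknown are hypothetical and not part of Lucene, and the per-segment
values are passed in as a plain array rather than read from Terms:

```java
// Hypothetical helper, not Lucene API: sums per-segment stats, treating -1 as "unknown".
class StatsMergeSketch {
    /** Returns the sum of the values, or -1 if any single value is -1. */
    static long sumOrUnknown(long[] perSegmentValues) {
        long total = 0;
        for (long value : perSegmentValues) {
            if (value == -1) {
                return -1; // one segment lacks the stat, so the total is unknown
            }
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOrUnknown(new long[] {3, 4, 5}));  // prints 12
        System.out.println(sumOrUnknown(new long[] {3, -1, 5})); // prints -1
    }
}
```

The same helper could serve docCount by widening the int to long and
narrowing the result back; that is an assumption of this sketch, not
something taken from the patch.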
> Optimize IndexSearcher.collectionStatistics
> -------------------------------------------
>
> Key: LUCENE-8040
> URL: https://issues.apache.org/jira/browse/LUCENE-8040
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Reporter: David Smiley
> Assignee: David Smiley
> Fix For: 7.2
>
> Attachments: LUCENE-8040.patch, LUCENE-8040.patch, MyBenchmark.java,
> lucenecollectionStatisticsbench.zip
>
>
> {{IndexSearcher.collectionStatistics(field)}} can do a fair amount of work
> because with each invocation it will call {{MultiFields.getTerms(...)}}. The
> effects of this are aggravated for queries with many fields since each field
> will want statistics, and also aggravated when there are many segments.