[jira] [Commented] (LUCENE-8040) Optimize IndexSearcher.collectionStatistics

Robert Muir (JIRA) Mon, 06 Nov 2017 12:53:43 -0800

    [ https://issues.apache.org/jira/browse/LUCENE-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240867#comment-16240867 ]


Robert Muir commented on LUCENE-8040:
-------------------------------------

Also, as far as lowering the overhead of getting to a field goes, I think the 
better fix is probably in BlockTreeTermsReader. Today, getting to a specific 
field is O(log N) (TreeMap). Maybe it should be a HashMap instead. 

Either a LinkedHashMap or a separate sorted array can be used for the 
"iterator" functionality, but I think the current code optimizes for the wrong 
case (iterating fields in order, versus getting to a particular field).
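
A minimal sketch of the idea, assuming a hash map for lookups plus a separate 
sorted array for ordered iteration. The class name FieldIndex and its methods 
are hypothetical and are not taken from BlockTreeTermsReader; the value type V 
stands in for whatever per-field reader is stored.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/**
 * Hypothetical illustration only, not Lucene code.
 * Lookup by field name goes through a HashMap (expected O(1)),
 * while iteration in sorted order walks a pre-sorted array of names.
 */
final class FieldIndex<V> implements Iterable<String> {
    // Expected constant-time access to a particular field.
    private final Map<String, V> byName;
    // Field names sorted once at construction; used only for iteration.
    private final String[] sortedNames;

    FieldIndex(Map<String, V> fields) {
        this.byName = new HashMap<>(fields);
        this.sortedNames = fields.keySet().toArray(new String[0]);
        Arrays.sort(this.sortedNames);
    }

    /** Returns the value for the field, or null if the field is absent. */
    V get(String field) {
        return byName.get(field);
    }

    int size() {
        return sortedNames.length;
    }

    /** Iterates field names in sorted order, as a TreeMap's key set would. */
    @Override
    public Iterator<String> iterator() {
        return Collections.unmodifiableList(Arrays.asList(sortedNames)).iterator();
    }
}
```

The trade-off in this sketch is one extra array of names per reader in 
exchange for not paying a tree traversal on every get(field).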

> Optimize IndexSearcher.collectionStatistics
> -------------------------------------------
>
>                 Key: LUCENE-8040
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8040
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: 7.2
>
>         Attachments: lucenecollectionStatisticsbench.zip
>
>
> {{IndexSearcher.collectionStatistics(field)}} can do a fair amount of work 
> because with each invocation it will call {{MultiFields.getTerms(...)}}.  The 
> effects of this are aggravated for queries with many fields since each field 
> will want statistics, and also aggravated when there are many segments.



