[ 
https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701669#comment-13701669
 ] 

Martijn van Groningen commented on LUCENE-3972:
-----------------------------------------------

I noticed that the real cost is in the setNextReader method. In that method 
the collected group values are re-based into ordinals that are valid for the 
upcoming segment (one binary search per collected term). But this is what 
makes the ordinal comparison in the collect method actually work.
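
To make the rebasing cost concrete, here is a minimal sketch in plain Java. It 
is not the Lucene code: the Segment class, the field names and the 
binarySearches counter are hypothetical stand-ins, and only the method names 
setNextReader and collect are taken from the collector discussed here. It 
assumes each segment exposes a sorted term dictionary and one term ordinal per 
document.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class RebaseSketch {

    /** Hypothetical stand-in for a segment: sorted terms plus one term ordinal per doc. */
    public static final class Segment {
        final String[] sortedTerms;
        final int[] docOrds;

        public Segment(String[] sortedTerms, int[] docOrds) {
            this.sortedTerms = sortedTerms;
            this.docOrds = docOrds;
        }
    }

    // Group values collected so far, across all segments.
    private final Set<String> groups = new LinkedHashSet<>();
    // Ordinals already collected; only valid for the current segment.
    private boolean[] seenOrds;
    private Segment current;
    private int binarySearches = 0;

    /** Re-bases every collected group value into the new segment's ordinal space. */
    public void setNextReader(Segment segment) {
        current = segment;
        seenOrds = new boolean[segment.sortedTerms.length];
        for (String group : groups) {
            binarySearches++; // one binary search per collected term
            int ord = Arrays.binarySearch(segment.sortedTerms, group);
            if (ord >= 0) {
                seenOrds[ord] = true; // term exists in this segment
            }
        }
    }

    /** Cheap per-document check: compares ordinals only, never term bytes. */
    public void collect(int doc) {
        int ord = current.docOrds[doc];
        if (!seenOrds[ord]) {
            seenOrds[ord] = true;
            groups.add(current.sortedTerms[ord]);
        }
    }

    public int groupCount() {
        return groups.size();
    }

    public int binarySearches() {
        return binarySearches;
    }

    /** Feeds every document of every segment through the collector. */
    public static RebaseSketch run(Segment... segments) {
        RebaseSketch collector = new RebaseSketch();
        for (Segment segment : segments) {
            collector.setNextReader(segment);
            for (int doc = 0; doc < segment.docOrds.length; doc++) {
                collector.collect(doc);
            }
        }
        return collector;
    }

    public static void main(String[] args) {
        Segment a = new Segment(new String[] {"apple", "cherry"}, new int[] {0, 1, 0});
        Segment b = new Segment(new String[] {"banana", "cherry", "date"}, new int[] {1, 2, 0});
        RebaseSketch c = run(a, b);
        System.out.println(c.groupCount() + " groups, " + c.binarySearches() + " binary searches");
    }
}
```

In this sketch the work done in setNextReader grows with the number of groups 
already collected multiplied by the number of segments visited, while collect 
stays a plain array lookup. With a single segment the loop in setNextReader 
runs over an empty set, so no binary searches happen at all.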

Reducing the number of segments reduces the number of setNextReader 
invocations and makes this feature faster. If you only have one segment 
(optimized index), then the rebasing of the group values doesn't occur, and the 
AllGroupsCollector should be much faster.

I think that a hybrid solution that changes the impl once a predefined 
number of groups has been found would improve the current approach, but 
it would never be faster than just having one segment (or 
global/index-level ordinals).
                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollector, 
> DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by 
> using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

