[
https://issues.apache.org/jira/browse/LUCENE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605237#comment-13605237
]
Michael McCandless commented on LUCENE-4832:
--------------------------------------------
The Integer.MAX_VALUE change looks great!
But one thing I don't like about accumulateGroups is that there's now a
separate (second) loop to sum up totalGroupedHitCount.
Maybe accumulateGroups should do this itself, and then return TopGroups instead
of GroupDocs<Integer>()?
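
To make the suggestion concrete, here is a rough sketch of folding the hit
count into the same loop that builds the groups. It does not use the real
Lucene classes or the code from the patch: GroupSketch, TopGroupsSketch,
AccumulateSketch and the int[] inputs are all hypothetical stand-ins, chosen
only to show the shape of a single-pass accumulateGroups that returns the
groups together with their total.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one group; NOT Lucene's GroupDocs.
final class GroupSketch {
    final int parentDoc;
    final int totalHits;

    GroupSketch(int parentDoc, int totalHits) {
        this.parentDoc = parentDoc;
        this.totalHits = totalHits;
    }
}

// Hypothetical stand-in for the combined result; NOT Lucene's TopGroups.
final class TopGroupsSketch {
    final int totalGroupedHitCount;
    final List<GroupSketch> groups;

    TopGroupsSketch(int totalGroupedHitCount, List<GroupSketch> groups) {
        this.totalGroupedHitCount = totalGroupedHitCount;
        this.groups = groups;
    }
}

final class AccumulateSketch {
    // childCounts[i] is the number of matched children for parentDocs[i]
    // (an illustrative input shape, assumed for this sketch only).
    static TopGroupsSketch accumulateGroups(int[] parentDocs, int[] childCounts) {
        if (parentDocs.length != childCounts.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        List<GroupSketch> groups = new ArrayList<>(parentDocs.length);
        int totalGroupedHitCount = 0;
        for (int i = 0; i < parentDocs.length; i++) {
            groups.add(new GroupSketch(parentDocs[i], childCounts[i]));
            // Summed in the same pass, so no second loop over the groups.
            totalGroupedHitCount += childCounts[i];
        }
        return new TopGroupsSketch(totalGroupedHitCount, groups);
    }
}
```

With parents {10, 20, 30} and child counts {2, 0, 5}, the result holds three
groups and a totalGroupedHitCount of 7.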
> Unbounded getTopGroups for ToParentBlockJoinCollector
> -----------------------------------------------------
>
> Key: LUCENE-4832
> URL: https://issues.apache.org/jira/browse/LUCENE-4832
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/join
> Reporter: Aleksey Aleev
> Attachments: LUCENE-4832.patch, LUCENE-4832.patch
>
>
> The _ToParentBlockJoinCollector#getTopGroups_ method takes several arguments:
> {code:java}
> public TopGroups<Integer> getTopGroups(ToParentBlockJoinQuery query,
>                                        Sort withinGroupSort,
>                                        int offset,
>                                        int maxDocsPerGroup,
>                                        int withinGroupOffset,
>                                        boolean fillSortFields)
> {code}
> and one of them is {{maxDocsPerGroup}}, which sets an upper bound on the
> number of child documents returned within each group.
> {{ToParentBlockJoinCollector}} collects and caches all child documents
> matched by the given {{ToParentBlockJoinQuery}} in {{OneGroup}} objects during
> the search, so it is possible to create {{GroupDocs}} containing all matched
> child documents rather than only the subset bounded by {{maxDocsPerGroup}}.
> When you specify {{maxDocsPerGroup}}, a new queue (I mean
> {{TopScoreDocCollector}}/{{TopFieldCollector}}) is created for each
> group, with {{maxDocsPerGroup}} objects allocated within each queue. This
> can lead to redundant memory allocation when the number of child documents
> in a group is less than {{maxDocsPerGroup}}.
> I suppose there are many cases where you need all child documents matched
> by the query, so it would be nice to be able to get top groups with
> all matched child documents without unnecessary memory allocation.
> A possible solution is to pass a negative {{maxDocsPerGroup}} when you
> need all matched child documents within each group, and to check the
> {{maxDocsPerGroup}} value: if it is negative, create a queue sized to the
> number of matched child documents; otherwise, create a queue of size
> {{maxDocsPerGroup}}.
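
The sizing rule proposed in the description above can be sketched as a small
helper. This is only an illustration of the rule as stated, not code from the
attached patch: the QueueSizing class, the queueSize method and the
matchedChildCount parameter are hypothetical names, and no Lucene collector is
actually created here.

```java
// Hypothetical helper illustrating the proposed rule; not part of Lucene.
final class QueueSizing {
    // Returns the queue size to allocate for one group.
    //   maxDocsPerGroup   - caller-supplied bound; negative means "all children"
    //   matchedChildCount - number of child documents cached for the group
    static int queueSize(int maxDocsPerGroup, int matchedChildCount) {
        if (matchedChildCount < 0) {
            throw new IllegalArgumentException("matchedChildCount must be >= 0");
        }
        if (maxDocsPerGroup < 0) {
            // Unbounded request: size the queue to exactly what was matched.
            return matchedChildCount;
        }
        // Bounded request: keep the size the caller asked for.
        return maxDocsPerGroup;
    }
}
```

For example, queueSize(-1, 7) returns 7, while queueSize(3, 7) returns 3.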