[jira] [Commented] (LUCENE-4832) Unbounded getTopGroups for ToParentBlockJoinCollector

Michael McCandless (JIRA) Mon, 18 Mar 2013 09:00:17 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605237#comment-13605237 ]


Michael McCandless commented on LUCENE-4832:
--------------------------------------------

The Integer.MAX_VALUE change looks great!

But one thing I don't like about accumulateGroups is that there's now a 
separate (second) loop to sum up totalGroupedHitCount.

Maybe accumulateGroups should do this itself, and then return TopGroups instead 
of GroupDocs<Integer>()?
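
A rough sketch of that single-pass idea, for illustration only. The types 
below (Group, Result) are hypothetical stand-ins and not the Lucene classes, 
and the sketch assumes totalGroupedHitCount is the sum of child hits over the 
groups that are returned:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; these are NOT the Lucene classes.
final class GroupAccumulationSketch {

    /** One parent hit together with the child doc IDs cached for it. */
    static final class Group {
        final int parentDoc;
        final int[] childDocs;

        Group(int parentDoc, int[] childDocs) {
            this.parentDoc = parentDoc;
            this.childDocs = childDocs;
        }
    }

    /** Plays the role of TopGroups here: the groups plus the summed hit count. */
    static final class Result {
        final List<Group> groups;
        final int totalGroupedHitCount;

        Result(List<Group> groups, int totalGroupedHitCount) {
            this.groups = groups;
            this.totalGroupedHitCount = totalGroupedHitCount;
        }
    }

    /** Selects the groups from offset onward and sums their child hits in the same loop. */
    static Result accumulateGroups(List<Group> collected, int offset) {
        List<Group> selected = new ArrayList<>();
        int totalGroupedHitCount = 0;
        for (int i = Math.max(0, offset); i < collected.size(); i++) {
            Group group = collected.get(i);
            selected.add(group);
            // Counted here, so the caller needs no second loop.
            totalGroupedHitCount += group.childDocs.length;
        }
        return new Result(selected, totalGroupedHitCount);
    }
}
```

Returning the combined result keeps the count next to the groups it was 
computed from, which is the point of the suggestion above.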
                
> Unbounded getTopGroups for ToParentBlockJoinCollector
> -----------------------------------------------------
>
>                 Key: LUCENE-4832
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4832
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/join
>            Reporter: Aleksey Aleev
>         Attachments: LUCENE-4832.patch, LUCENE-4832.patch
>
>
> The _ToParentBlockJoinCollector#getTopGroups_ method takes several arguments:
> {code:java}
> public TopGroups<Integer> getTopGroups(ToParentBlockJoinQuery query, 
>                                        Sort withinGroupSort,
>                                        int offset,
>                                        int maxDocsPerGroup,
>                                        int withinGroupOffset,
>                                        boolean fillSortFields)
> {code}
> One of them is {{maxDocsPerGroup}}, which specifies the upper bound on the 
> number of child documents returned within each group. 
> {{ToParentBlockJoinCollector}} collects and caches all child documents 
> matched by the given {{ToParentBlockJoinQuery}} in {{OneGroup}} objects 
> during the search, so it is possible to create {{GroupDocs}} with all matched 
> child documents instead of only the part bounded by {{maxDocsPerGroup}}.
> When you specify {{maxDocsPerGroup}}, a new queue (I mean 
> {{TopScoreDocCollector}}/{{TopFieldCollector}}) is created for each group, 
> with {{maxDocsPerGroup}} objects created within each queue. This can lead to 
> redundant memory allocation when the number of child documents in a group is 
> less than {{maxDocsPerGroup}}.
> I suppose there are many cases where you need all child documents matched by 
> the query, so it would be nice to be able to get top groups with all matched 
> child documents without unnecessary memory allocation. 
> A possible solution is to pass a negative {{maxDocsPerGroup}} when you need 
> all matched child documents within each group, and to check the 
> {{maxDocsPerGroup}} value: if it is negative, create a queue sized to the 
> number of matched child documents; otherwise create a queue of size 
> {{maxDocsPerGroup}}. 
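
The sizing rule proposed in the description can be sketched as follows. This 
is a minimal illustration under stated assumptions: QueueSizing and queueSize 
are hypothetical names that do not exist in Lucene, and matchedChildCount 
stands for the number of child documents already cached for one group.

```java
// Hypothetical helper, not part of Lucene: picks the per-group queue size.
final class QueueSizing {

    private QueueSizing() {}

    /**
     * A negative maxDocsPerGroup means "return every matched child", so the
     * queue is sized to the number of children cached for the group.
     * Otherwise the queue is sized to maxDocsPerGroup, as in the description.
     */
    static int queueSize(int maxDocsPerGroup, int matchedChildCount) {
        if (matchedChildCount < 0) {
            throw new IllegalArgumentException("matchedChildCount must be >= 0");
        }
        if (maxDocsPerGroup < 0) {
            return matchedChildCount;
        }
        return maxDocsPerGroup;
    }
}
```

For example, queueSize(-1, 7) yields 7, while queueSize(10, 7) yields 10, 
which still allocates three more slots than the group can fill. Capping the 
bounded case with Math.min(maxDocsPerGroup, matchedChildCount) would be one 
way to avoid that as well; that variation is a suggestion of this sketch, not 
something stated in the issue.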
