[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679618#action_12679618 ]
Stephen Weiss commented on SOLR-236:
------------------------------------

Unfortunately I don't think that will work for us. The collapse.maxdocs setting seems to collapse the oldest documents in the index, but we sort from newest to oldest, so effectively the newest documents in the index are just left out. Not only do they not collapse, they don't appear at all.

If this is the only solution then we will have to stop using the patch... and unfortunately this means in general we will probably have to stop using Solr. The company has already made clear that this functionality is required, and especially since it has been working for several months now, they will be very unlikely to accept that they can't have it anymore.

Anyway, I don't want to give up yet... I'm not convinced this is really a problem of running out of the memory needed to complete the operation, since it only started doing this very recently. How does it run for 3 months with 2GB of RAM without any trouble, and now it fails even with 3GB of RAM? It's not like we just added those 200,000 documents yesterday. They have accumulated over the past few months; in the past 3 days we've added perhaps 20,000 documents. 20,000 more documents (with barely any new search terms at all) means it needs over 1GB more memory than it was already using? If we grow by 25% every year that means by December we will need 50GB of RAM in the machine.
> Field collapsing
> ----------------
>
>                 Key: SOLR-236
>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Emmanuel Keller
>             Fix For: 1.5
>
>         Attachments: collapsing-patch-to-1.3.0-dieter.patch,
> collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch,
> collapsing-patch-to-1.3.0-ivan_3.patch,
> field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch,
> field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff,
> field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff,
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch,
> SOLR-236-FieldCollapsing.patch, solr-236.patch
>
>
> This patch include a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given
> field to a single entry in the result set. Site collapsing is a special case
> of this, where all results for a given web site is collapsed into one or two
> entries in the result set, typically with an associated "more documents from
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation add 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
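
For readers unfamiliar with the patch, the three query parameters named in the quoted issue description (collapse.field, collapse.type, collapse.max) would presumably be passed alongside an ordinary search request. The following is only a rough sketch of building such a request URL. The base URL, the query string, and the field name "site" are hypothetical placeholders, and the helper function is not part of Solr or of the patch.

```python
from urllib.parse import urlencode


def build_collapse_query(base_url, q, field, collapse_max, collapse_type="normal"):
    """Build a request URL carrying the three collapse.* parameters.

    Hypothetical helper for illustration only. The parameter names come from
    the issue description; everything else here is an assumption.
    """
    # The issue description lists exactly two collapse types.
    if collapse_type not in ("normal", "adjacent"):
        raise ValueError("collapse.type must be 'normal' or 'adjacent'")
    params = {
        "q": q,
        "collapse.field": field,        # field used to group results
        "collapse.type": collapse_type, # "normal" is the documented default
        "collapse.max": collapse_max,   # results allowed before collapsing
    }
    return base_url + "?" + urlencode(params)


# Placeholder host, path, and field name; adjust to the actual deployment.
url = build_collapse_query("http://localhost:8983/solr/select", "*:*", "site", 1)
print(url)
```

Whether the patched handler accepts these parameters on the standard select path depends on which of the attached patches is applied, so treat the URL shape as an assumption to verify.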