Hey! Got it working! The problem was that my uniqueField is indexed as long, which the patch does not support. The value is obtained in the getCollapseGroupResult function in AbstractCollapseCollector.java as:

    String schemaId = searcher.doc(docId).get(uniqueIdFieldname);

To support long, int, slong, sint, float, sfloat, ... the value should instead be obtained through the field type, so that the stored value is converted to its readable form. Something like:

    FieldType idFieldType = searcher.getSchema().getFieldType(uniqueIdFieldname);
    String schemaId = "";
    Fieldable name_field = null;
    try {
      name_field = searcher.doc(docId).getFieldable(uniqueIdFieldname);
    } catch (IOException ex) {
      // deal with exception
    }
    if (name_field != null) {
      // let the field type convert the stored value to its readable form
      schemaId = idFieldType.storedToReadable(name_field);
    }


Martijn v Groningen wrote:
>
> The last two parameters are not necessary, since they both default to
> true. Could you run the field collapse tests successfully?
>
> 2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
>>
>> The request I am sending is:
>> http://localhost:8983/solr/select/?q=aaa&version=2.2&start=0&rows=20&indent=on&collapse.field=col&collapse.includeCollapsedDocs.fl=*&collapse.type=adjacent&collapse.info.doc=true&collapse.info.count=true
>>
>> I search for 'aaa' in the content field. All the documents in the result
>> contain that string in the content field.
>>
>> Martijn v Groningen wrote:
>>>
>>> Yes, it should look similar to that. What is the exact request you send
>>> to Solr?
>>> Also, to check whether the patch works correctly, can you run: ant clean test
>>> There are a number of tests that test the field collapse functionality.
>>>
>>> Martijn
>>>
>>>
>>> 2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
>>>>
>>>>> <lst name="collapse_counts">
>>>>>   <str name="field">cat</str>
>>>>>   <lst name="results">
>>>>>     <lst name="009">
>>>>>       <str name="fieldValue">hard</str>
>>>>>       <int name="collapseCount">1</int>
>>>>>       <result name="collapsedDocs" numFound="1" start="0">
>>>>>         <doc>
>>>>>           <long name="id">008</long>
>>>>>           <str name="content">aaa aaa</str>
>>>>>           <str name="col">ccc</str>
>>>>>         </doc>
>>>>>       </result>
>>>>>     </lst>
>>>>>     ...
>>>>>   </lst>
>>>>> </lst>
>>>> I see, looks like I am applying the patch wrongly somehow.
>>>> This is the complete collapse_counts response I am getting:
>>>> <lst name="collapse_counts">
>>>>   <str name="field">col</str>
>>>>   <lst name="results">
>>>>     <lst>
>>>>       <int name="collapseCount">1</int>
>>>>       <int name="collapseCount">1</int>
>>>>       <int name="collapseCount">1</int>
>>>>       <str name="fieldValue">bbb</str>
>>>>       <str name="fieldValue">ccc</str>
>>>>       <str name="fieldValue">xxx</str>
>>>>       <result name="collapsedDocs" numFound="1" start="0">
>>>>         <doc>
>>>>           <long name="id">2</long>
>>>>           <str name="content">aaa aaa</str>
>>>>           <str name="col">bbb</str>
>>>>         </doc>
>>>>       </result>
>>>>       <result name="collapsedDocs" numFound="1" start="0">
>>>>         <doc>
>>>>           <long name="id">8</long>
>>>>           <str name="content">aaa aaa aaa sd</str>
>>>>           <str name="col">ccc</str>
>>>>         </doc>
>>>>       </result>
>>>>       <result name="collapsedDocs" numFound="4" start="0">
>>>>         <doc>
>>>>           <long name="id">12</long>
>>>>           <str name="content">aaa aaa aaa v</str>
>>>>           <str name="col">xxx</str>
>>>>         </doc>
>>>>       </result>
>>>>     </lst>
>>>>   </lst>
>>>> </lst>
>>>>
>>>> As you can see, I am getting a <lst> tag with no name. As I understood
>>>> what you told me, I should be getting as many lst tags as collapsed
>>>> groups, and the name attribute of each lst should be the unique field value.
>>>> So, if the patch was applied correctly, the response should look like:
>>>>
>>>> <lst name="collapse_counts">
>>>>   <str name="field">col</str>
>>>>   <lst name="results">
>>>>     <lst name="354"> (the head value of the collapsed group)
>>>>       <int name="collapseCount">1</int>
>>>>       <str name="fieldValue">bbb</str>
>>>>       <result name="collapsedDocs" numFound="1" start="0">
>>>>         <doc>
>>>>           <long name="id">2</long>
>>>>           <str name="content">aaa aaa</str>
>>>>           <str name="col">bbb</str>
>>>>         </doc>
>>>>       </result>
>>>>     </lst>
>>>>     <lst name="654">
>>>>       <int name="collapseCount">1</int>
>>>>       <str name="fieldValue">ccc</str>
>>>>       <result name="collapsedDocs" numFound="1" start="0">
>>>>         <doc>
>>>>           <long name="id">8</long>
>>>>           <str name="content">aaa aaa aaa sd</str>
>>>>           <str name="col">ccc</str>
>>>>         </doc>
>>>>       </result>
>>>>     </lst>
>>>>     <lst name="654">
>>>>       <int name="collapseCount">1</int>
>>>>       <str name="fieldValue">xxx</str>
>>>>       <result name="collapsedDocs" numFound="4" start="0">
>>>>         <doc>
>>>>           <long name="id">12</long>
>>>>           <str name="content">aaa aaa aaa v</str>
>>>>           <str name="col">xxx</str>
>>>>         </doc>
>>>>       </result>
>>>>     </lst>
>>>>   </lst>
>>>> </lst>
>>>>
>>>> Is this the way the response looks when you use the patch?
>>>> Thanks in advance
>>>>
>>>>
>>>> Martijn v Groningen wrote:
>>>>>
>>>>> Hi Marc,
>>>>>
>>>>> I'm not sure if I follow you completely, but the example you gave is
>>>>> not complete. I'm missing a few tags in your example. Let's assume the
>>>>> following response that the latest patches produce.
>>>>>
>>>>> <lst name="collapse_counts">
>>>>>   <str name="field">cat</str>
>>>>>   <lst name="results">
>>>>>     <lst name="009">
>>>>>       <str name="fieldValue">hard</str>
>>>>>       <int name="collapseCount">1</int>
>>>>>       <result name="collapsedDocs" numFound="1" start="0">
>>>>>         <doc>
>>>>>           <long name="id">008</long>
>>>>>           <str name="content">aaa aaa</str>
>>>>>           <str name="col">ccc</str>
>>>>>         </doc>
>>>>>       </result>
>>>>>     </lst>
>>>>>     ...
>>>>>   </lst>
>>>>> </lst>
>>>>>
>>>>> The result list contains collapse groups. The names of the child
>>>>> elements are the collapse head ids. Everything that falls under the
>>>>> collapse head belongs to that collapse group, and thus adding the document
>>>>> head id to the field value is unnecessary. In the above example, the
>>>>> document with id 009 is the document head of the document with id 008.
>>>>> The document with id 009 should be displayed in the search result.
>>>>>
>>>>> From what you have said, it seems that you properly configured the
>>>>> patch.
>>>>>
>>>>> Martijn
>>>>>
>>>>> 2009/12/7 Marc Sturlese <marc.sturl...@gmail.com>:
>>>>>>
>>>>>> Hey there, I have been testing the last patch and I think either I am
>>>>>> missing something or the way the collapsed documents are shown with
>>>>>> adjacent collapsing can sometimes be confusing.
>>>>>> I am using the patch replacing queryComponent with collapseComponent
>>>>>> (not using both at the same time):
>>>>>> <searchComponent name="query"
>>>>>> class="org.apache.solr.handler.component.CollapseComponent">
>>>>>> What I have noticed is, imagine you get these results in the search:
>>>>>> doc1:
>>>>>>   id:001
>>>>>>   collapseField:ccc
>>>>>> doc2:
>>>>>>   id:002
>>>>>>   collapseField:aaa
>>>>>> doc3:
>>>>>>   id:003
>>>>>>   collapseField:ccc
>>>>>> doc4:
>>>>>>   id:004
>>>>>>   collapseField:bbb
>>>>>>
>>>>>> And in the collapse_counts you get:
>>>>>> <int name="collapseCount">1</int>
>>>>>> <str name="fieldValue">ccc</str>
>>>>>> <result name="collapsedDocs" numFound="1" start="0">
>>>>>>   <doc>
>>>>>>     <long name="id">008</long>
>>>>>>     <str name="content">aaa aaa</str>
>>>>>>     <str name="col">ccc</str>
>>>>>>   </doc>
>>>>>> </result>
>>>>>>
>>>>>> Now, how can I know the head document of doc 008? Both 001 and 003
>>>>>> could be... wouldn't it make sense to connect the uniqueField with
>>>>>> the collapsed documents in some way?
>>>>>>
>>>>>> Adding something to collapse_counts like:
>>>>>> <int name="collapseCount">1</int>
>>>>>> <str name="fieldValue">ccc</str>
>>>>>> <str name="uniqueFieldId">003</str>
>>>>>>
>>>>>> I have currently hacked FieldValueCountCollapseCollectorFactory to
>>>>>> return:
>>>>>> <str name="fieldValue">ccc#003</str>
>>>>>> but this response looks dirty...
>>>>>>
>>>>>> As I said, maybe I am misunderstanding something and this can be
>>>>>> known in some way. In that case, can someone tell me how?
>>>>>> Thanks in advance
>>>>>>
>>>>>>
>>>>>> JIRA j...@apache.org wrote:
>>>>>>>
>>>>>>> [
>>>>>>> https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783484#action_12783484
>>>>>>> ]
>>>>>>>
>>>>>>> Martijn van Groningen edited comment on SOLR-236 at 11/29/09 9:56 PM:
>>>>>>> ----------------------------------------------------------------------
>>>>>>>
>>>>>>> I have attached a new patch that has the following changes:
>>>>>>> # Added caching for the field collapse functionality. Check the
>>>>>>> [solr wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to
>>>>>>> configure field-collapsing with caching.
>>>>>>> # Removed the collapse.max parameter (collapse.threshold must be
>>>>>>> used instead). It was deprecated for a long time.
>>>>>>>
>>>>>>> was (Author: martijn):
>>>>>>> I have attached a new patch that has the following changes:
>>>>>>> # Added caching for the field collapse functionality. Check the
>>>>>>> [solr wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to
>>>>>>> configure the field-collapsing with caching.
>>>>>>> # Removed the collapse.max parameter (collapse.threshold must be
>>>>>>> used instead). It was deprecated for a long time.
>>>>>>>
>>>>>>>> Field collapsing
>>>>>>>> ----------------
>>>>>>>>
>>>>>>>> Key: SOLR-236
>>>>>>>> URL: https://issues.apache.org/jira/browse/SOLR-236
>>>>>>>> Project: Solr
>>>>>>>> Issue Type: New Feature
>>>>>>>> Components: search
>>>>>>>> Affects Versions: 1.3
>>>>>>>> Reporter: Emmanuel Keller
>>>>>>>> Fix For: 1.5
>>>>>>>>
>>>>>>>> Attachments: collapsing-patch-to-1.3.0-dieter.patch,
>>>>>>>> collapsing-patch-to-1.3.0-ivan.patch,
>>>>>>>> collapsing-patch-to-1.3.0-ivan_2.patch,
>>>>>>>> collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch,
>>>>>>>> field-collapse-4-with-solrj.patch, field-collapse-5.patch,
>>>>>>>> field-collapse-5.patch, field-collapse-5.patch,
>>>>>>>> field-collapse-5.patch,
>>>>>>>> field-collapse-5.patch, field-collapse-5.patch,
>>>>>>>> field-collapse-5.patch,
>>>>>>>> field-collapse-5.patch, field-collapse-5.patch,
>>>>>>>> field-collapse-5.patch,
>>>>>>>> field-collapse-5.patch, field-collapse-5.patch,
>>>>>>>> field-collapse-5.patch,
>>>>>>>> field-collapse-solr-236-2.patch, field-collapse-solr-236.patch,
>>>>>>>> field-collapsing-extended-592129.patch,
>>>>>>>> field_collapsing_1.1.0.patch,
>>>>>>>> field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff,
>>>>>>>> field_collapsing_dsteigerwald.diff,
>>>>>>>> field_collapsing_dsteigerwald.diff,
>>>>>>>> quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch,
>>>>>>>> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch,
>>>>>>>> solr-236.patch, SOLR-236_collapsing.patch,
>>>>>>>> SOLR-236_collapsing.patch
>>>>>>>>
>>>>>>>>
>>>>>>>> This patch includes a new feature called "Field collapsing".
>>>>>>>> "Used in order to collapse a group of results with similar value
>>>>>>>> for a given field to a single entry in the result set. Site
>>>>>>>> collapsing is a special case of this, where all results for a given
>>>>>>>> web site is collapsed into one or two entries in the result set,
>>>>>>>> typically with an associated "more documents from this site" link.
>>>>>>>> See also Duplicate detection."
>>>>>>>> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
>>>>>>>> The implementation adds 3 new query parameters (SolrParams):
>>>>>>>> "collapse.field" to choose the field used to group results
>>>>>>>> "collapse.type" normal (default value) or adjacent
>>>>>>>> "collapse.max" to select how many continuous results are allowed
>>>>>>>> before collapsing
>>>>>>>> TODO (in progress):
>>>>>>>> - More documentation (on source code)
>>>>>>>> - Test cases
>>>>>>>> Two patches:
>>>>>>>> - "field_collapsing.patch" for current development version
>>>>>>>> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
>>>>>>>> P.S.: Feedback and misspelling correction are welcome ;-)
>>>>>>>
>>>>>>> --
>>>>>>> This message is automatically generated by JIRA.
>>>>>>> -
>>>>>>> You can reply to this email to add a comment to the issue online.
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://old.nabble.com/-jira--Created%3A-%28SOLR-236%29-Field-collapsing-tp10440315p26674651.html
>>>>>> Sent from the Solr - Dev mailing list archive at Nabble.com.
>>>>>>
>>>>>
>>>>> --
>>>>> Kind regards,
>>>>>
>>>>> Martijn van Groningen
>>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://old.nabble.com/-jira--Created%3A-%28SOLR-236%29-Field-collapsing-tp10440315p26678606.html
>>>> Sent from the Solr - Dev mailing list archive at Nabble.com.
>>>>
>>>
>>> --
>>> Kind regards,
>>>
>>> Martijn van Groningen
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/-jira--Created%3A-%28SOLR-236%29-Field-collapsing-tp10440315p26679037.html
>> Sent from the Solr - Dev mailing list archive at Nabble.com.
>>
>
> --
> Kind regards,
>
> Martijn van Groningen
>

--
View this message in context: http://old.nabble.com/-jira--Created%3A-%28SOLR-236%29-Field-collapsing-tp10440315p26679520.html
Sent from the Solr - Dev mailing list archive at Nabble.com.