Re: Selective Result Grouping
I created an issue in JIRA for this feature:
https://issues.apache.org/jira/browse/SOLR-2884

Martijn v Groningen-2 wrote:
> Ok, I think I get this. I think this can be achieved if one could
> specify a filter inside a group, so that only documents that pass the
> filter get grouped. For example, only group documents with the value
> image for the mimetype field. This filter should be specified per
> group command. Maybe we should open an issue for this?

--
View this message in context: http://lucene.472066.n3.nabble.com/Selective-Result-Grouping-tp3391538p3491886.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Selective Result Grouping
Ok, I think I get this. I think this can be achieved if one could specify a filter inside a group, so that only documents that pass the filter get grouped. For example, only group documents with the value image for the mimetype field. This filter should be specified per group command. Maybe we should open an issue for this?

Martijn

On 1 November 2011 19:58, entdeveloper wrote:
> Martijn v Groningen-2 wrote:
>> When using the group.field option, values must be the same, otherwise
>> they don't get grouped together. Maybe fuzzy grouping would be nice.
>> Grouping videos and images based on mimetype should be easy, right?
>> Videos have a mimetype that starts with video/ and images have a
>> mimetype that starts with image/. Storing the mimetype's subtype and
>> type in separate fields and grouping on the type field would do the job.
>> Of course you need to know the mimetype during indexing, but
>> solutions like Apache Tika can do that for you.
>
> Not necessarily interested in grouping by mimetype (that's an analysis
> issue). I simply used videos and images as an example.
>
> I'm not sure what you mean by fuzzy grouping. But my goal is to have
> collapse be more selective somehow on what gets grouped. As a more specific
> example, I have a field called 'type', with the following possible field
> values:
>
> Type
> --
> image
> video
> webpage
>
> Basically I want to be able to collapse all the images into a single result
> so that they don't fill up the first page of the results. This is not
> possible with the current grouping implementation because if you call
> group.field=type, it'll group everything. I do not want to collapse videos
> or webpages, only images.
>
> I've attached a screenshot of Google's SRP to help explain what I mean.
>
> http://lucene.472066.n3.nabble.com/file/n3471548/Screen_Shot_2011-11-01_at_11.52.04_AM.png
>
> Hopefully that makes more sense. If it's still not clear I can email you
> privately.

--
Kind regards,

Martijn van Groningen
Re: Selective Result Grouping
Martijn v Groningen-2 wrote:
> When using the group.field option, values must be the same, otherwise
> they don't get grouped together. Maybe fuzzy grouping would be nice.
> Grouping videos and images based on mimetype should be easy, right?
> Videos have a mimetype that starts with video/ and images have a
> mimetype that starts with image/. Storing the mimetype's subtype and
> type in separate fields and grouping on the type field would do the job.
> Of course you need to know the mimetype during indexing, but
> solutions like Apache Tika can do that for you.

Not necessarily interested in grouping by mimetype (that's an analysis issue). I simply used videos and images as an example.

I'm not sure what you mean by fuzzy grouping. But my goal is to have collapse be more selective somehow on what gets grouped. As a more specific example, I have a field called 'type', with the following possible field values:

Type
--
image
video
webpage

Basically I want to be able to collapse all the images into a single result so that they don't fill up the first page of the results. This is not possible with the current grouping implementation because if you call group.field=type, it'll group everything. I do not want to collapse videos or webpages, only images.

I've attached a screenshot of Google's SRP to help explain what I mean.

http://lucene.472066.n3.nabble.com/file/n3471548/Screen_Shot_2011-11-01_at_11.52.04_AM.png

Hopefully that makes more sense. If it's still not clear I can email you privately.
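To make the "it'll group everything" point concrete, here is a minimal Python sketch that imitates what grouping on a single field does to a ranked result list. It is only an illustration, not Solr code: `group_by_field`, the sample `hits`, and their ids are made up for this example.

```python
def group_by_field(docs, field):
    """Group every document by its value for `field`, like group.field does."""
    groups = {}  # dicts keep insertion order, so groups appear in rank order
    for doc in docs:
        groups.setdefault(doc[field], []).append(doc["id"])
    return groups


# Hypothetical ranked hits using the three 'type' values from the message.
hits = [
    {"id": "a", "type": "image"},
    {"id": "b", "type": "video"},
    {"id": "c", "type": "image"},
    {"id": "d", "type": "webpage"},
    {"id": "e", "type": "video"},
]

grouped = group_by_field(hits, "type")
for value, ids in grouped.items():
    print(value, ids)
```

Every type ends up in a group here, including the videos and webpages that should have stayed as individual results.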
Re: Selective Result Grouping
> The current grouping functionality using group.field is basically
> all-or-nothing: all documents will be grouped by the field value or none
> will. So there would be no way to, for example, collapse just the videos or
> images like they do in google.

When using the group.field option, values must be the same, otherwise they don't get grouped together. Maybe fuzzy grouping would be nice. Grouping videos and images based on mimetype should be easy, right? Videos have a mimetype that starts with video/ and images have a mimetype that starts with image/. Storing the mimetype's subtype and type in separate fields and grouping on the type field would do the job. Of course you need to know the mimetype during indexing, but solutions like Apache Tika can do that for you.

--
Kind regards,

Martijn van Groningen
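As a rough illustration of the indexing-time split described above, a mimetype string can be separated into its type and subtype before the document is sent to Solr. The field names `mimetype_type` and `mimetype_subtype` and the helper `split_mimetype` are assumptions for this sketch, not fields from any existing schema.

```python
def split_mimetype(mimetype):
    """Split 'image/png' into separate type and subtype field values."""
    media_type, _, subtype = mimetype.partition("/")
    return {
        "mimetype_type": media_type,    # hypothetical field to group on
        "mimetype_subtype": subtype,    # hypothetical field for the rest
    }


for raw in ("image/png", "video/mp4"):
    print(raw, "->", split_mimetype(raw))
```

Grouping on the type field would then put all image/* documents together regardless of subtype.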
Re: Selective Result Grouping
Not necessarily collapse.type=adjacent. That only applies when two docs with the same field value appear next to each other. I'm more concerned with the case where we only want a group of a certain type (no matter where the subsequent docs may be), leaving the rest of the documents ungrouped.

The current grouping functionality using group.field is basically all-or-nothing: all documents will be grouped by the field value or none will. So there would be no way to, for example, collapse just the videos or images like they do in Google.

You're correct that it would be difficult to support this in a sharded environment, but like most other features, it could be made available in a single shard first and then worked toward support in a sharded environment.
Re: Selective Result Grouping
So if I look at the old SOLR-236 field collapsing (http://wiki.apache.org/solr/FieldCollapsingUncommitted), you mean collapse.type=adjacent?

I think we shouldn't change the group.query parameter, since it serves a different purpose. I think it is better to have a new parameter for this different way of grouping:
group.adjacent=[fieldname|function]

I also think it is difficult to support this feature in a sharded environment, since merging the groups is based on the position of documents inside the result list.

Martijn

On 4 October 2011 02:00, entdeveloper wrote:
> I'd like to suggest the ability to collapse results in a way more similar to
> the old SOLR-236 patch, which the current grouping functionality doesn't
> provide. I need the ability to collapse only certain results based on the
> value of a field, leaving all other results intact.
>
> As an example, consider the following documents:
> ID  TYPE
> 1   doc
> 2   image
> 3   image
> 4   doc
>
> My desired behavior is to collapse results where TYPE:image, producing a
> result set like the following:
> 1
> 2 (collapsed, count=2)
> 4
>
> Currently, when using the Result Grouping feature, I only have the ability
> to produce the result set below:
> 1 (grouped, count=2)
> 2 (grouped, count=2)
>
> I'd like to propose repurposing the 'group.query' parameter to achieve this
> behavior. Currently, the group.query parameter behaves exactly like an 'fq'
> (at least in terms of the results that are produced). I have yet to come up
> with a scenario where the group.query could not be accomplished by using the
> other group params and fq.
>
> I'm hoping to collect some thoughts on the subject before submitting a
> ticket to jira. Thoughts?

--
Kind regards,

Martijn van Groningen
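For comparison, adjacent collapsing can be sketched in a few lines of Python. This is a hypothetical illustration of the idea behind collapse.type=adjacent, not the SOLR-236 implementation; `collapse_adjacent` and the sample lists are invented for the example.

```python
from itertools import groupby


def collapse_adjacent(docs, field):
    """Collapse only runs of neighbouring docs that share the same field value."""
    results = []
    for _, run in groupby(docs, key=lambda d: d[field]):
        run = list(run)
        # Keep the first doc of each run as the representative.
        results.append({"id": run[0]["id"], "count": len(run)})
    return results


# The four documents from the quoted example, in their original rank order.
ranked = [
    {"id": 1, "type": "doc"},
    {"id": 2, "type": "image"},
    {"id": 3, "type": "image"},
    {"id": 4, "type": "doc"},
]
# The same documents with the two images no longer next to each other.
interleaved = [ranked[0], ranked[1], ranked[3], ranked[2]]

print(collapse_adjacent(ranked, "type"))
print(collapse_adjacent(interleaved, "type"))
```

With the images side by side they collapse into one entry; once another document sits between them, nothing collapses. The outcome depends on document positions in the list, which is the property that makes merging across shards awkward.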
Selective Result Grouping
I'd like to suggest the ability to collapse results in a way more similar to the old SOLR-236 patch, which the current grouping functionality doesn't provide. I need the ability to collapse only certain results based on the value of a field, leaving all other results intact.

As an example, consider the following documents:

ID  TYPE
1   doc
2   image
3   image
4   doc

My desired behavior is to collapse results where TYPE:image, producing a result set like the following:

1
2 (collapsed, count=2)
4

Currently, when using the Result Grouping feature, I only have the ability to produce the result set below:

1 (grouped, count=2)
2 (grouped, count=2)

I'd like to propose repurposing the 'group.query' parameter to achieve this behavior. Currently, the group.query parameter behaves exactly like an 'fq' (at least in terms of the results that are produced). I have yet to come up with a scenario where the group.query could not be accomplished by using the other group params and fq.

I'm hoping to collect some thoughts on the subject before submitting a ticket to jira. Thoughts?
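The desired behavior in the example above can be sketched as follows. This is a minimal Python illustration, assuming a predicate decides which documents are eligible for collapsing; `collapse_selected` and `should_collapse` are hypothetical names, not Solr parameters.

```python
def collapse_selected(docs, field, should_collapse):
    """Collapse docs that pass `should_collapse` by their `field` value.

    Docs that fail the predicate are returned untouched, in rank order.
    """
    results = []
    groups = {}  # field value -> result entry that represents the group
    for doc in docs:
        if not should_collapse(doc):
            results.append({"id": doc["id"], "collapsed": False, "count": 1})
            continue
        key = doc[field]
        if key in groups:
            # A later member of an existing group only bumps the count.
            groups[key]["count"] += 1
        else:
            entry = {"id": doc["id"], "collapsed": True, "count": 1}
            groups[key] = entry
            results.append(entry)
    return results


docs = [
    {"id": 1, "type": "doc"},
    {"id": 2, "type": "image"},
    {"id": 3, "type": "image"},
    {"id": 4, "type": "doc"},
]

selective = collapse_selected(docs, "type", lambda d: d["type"] == "image")
for row in selective:
    print(row)
```

Unlike adjacent collapsing, a later image would still join the group at position 2 even if other documents sat between them.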