Martijn - Thanks for the reply. I understand your answer about the segments. However, I'm still unclear about how faceting works with respect to the group head. Perhaps an example will clarify my confusion. Suppose I have 3 order documents with the following data:

orderNumber: 1
customerNumber: 1
totalInCents: 1500
productType: 'BOOK'

orderNumber: 2
customerNumber: 1
totalInCents: 500
productType: 'BOOK'

orderNumber: 3
customerNumber: 1
totalInCents: 1000
productType: 'DVD'

Imagine I perform a search for orders with a total greater than or equal to 1000 cents, grouped by customer number. I would expect to get order numbers 1 and 3 back, grouped underneath customer number 1. Let's assume that order number 1 is considered the most relevant document (in your scenario). Will the post group faceting miss that I actually have two facet values for productType: BOOK and DVD?

Thanks!

Josh

On Fri, Aug 5, 2011 at 4:22 AM, Martijn v Groningen <martijn.is.h...@gmail.com> wrote:

> Hi Josh,
>
> For post grouping the documents don't need to reside in the same segment.
> Lucene's grouping module has a collector (TermAllGroupHeadsCollector) that
> can collect the most relevant document for each group (GroupHead). This
> collector can produce an int[] or a FixedBitSet that can be used during
> faceting to produce post group facets (the patch in SOLR-2665 uses this).
> During faceting only the group heads are known. Because of this, field
> values that differ in documents less relevant than the most relevant
> document of a group aren't taken into account. This is the same as in the
> example described in the description of LUCENE-3097.
>
> Hope this helps!
>
> Martijn
>
> On 4 August 2011 22:59, Joshua Harness <jkharnes...@gmail.com> wrote:
>
>> Hello -
>>
>> Please let me know if this question is more appropriate for the user
>> list. I had assumed the developer list was more appropriate since the ticket
>> is still open. I was analyzing the comments on
>> LUCENE-3097 <https://issues.apache.org/jira/browse/LUCENE-3097> and had a
>> couple of questions.
>>
>> A comment
>> <https://issues.apache.org/jira/browse/LUCENE-3097?focusedCommentId=13033953&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13033953>
>> started a small thread that mentioned that all documents in a given group
>> would need to be contiguous and in the same segment. Also - a statement was
>> made that 'The app would have to ensure this'. I was unclear about the
>> result of this conversation. It sounded like maybe this could have turned
>> out to not be the case. What is the status of this? Does my application have
>> to ensure all the documents in the group are in the same segment? How would
>> one accomplish this?
>>
>> Another comment
>> <https://issues.apache.org/jira/browse/LUCENE-3097?focusedCommentId=13038297&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13038297>
>> mentioned that 'we pick only the head doc...as long as the head doc is
>> guaranteed to have the same value for field X, it safe to use that doc to
>> represent the entire group for facet counting'. Does this mean that there
>> is a restriction placed on me that the head document must have field values
>> that match the rest of the documents in the same group? Or is this simply an
>> implementation detail that uses the head document when this condition holds
>> and chooses another strategy when it does not?
>>
>> I am very interested in adopting this patch. However - I am
>> attempting to understand any limitations/conditions so that I may use it
>> correctly. Any advice would be greatly appreciated.
>>
>> Thanks!
>>
>> Josh Harness
>
> --
> Kind regards,
>
> Martijn van Groningen
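
To make the three-order example concrete, here is a minimal Python sketch of faceting over group heads only, following Martijn's description that only the most relevant document of each group is visible during faceting. This is a plain simulation, not Lucene or Solr code. The relevance scores and the helper names `search`, `group_heads` and `facet` are hypothetical; the only assumption taken from the thread is that order 1 is the most relevant document.

```python
from collections import Counter

# The three order documents from the example in this thread.
orders = [
    {"orderNumber": 1, "customerNumber": 1, "totalInCents": 1500, "productType": "BOOK"},
    {"orderNumber": 2, "customerNumber": 1, "totalInCents": 500, "productType": "BOOK"},
    {"orderNumber": 3, "customerNumber": 1, "totalInCents": 1000, "productType": "DVD"},
]

# Hypothetical relevance scores, keyed by orderNumber.
# Assumption: order 1 ranks highest, as stated in the question.
scores = {1: 2.0, 2: 1.5, 3: 1.0}


def search(docs, min_total):
    """Return the documents whose total is at least min_total cents."""
    return [d for d in docs if d["totalInCents"] >= min_total]


def group_heads(docs, group_field):
    """Keep only the highest-scoring document of each group."""
    heads = {}
    for d in docs:
        key = d[group_field]
        best = heads.get(key)
        if best is None or scores[d["orderNumber"]] > scores[best["orderNumber"]]:
            heads[key] = d
    return list(heads.values())


def facet(docs, field):
    """Count how many documents carry each value of field."""
    return dict(Counter(d[field] for d in docs))


matches = search(orders, 1000)                   # orders 1 and 3
heads = group_heads(matches, "customerNumber")   # order 1 only

head_facets = facet(heads, "productType")        # facets seen through group heads
all_facets = facet(matches, "productType")       # facets over every matching doc

print("group-head facets:", head_facets)
print("all-match facets: ", all_facets)
```

Under these assumptions the group-head facets contain only BOOK, while faceting over all matching documents yields both BOOK and DVD. That difference is the case the question asks about: a value that appears only in a less relevant document of the group does not show up when faceting sees the group head alone.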