[ 
https://issues.apache.org/jira/browse/LUCENE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528879#comment-13528879
 ] 

Michael McCandless commented on LUCENE-4610:
--------------------------------------------

I think counting the parent ordinals on the fly is going to be much more 
costly than aggregating up only at the end?

I suspect that was a big part of the gains I saw, since it means we only count 
1 int, not 3, in my test (but we should test it separately).  I realize that 
means the NoParentsAccumulator would not be "general purpose", because you 
couldn't use it on multi-valued fields, but I suspect that in the common case 
many facet dimensions are single-valued.
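
To make "aggregating up only at the end" concrete, here is a minimal sketch, 
not the actual facet module code. It assumes a plain int[] parents array 
(parents[ord] is the parent ordinal, -1 for the root) in which every parent 
has a smaller ordinal than its children; RollupSketch and rollUp are 
hypothetical names used only for illustration.

```java
import java.util.Arrays;

class RollupSketch {
    // exactCounts[ord] holds counts for the exact (leaf) categories only.
    // parents[ord] is the parent ordinal of ord; the root (ordinal 0) has parent -1.
    // Assumption: a parent's ordinal is always smaller than its children's.
    static int[] rollUp(int[] exactCounts, int[] parents) {
        int[] counts = exactCounts.clone();
        // Walk from the highest ordinal down, so each child is fully
        // aggregated before its count is added to its parent.
        for (int ord = counts.length - 1; ord > 0; ord--) {
            int parent = parents[ord];
            if (parent >= 0) {
                counts[parent] += counts[ord];
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // 0=root, 1=Date, 2=Date/2012, 3=Date/2012/11, 4=Date/2012/12
        int[] parents = {-1, 0, 1, 2, 2};
        int[] exact = {0, 0, 0, 2, 3};
        System.out.println(Arrays.toString(rollUp(exact, parents)));
        // prints [5, 5, 5, 2, 3]
    }
}
```

Per document only one ordinal is counted; the parent chain is touched once 
per ordinal in the taxonomy rather than once per matching document.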

Also, for the multi-valued case, having NoParentsAccumulator dedup 
on the fly is likely to be expensive?  I.e. I think that for the 
multi-valued case you will likely want to dedup at indexing time and store the 
full-path ords in the index (i.e. what we do today by default)?
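
For illustration, on-the-fly dedup for the multi-valued case might look 
roughly like the sketch below. This is an assumption about what such an 
accumulator would have to do, not existing code; DedupSketch, countWithDedup 
and the int[][] docOrds input are hypothetical. The per-document set and the 
parent-chain walk are the extra per-hit work I am worried about.

```java
import java.util.HashSet;
import java.util.Set;

class DedupSketch {
    // docOrds[i] holds the exact category ordinals stored for matching document i.
    // parents[ord] is the parent ordinal of ord; the root (ordinal 0) has parent -1.
    // The root itself is not counted in this sketch.
    static int[] countWithDedup(int[][] docOrds, int[] parents) {
        int[] counts = new int[parents.length];
        Set<Integer> seen = new HashSet<>();
        for (int[] ords : docOrds) {
            seen.clear();
            for (int ord : ords) {
                // Walk up the parent chain; stop at the root or at an ordinal
                // already seen for this document (its ancestors were seen too).
                for (int o = ord; o > 0 && seen.add(o); o = parents[o]) {
                    // all work happens in the loop header
                }
            }
            // Each ordinal is counted at most once per document.
            for (int o : seen) {
                counts[o]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // 0=root, 1=Author, 2=Author/Alice, 3=Author/Bob
        int[] parents = {-1, 0, 1, 1};
        int[][] docOrds = {{2, 3}, {2}};
        int[] counts = countWithDedup(docOrds, parents);
        System.out.println("Author count: " + counts[1]);
        // prints "Author count: 2"; adding up the leaf counts would give 3
    }
}
```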
                
> Implement a NoParentsAccumulator
> --------------------------------
>
>                 Key: LUCENE-4610
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4610
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/facet
>            Reporter: Shai Erera
>
> Mike experimented with encoding just the exact category ordinals on 
> LUCENE-4602, and I added OrdinalPolicy.NO_PARENTS, with a comment saying that 
> this requires a special FacetsAccumulator.
> The idea is to write only the exact categories for each document, and then at 
> search time count up the parent chain to compute the requested facets (I say 
> count, but it can be any weight).
> One limitation of such an accumulator is that it cannot be used when e.g. a 
> document is associated with two categories that share the same parent, because 
> that may result in incorrect weights being computed (e.g. a document might have 
> several Authors, and so counting the Author facet may yield wrong counts). So 
> it can be used only when the app knows it doesn't add such facets, or that it 
> always asks to aggregate a 'root' under which this situation never occurs 
> (no document has two categories sharing the same parent).

