[jira] [Commented] (LUCENE-4609) Write a PackedIntsEncoder/Decoder for facets

Shai Erera (JIRA) Mon, 28 Jan 2013 06:55:15 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564314#comment-13564314
 ]


Shai Erera commented on LUCENE-4609:
------------------------------------

Thanks Adrien. From what I can tell, you implemented a PackedInts version of 
what used to be CategoryListCache (per-document ordinals). The patch includes 
many changes to code that I think is unrelated to facets, but I get the basic 
idea. 
So if we had a CategoryListCache interface, we'd have several implementations 
thus far: StraightIntsCache (Mike's int[] version with offsets), 
PackedIntsCache (regular PackedInts) and AdriensEfficientPackedIntsCache (your 
version :)). Right?
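
To make the "int[] with offsets" layout concrete, here is a minimal sketch. The interface name is borrowed from the hypothetical CategoryListCache above; the method `ordinals(int)`, the `Sketch` class names and the constructor input are assumptions for illustration, not code from the patch.

```java
import java.util.Arrays;

// Hypothetical interface: one lookup of the category ordinals of a document.
interface CategoryListCacheSketch {
    int[] ordinals(int docId);
}

// "Straight" variant: all ordinals live in one flat int[], and a second
// int[] records where each document's slice starts.
final class StraightIntsCacheSketch implements CategoryListCacheSketch {
    private final int[] offsets; // offsets[doc] .. offsets[doc + 1] is the slice of doc
    private final int[] values;  // concatenated ordinals of all documents

    StraightIntsCacheSketch(int[][] perDocOrdinals) {
        offsets = new int[perDocOrdinals.length + 1];
        int total = 0;
        for (int doc = 0; doc < perDocOrdinals.length; doc++) {
            offsets[doc] = total;
            total += perDocOrdinals[doc].length;
        }
        offsets[perDocOrdinals.length] = total;
        values = new int[total];
        for (int doc = 0; doc < perDocOrdinals.length; doc++) {
            System.arraycopy(perDocOrdinals[doc], 0, values, offsets[doc],
                    perDocOrdinals[doc].length);
        }
    }

    @Override
    public int[] ordinals(int docId) {
        // Copying keeps the sketch simple; a real cache would likely iterate in place.
        return Arrays.copyOfRange(values, offsets[docId], offsets[docId + 1]);
    }
}
```

A packed variant would implement the same interface but store `values` (and possibly `offsets`) bit-packed instead of as plain ints.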

Also, this issue started in order to explore an alternative encoder/decoder for 
per-document category ordinals. Is it ok to conclude that none seemed more 
efficient than dgap+vint?
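
For reference, a minimal sketch of what dgap+vint means here: sort the ordinals, store the differences between neighbors (d-gaps), and write each gap as a variable-length int using 7 bits per byte. The class and method names are made up for illustration and are not the facet module's encoder classes; ordinals are assumed to be non-negative.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical sketch of d-gap + VInt encoding for one document's ordinals.
final class DGapVIntSketch {

    static byte[] encode(int[] ordinals) {
        int[] sorted = ordinals.clone();
        Arrays.sort(sorted);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int ord : sorted) {
            int gap = ord - prev; // small gaps need fewer bytes than raw ordinals
            prev = ord;
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80); // high bit set: more bytes follow
                gap >>>= 7;
            }
            out.write(gap);
        }
        return out.toByteArray();
    }

    static int[] decode(byte[] bytes) {
        int[] result = new int[bytes.length]; // upper bound: every value takes >= 1 byte
        int count = 0;
        int pos = 0;
        int prev = 0;
        while (pos < bytes.length) {
            int gap = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                gap |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            prev += gap; // undo the d-gap step
            result[count++] = prev;
        }
        return Arrays.copyOf(result, count);
    }
}
```

With the ordinals {3, 300, 5, 1000}, the gaps are 3, 2, 295 and 700, which take 1, 1, 2 and 2 bytes respectively.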

Regarding caching, it seems that the Straight impl beats everything so far, 
even packed-ints? It achieved 50% gains for some queries, and on HighTerm, 
where both versions were measured, straight gains 38% while packed gains only 
4%. Yet both versions consume exactly the same amount of RAM, so there's no 
real tradeoff here. It doesn't look like there's any advantage to using the 
packed version?
                
> Write a PackedIntsEncoder/Decoder for facets
> --------------------------------------------
>
>                 Key: LUCENE-4609
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4609
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/facet
>            Reporter: Shai Erera
>            Priority: Minor
>         Attachments: LUCENE-4609.patch, LUCENE-4609.patch, LUCENE-4609.patch, 
> LUCENE-4609.patch, LUCENE-4609.patch
>
>
> Today the facets API lets you write IntEncoder/Decoder to encode/decode the 
> category ordinals. We have several such encoders, including VInt (default), 
> and block encoders.
> It would be interesting to implement and benchmark a 
> PackedIntsEncoder/Decoder, with potentially two variants: (1) receives 
> bitsPerValue up front, when you e.g. know that you have a small taxonomy and 
> the max value you can see and (2) one that decides for each doc on the 
> optimal bitsPerValue, writes it as a header in the byte[] or something.
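
Variant (2) could look roughly like the sketch below: pick the smallest bitsPerValue that fits the largest ordinal of the document, write it as a one-byte header, then pack the values back to back. This is only an illustration and does not use Lucene's PackedInts API; the class name, the one-byte header and the explicit `count` argument to `decode` are assumptions, and ordinals are assumed to be non-negative.

```java
// Hypothetical sketch of a per-document packed encoding with a bitsPerValue header.
final class PerDocPackedSketch {

    static byte[] encode(int[] ordinals) {
        int max = 0;
        for (int ord : ordinals) {
            max = Math.max(max, ord);
        }
        // Bits needed for the largest value; at least 1 so that 0 is representable.
        int bitsPerValue = Math.max(1, 32 - Integer.numberOfLeadingZeros(max));
        int totalBits = ordinals.length * bitsPerValue;
        byte[] out = new byte[1 + (totalBits + 7) / 8];
        out[0] = (byte) bitsPerValue; // header
        int bitPos = 8;               // first bit after the header byte
        for (int ord : ordinals) {
            for (int i = bitsPerValue - 1; i >= 0; i--, bitPos++) {
                if (((ord >>> i) & 1) != 0) {
                    out[bitPos >>> 3] |= (byte) (0x80 >>> (bitPos & 7));
                }
            }
        }
        return out;
    }

    // The caller must supply the number of values, because padding bits in the
    // last byte cannot be told apart from real zeros.
    static int[] decode(byte[] bytes, int count) {
        int bitsPerValue = bytes[0];
        int[] result = new int[count];
        int bitPos = 8;
        for (int n = 0; n < count; n++) {
            int value = 0;
            for (int i = 0; i < bitsPerValue; i++, bitPos++) {
                int bit = (bytes[bitPos >>> 3] >>> (7 - (bitPos & 7))) & 1;
                value = (value << 1) | bit;
            }
            result[n] = value;
        }
        return result;
    }
}
```

Variant (1) would be the same packing loop with bitsPerValue passed in by the caller and no header byte.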

