[ 
https://issues.apache.org/jira/browse/LUCENE-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697601#comment-13697601
 ] 

Paul Elschot commented on LUCENE-5084:
--------------------------------------

bq.  ... what is the CPU tradeoff for all these compressions and how does query 
speed compare across all of them?

We don't really know.
One conclusion from the Vigna paper is that a tuned implementation of an 
Elias-Fano decoder is faster than a tuned PForDelta implementation for highly 
selective phrase queries. I would guess that that is because Elias-Fano uses 
random access to the low bits, where PForDelta only uses bulk decompression of 
the low bits, and Elias-Fano is faster at decoding its high bits than PForDelta 
is at decoding its exceptions.
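
To make the low bits / high bits split concrete, here is a minimal Java sketch of the general Elias-Fano scheme. It is not the code from the attached patch: the class name EliasFanoSketch and its methods are made up for illustration, and java.util.BitSet stands in for the packed long arrays a tuned implementation would use. Each value is split into numLowBits low bits, stored at a fixed width so they can be read by random access, and a high part, stored in unary by setting bit (highPart + i) for the i-th value.

```java
import java.util.BitSet;

/** Hypothetical illustration of Elias-Fano encoding; not the LUCENE-5084 patch. */
final class EliasFanoSketch {
    private final int n;                       // number of encoded values
    private final int numLowBits;              // L: low bits stored verbatim per value
    private final BitSet low = new BitSet();   // n * L bits, fixed width per value
    private final BitSet high = new BitSet();  // unary part: bit (highPart + i) is set for value i

    /** values must be sorted ascending, non-negative and no larger than upperBound. */
    EliasFanoSketch(int[] values, int upperBound) {
        n = values.length;
        int ratio = (n == 0) ? 0 : upperBound / n;
        // L = floor(log2(upperBound / n)), or 0 when that ratio is below 2.
        numLowBits = (ratio < 2) ? 0 : 31 - Integer.numberOfLeadingZeros(ratio);
        int lowMask = (1 << numLowBits) - 1;
        for (int i = 0; i < n; i++) {
            int lowPart = values[i] & lowMask;
            for (int b = 0; b < numLowBits; b++) {
                if ((lowPart & (1 << b)) != 0) {
                    low.set(i * numLowBits + b);
                }
            }
            high.set((values[i] >>> numLowBits) + i);
        }
    }

    /** Random access to the low bits of value i; no other value is decoded. */
    int lowBits(int i) {
        int result = 0;
        for (int b = 0; b < numLowBits; b++) {
            if (low.get(i * numLowBits + b)) {
                result |= 1 << b;
            }
        }
        return result;
    }

    /** Decodes every value by walking the set bits of the unary high part in order. */
    int[] decodeAll() {
        int[] out = new int[n];
        int pos = -1;
        for (int i = 0; i < n; i++) {
            pos = high.nextSetBit(pos + 1);
            out[i] = ((pos - i) << numLowBits) | lowBits(i);
        }
        return out;
    }
}
```

Encoding a sorted array with new EliasFanoSketch(values, upperBound) and calling decodeAll() should return the same values. The point of the sketch is only that lowBits(i) touches a single fixed-width slot, while the high part is a plain bit scan.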

One reason to use Elias-Fano for a DocIdSet here is that its high bits are 
encoded in unary coding which can easily be decoded in two directions, and that 
makes it useful for block joins. The other reason is that its compression is 
quite good, which makes it a nice candidate for in-memory filters.
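
As a rough illustration of the two-direction decoding, the following Java sketch walks only the unary-coded high bits forwards and backwards. The class UnaryHighBitsCursor is hypothetical and is not taken from the patch. It assumes the usual Elias-Fano layout in which the i-th value sets bit (highPart + i), and it relies on java.util.BitSet.nextSetBit and previousSetBit. My understanding is that stepping backwards is what a block join needs when it looks up the parent document preceding a given document, but that use is not shown here.

```java
import java.util.BitSet;

/** Hypothetical cursor over unary-coded high bits; illustration only. */
final class UnaryHighBitsCursor {
    private final BitSet high = new BitSet();
    private int index = -1;  // index of the current value, -1 when before the first
    private int pos = -1;    // bit position of the current value, -1 when before the first

    /** highParts must be non-negative and non-decreasing. */
    UnaryHighBitsCursor(int[] highParts) {
        for (int i = 0; i < highParts.length; i++) {
            high.set(highParts[i] + i);
        }
    }

    /** Moves to the next value and returns its high part, or -1 if there is none. */
    int next() {
        int p = high.nextSetBit(pos + 1);
        if (p < 0) {
            return -1;  // exhausted; the cursor stays on the last value
        }
        pos = p;
        index++;
        return pos - index;
    }

    /** Moves to the previous value and returns its high part, or -1 if there is none. */
    int previous() {
        if (index <= 0) {
            // Nothing before the first value: reset to the initial state.
            index = -1;
            pos = -1;
            return -1;
        }
        pos = high.previousSetBit(pos - 1);
        index--;
        return pos - index;
    }
}
```

Both directions are the same kind of operation, a search for the nearest set bit, so neither needs an auxiliary index over the encoded data.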
                
> EliasFanoDocIdSet
> -----------------
>
>                 Key: LUCENE-5084
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5084
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Paul Elschot
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-5084.patch
>
>
> DocIdSet in Elias-Fano encoding
