[ 
https://issues.apache.org/jira/browse/LUCENE-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708590#action_12708590
 ] 

Paul Elschot commented on LUCENE-1410:
--------------------------------------

A very recent paper with some improvements to PFOR:
Yan, Ding, Suel,
"Inverted Index Compression and Query Processing with Optimized Document
Ordering",
WWW 2009, April 20-24, 2009, Madrid, Spain

Roughly quoting par. 4.2, Optimizing PForDelta compression:
For an exception, we store its lower b bits instead of the offset to the next 
exception in its corresponding slot, while we store the higher overflow bits 
and the offset in two separate arrays. These two arrays are compressed using 
the Simple16 method.
Also, b is chosen to optimize decompression speed. This makes the dependence of 
b on the data quite simple (in the PFOR implementation posted earlier in this 
issue the dependence is more complex), which improves compression speed.
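
A minimal Java sketch of the layout described in that paragraph, to make the
three arrays concrete. It is not code from the paper or from the patches
attached here: the names PForSketch, Block, excGaps and excHighBits are
hypothetical, the slots are left as a plain int[] instead of being bit-packed,
the Simple16 step for the two side arrays is omitted, and storing each offset
as the distance from the previous exception is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; names and layout are assumptions. */
final class PForSketch {

    /** Hypothetical holder for one encoded block. */
    static final class Block {
        final int b;             // number of bits kept in each regular slot
        final int[] slots;       // low b bits of every value (bit-packed in a real codec)
        final int[] excGaps;     // distance from the previous exception (first: from slot 0)
        final int[] excHighBits; // overflow bits (value >>> b) of each exception

        Block(int b, int[] slots, int[] excGaps, int[] excHighBits) {
            this.b = b;
            this.slots = slots;
            this.excGaps = excGaps;
            this.excHighBits = excHighBits;
        }
    }

    /** Splits each value into its low b bits and, for exceptions, its overflow bits. */
    static Block encode(int[] values, int b) {
        if (b < 0 || b > 31) {
            throw new IllegalArgumentException("b must be in 0..31");
        }
        int mask = (1 << b) - 1;
        int[] slots = new int[values.length];
        List<Integer> gaps = new ArrayList<>();
        List<Integer> high = new ArrayList<>();
        int prev = 0;
        for (int i = 0; i < values.length; i++) {
            int v = values[i];
            if (v < 0) {
                throw new IllegalArgumentException("values must be non-negative");
            }
            slots[i] = v & mask;      // the slot always holds the low b bits
            int overflow = v >>> b;   // non-zero only for exceptions
            if (overflow != 0) {
                gaps.add(i - prev);
                high.add(overflow);
                prev = i;
            }
        }
        return new Block(b, slots, toArray(gaps), toArray(high));
    }

    /** Copies the slots, then patches the overflow bits back into the exceptions. */
    static int[] decode(Block blk) {
        int[] out = blk.slots.clone();
        int pos = 0;
        for (int k = 0; k < blk.excGaps.length; k++) {
            pos += blk.excGaps[k];
            out[pos] |= blk.excHighBits[k] << blk.b;
        }
        return out;
    }

    private static int[] toArray(List<Integer> list) {
        return list.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        int[] values = {3, 5, 130, 7, 1, 1000, 2};
        Block blk = encode(values, 3);
        System.out.println(Arrays.toString(decode(blk)));
    }
}
```

Because every slot holds real data (the low b bits), decoding can copy all
slots unconditionally and then apply the patches in a separate short loop.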

Btw, the document ordering there is by URL. For many terms this yields more 
small deltas between doc ids, which allows faster decompression of the doc 
ids.
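
To illustrate why smaller deltas help, here is a small Java sketch that
computes the deltas of a sorted doc id list and the number of bits the largest
delta needs. The class and method names (DocIdGaps, gaps, bitsNeeded, maxBits)
are hypothetical, the doc ids in main are made up, and taking the first delta
relative to 0 is an assumption.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and sample doc ids are assumptions. */
final class DocIdGaps {

    /** Deltas between consecutive doc ids; the first delta is taken from 0. */
    static int[] gaps(int[] sortedDocIds) {
        int[] out = new int[sortedDocIds.length];
        int prev = 0;
        for (int i = 0; i < sortedDocIds.length; i++) {
            out[i] = sortedDocIds[i] - prev;
            prev = sortedDocIds[i];
        }
        return out;
    }

    /** Number of bits needed to store a non-negative value (0 needs 0 bits). */
    static int bitsNeeded(int v) {
        return 32 - Integer.numberOfLeadingZeros(v);
    }

    /** Bit width needed so that every delta fits without becoming an exception. */
    static int maxBits(int[] deltas) {
        int max = 0;
        for (int d : deltas) {
            max = Math.max(max, bitsNeeded(d));
        }
        return max;
    }

    public static void main(String[] args) {
        int[] clustered = {3, 4, 6, 10};       // doc ids close together
        int[] scattered = {100, 5000, 90000};  // doc ids far apart
        System.out.println(Arrays.toString(gaps(clustered)) + " bits=" + maxBits(gaps(clustered)));
        System.out.println(Arrays.toString(gaps(scattered)) + " bits=" + maxBits(gaps(scattered)));
    }
}
```

When documents that share a term sit close together in doc id order, more
deltas fit in a small b, so fewer values need the exception path.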


> PFOR implementation
> -------------------
>
>                 Key: LUCENE-1410
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1410
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Other
>            Reporter: Paul Elschot
>            Priority: Minor
>         Attachments: autogen.tgz, LUCENE-1410b.patch, LUCENE-1410c.patch, 
> LUCENE-1410d.patch, LUCENE-1410e.patch, TermQueryTests.tgz, TestPFor2.java, 
> TestPFor2.java, TestPFor2.java
>
>   Original Estimate: 21840h
>  Remaining Estimate: 21840h
>
> Implementation of Patched Frame of Reference.
