[ 
https://issues.apache.org/jira/browse/LUCENE-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802335#action_12802335
 ] 

Paul Elschot commented on LUCENE-1410:
--------------------------------------

The number of compressed integers is encoded in the block header here only 
because, when I coded it, I did not know that this is not necessary in 
Lucene indexes.

That also means that the header can be used for different compression methods, 
for example in the following way:
Cases encoded in the first byte:
- 32 FrameOfRef cases (#frameBits), followed by 3 bytes for #exceptions 
  (0 for BITS, > 0 for PFOR)
- 16-64 cases for a SimpleNN variant
- 1-8 cases for run length encoding (for example followed by 3 bytes for 
  length and value)
The total number of cases is 49-104, which takes 6-7 bits.
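
As a rough illustration of the header layout described above, here is a minimal 
Java sketch of dispatching on the first byte. The class name, the numbering of 
the case ranges (FrameOfRef first, then SimpleNN, then run length), the mapping 
frameBits = case + 1, and the big-endian 3-byte exception count are all 
assumptions made for this example; the comment only gives the number of cases 
per method, not how they are numbered.

```java
public final class BlockHeaderCases {
    // Assumed upper bounds taken from the counts in the comment (32, up to 64, up to 8).
    public static final int FOR_CASES = 32;
    public static final int SIMPLE_CASES = 64;
    public static final int RLE_CASES = 8;

    public enum Method { FRAME_OF_REF_BITS, PFOR, SIMPLE_NN, RUN_LENGTH }

    // Reads an unsigned 3-byte big-endian value (byte order is an assumption).
    public static int read3Bytes(byte[] block, int offset) {
        return ((block[offset] & 0xFF) << 16)
             | ((block[offset + 1] & 0xFF) << 8)
             | (block[offset + 2] & 0xFF);
    }

    // Chooses the decoding method from the first byte of the block.
    public static Method classify(byte[] block) {
        int c = block[0] & 0xFF;
        if (c < FOR_CASES) {
            // FrameOfRef case: 3 bytes of #exceptions follow; 0 means plain BITS.
            int exceptions = read3Bytes(block, 1);
            return exceptions == 0 ? Method.FRAME_OF_REF_BITS : Method.PFOR;
        }
        if (c < FOR_CASES + SIMPLE_CASES) {
            return Method.SIMPLE_NN;
        }
        if (c < FOR_CASES + SIMPLE_CASES + RLE_CASES) {
            return Method.RUN_LENGTH;
        }
        throw new IllegalArgumentException("unknown header case: " + c);
    }

    // Assumed mapping: FrameOfRef cases 0..31 stand for 1..32 frame bits.
    public static int frameBits(byte[] block) {
        return (block[0] & 0xFF) + 1;
    }

    // Number of bits needed to distinguish the given number of cases (cases >= 2).
    public static int bitsNeeded(int cases) {
        return 32 - Integer.numberOfLeadingZeros(cases - 1);
    }
}
```

With these assumptions, bitsNeeded(49) gives 6 and bitsNeeded(104) gives 7, 
which matches the 6-7 bits mentioned above.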

Run length encoding is good for terms that occur in every document and for the 
frequencies of primary keys.
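
To make this concrete: a term that occurs in every document has document 
deltas that are all 1, and a primary key has frequency 1 everywhere, so a 
whole block collapses to one (length, value) pair. Below is a minimal Java 
sketch of such a block. The layout (one case byte, a 3-byte big-endian run 
length, then a 4-byte big-endian value) and the class name are assumptions 
for this example only; the comment above does not fix the exact format.

```java
import java.util.Arrays;

public final class RunLengthBlock {
    // Assumed layout: [case byte][3-byte run length][4-byte value], big-endian.
    public static byte[] encode(int caseByte, int length, int value) {
        if (length < 0 || length > 0xFFFFFF) {
            throw new IllegalArgumentException("length must fit in 3 bytes");
        }
        return new byte[] {
            (byte) caseByte,
            (byte) (length >>> 16), (byte) (length >>> 8), (byte) length,
            (byte) (value >>> 24), (byte) (value >>> 16),
            (byte) (value >>> 8), (byte) value
        };
    }

    // Expands the run into 'length' copies of 'value'.
    public static int[] decode(byte[] block) {
        int length = ((block[1] & 0xFF) << 16)
                   | ((block[2] & 0xFF) << 8)
                   | (block[3] & 0xFF);
        int value = ((block[4] & 0xFF) << 24)
                  | ((block[5] & 0xFF) << 16)
                  | ((block[6] & 0xFF) << 8)
                  | (block[7] & 0xFF);
        int[] out = new int[length];
        Arrays.fill(out, value);
        return out;
    }
}
```

In this sketch, 1000 document deltas of 1 take 8 bytes regardless of the run 
length.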

The only concern I have is that the code for all these decoding cases might 
fill up the instruction cache.
At the moment I don't know how to deal with that other than by adding such 
cases gradually and running performance tests at each step.


> PFOR implementation
> -------------------
>
>                 Key: LUCENE-1410
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1410
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Other
>            Reporter: Paul Elschot
>            Priority: Minor
>         Attachments: autogen.tgz, LUCENE-1410-codecs.tar.bz2, 
> LUCENE-1410b.patch, LUCENE-1410c.patch, LUCENE-1410d.patch, 
> LUCENE-1410e.patch, TermQueryTests.tgz, TestPFor2.java, TestPFor2.java, 
> TestPFor2.java
>
>   Original Estimate: 21840h
>  Remaining Estimate: 21840h
>
> Implementation of Patched Frame of Reference.

