[ 
https://issues.apache.org/jira/browse/LUCENE-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990570#comment-12990570
 ] 

Robert Muir commented on LUCENE-2886:
-------------------------------------

Hi Renaud:

The BulkVInt codec is VInt implemented as a FixedIntBlock codec.
It reads a single numBytes VInt header, then a byte[], and decodes BLOCKSIZE
vints out of it.
The reason for doing it this way is that it performs quite differently from
"StandardCodec", since StandardCodec has to call readByte() readByte()
readByte() ...

You can see the code here: 
http://svn.apache.org/repos/asf/lucene/dev/branches/bulkpostings/lucene/src/java/org/apache/lucene/index/codecs/bulkvint/BulkVIntCodec.java
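To illustrate the block layout described above (a numBytes VInt header, then
a byte[] holding BLOCKSIZE vints), here is a minimal standalone sketch. It is
not the code from BulkVIntCodec.java: the class name BulkVIntSketch, the
BLOCKSIZE value of 128, and the use of java.nio.ByteBuffer as a stand-in for
the index input are all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

public class BulkVIntSketch {
  // Assumed block size for this sketch; the real codec may use another value.
  static final int BLOCKSIZE = 128;

  // Standard VInt: 7 payload bits per byte, high bit set on all but the last byte.
  static void writeVInt(ByteArrayOutputStream out, int v) {
    while ((v & ~0x7F) != 0) {
      out.write((v & 0x7F) | 0x80);
      v >>>= 7;
    }
    out.write(v);
  }

  // Encode one block as: VInt(numBytes) followed by numBytes of VInt data.
  static byte[] encodeBlock(int[] values) {
    ByteArrayOutputStream body = new ByteArrayOutputStream();
    for (int v : values) {
      writeVInt(body, v);
    }
    ByteArrayOutputStream block = new ByteArrayOutputStream();
    writeVInt(block, body.size());
    block.write(body.toByteArray(), 0, body.size());
    return block.toByteArray();
  }

  // Decode one block: read the header, fetch the payload with a single bulk
  // read, then decode BLOCKSIZE vints from the in-memory array.
  static int[] decodeBlock(ByteBuffer in) {
    // The header is the only part read byte by byte from the input.
    int numBytes = 0;
    int shift = 0;
    byte b;
    do {
      b = in.get();
      numBytes |= (b & 0x7F) << shift;
      shift += 7;
    } while ((b & 0x80) != 0);

    byte[] buf = new byte[numBytes];
    in.get(buf); // one bulk read instead of one call per byte

    int[] out = new int[BLOCKSIZE];
    int pos = 0;
    for (int i = 0; i < BLOCKSIZE; i++) {
      byte x = buf[pos++];
      int v = x & 0x7F;
      for (int s = 7; (x & 0x80) != 0; s += 7) {
        x = buf[pos++];
        v |= (x & 0x7F) << s;
      }
      out[i] = v;
    }
    return out;
  }
}
```

The point of the sketch is only that the per-value decoding loop runs over a
byte[] already in memory, so the cost of the underlying I/O is paid once per
block rather than once per byte.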

One reason for having this codec is to separate the performance improvements
of the actual compression algorithms from the way they do their underlying
I/O. Previously, various codecs looked much faster than VInt, but a lot of
that was due to the way VInt was implemented.

And yes, you are correct: nebraska is a lower-frequency term. +united +states
is a more "normal" phrase query, while +nebraska +states is a phrase query
intended to do a lot of advance()'ing.
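
As a rough illustration of why a rare term paired with a frequent one
exercises advance(), here is a simplified sketch of intersecting two sorted
doc ID lists. It is not Lucene's scorer code: the class AdvanceSketch, the
linear-scan advance() helper, and the call counter are hypothetical and exist
only for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class AdvanceSketch {
  // Counts how often advance() is called, for illustration only.
  static int advanceCalls = 0;

  // Return the first index >= from whose doc ID is >= target
  // (postings.length if there is none). Real postings would use skip data.
  static int advance(int[] postings, int from, int target) {
    advanceCalls++;
    int i = from;
    while (i < postings.length && postings[i] < target) {
      i++;
    }
    return i;
  }

  // Walk the rare term's postings and advance() the frequent term's
  // postings to each candidate doc ID.
  static List<Integer> intersect(int[] rare, int[] frequent) {
    List<Integer> hits = new ArrayList<>();
    int j = 0;
    for (int doc : rare) {
      j = advance(frequent, j, doc);
      if (j == frequent.length) {
        break; // frequent list exhausted
      }
      if (frequent[j] == doc) {
        hits.add(doc);
      }
    }
    return hits;
  }
}
```

In this sketch each advance() call skips past many of the frequent term's
postings, which is the access pattern the +nebraska +states query is meant
to stress.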


> Adaptive Frame Of Reference 
> ----------------------------
>
>                 Key: LUCENE-2886
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2886
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Codecs
>            Reporter: Renaud Delbru
>             Fix For: 4.0
>
>         Attachments: LUCENE-2886_simple64.patch, 
> LUCENE-2886_simple64_varint.patch, lucene-afor.tar.gz
>
>
> We could test the implementation of the Adaptive Frame Of Reference [1] on 
> the lucene-4.0 branch.
> I am providing the source code of its implementation. Some work needs to be 
> done, as this implementation is working on the old lucene-1458 branch. 
> I will attach a tarball containing a running version (with tests) of the AFOR 
> implementation, as well as the implementations of PFOR and of Simple64 
> (simple family codec working on 64bits word) that has been used in the 
> experiments in [1].
> [1] http://www.deri.ie/fileadmin/documents/deri-tr-afor.pdf


        
