On Sat, Feb 16, 2013 at 5:40 AM, Sebastiano Vigna <vi...@di.unimi.it> wrote:
> I'd like to redo the benchmarks published on MG4J's home page with Lucene 
> 4.1. However, for this I'd need to know whether when using PForDelta coding 
> the counts (a.k.a. within-document frequencies) are stored interleaved with 
> the document pointers as in 3.6.2 (and, if not so, the cheapest way to force 
> a count read for each returned document, even modifying the code if it's 
> more efficient than otherwise).
>
> It would also be important for me to force PForDelta everywhere, if possible, 
> as the point is benchmarking different index representations, and mixing with 
> variable-byte makes the benchmark difficult to interpret.

But forcing that wouldn't be testing the 4.1 index format; it would be
testing something else (something not interesting).

The 4.1 index format mixes in variable-byte coding because it's more
efficient than using FOR everywhere. FOR blocks in this format are
always of size 128; the remainder (fewer than 128 entries) is encoded
as vbyte.
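
To make the split concrete, here is a minimal sketch (not Lucene's actual
code) of how a postings list of a given length divides into full
128-entry blocks plus a vbyte tail. The class and method names are
hypothetical, and the vbyte routine is the generic 7-bits-per-byte scheme,
assumed here only for illustration.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration class; not part of Lucene.
public class BlockSplitSketch {
    // Block size stated above for the 4.1 format.
    static final int BLOCK_SIZE = 128;

    // Number of full blocks that would be FOR-packed.
    static int fullBlocks(int docFreq) {
        return docFreq / BLOCK_SIZE;
    }

    // Number of trailing entries that would be vbyte-encoded instead.
    static int vbyteTail(int docFreq) {
        return docFreq % BLOCK_SIZE;
    }

    // Generic variable-byte encoding of a non-negative int:
    // 7 data bits per byte, low-order group first,
    // high bit set when more bytes follow.
    static byte[] vbyte(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int docFreq = 300; // example term appearing in 300 documents
        System.out.println(fullBlocks(docFreq) + " FOR blocks, "
                + vbyteTail(docFreq) + " vbyte entries");
        // prints "2 FOR blocks, 44 vbyte entries"
    }
}
```

So a term with fewer than 128 postings never produces a FOR block at
all, which is why short lists end up entirely in vbyte.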

