On Sat, Feb 16, 2013 at 5:40 AM, Sebastiano Vigna <vi...@di.unimi.it> wrote:
> I'd like to redo the benchmarks published on MG4J's home page with Lucene
> 4.1. However, for this I'd need to know whether, when using PForDelta coding,
> the counts (a.k.a. within-document frequencies) are stored interleaved with
> the document pointers as in 3.6.2 (and, if not so, the cheapest way to force
> a count read for each returned document, even modifying the code if that's
> more efficient than otherwise).
>
> It would also be important for me to force PForDelta everywhere, if possible,
> as the point is benchmarking different index representations, and mixing with
> variable-byte makes the benchmark difficult to interpret.
But forcing that wouldn't be testing the 4.1 index format; it would be something else (something not interesting). The 4.1 index format mixes in variable-byte because it's more efficient than using FOR everywhere. FOR blocks in this format are always exactly 128 values, and whatever is left over after the last full block is encoded as vbyte.
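To make the split concrete, here is a rough Java sketch of how a posting list of a given length divides into full 128-value FOR blocks plus a vbyte tail. It is only an illustration of the idea described above, not Lucene's actual code or on-disk layout: the class name BlockLayoutSketch, its methods, and the assumed one-byte bit-width header per block are all hypothetical, and the deltas are assumed to be non-negative.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical sketch, not Lucene code: splits a list of doc-ID deltas into
// fixed 128-value bit-packed (FOR) blocks plus a variable-byte tail.
final class BlockLayoutSketch {
    static final int BLOCK_SIZE = 128;

    // Number of complete 128-value blocks for a term with docFreq postings.
    static int fullBlocks(int docFreq) {
        return docFreq / BLOCK_SIZE;
    }

    // Number of leftover postings that would be written as vbyte.
    static int vbyteTail(int docFreq) {
        return docFreq % BLOCK_SIZE;
    }

    // Bits needed to store the largest value in deltas[from, to).
    // Assumes non-negative values; uses at least 1 bit.
    static int bitsRequired(int[] deltas, int from, int to) {
        int or = 0;
        for (int i = from; i < to; i++) {
            or |= deltas[i];
        }
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(or));
    }

    // Classic variable-byte encoding: 7 payload bits per byte, low bits first,
    // high bit set on every byte except the last.
    static void writeVInt(ByteArrayOutputStream out, int v) {
        while ((v & ~0x7F) != 0) {
            out.write((v & 0x7F) | 0x80);
            v >>>= 7;
        }
        out.write(v);
    }

    // Estimated size: each full block costs an assumed 1-byte header plus
    // 128 * bits / 8 bytes of packed data; the tail costs its vbyte length.
    static int encodedSizeInBytes(int[] deltas) {
        int full = fullBlocks(deltas.length);
        int size = 0;
        for (int b = 0; b < full; b++) {
            int bits = bitsRequired(deltas, b * BLOCK_SIZE, (b + 1) * BLOCK_SIZE);
            size += 1 + (BLOCK_SIZE * bits) / 8;
        }
        ByteArrayOutputStream tail = new ByteArrayOutputStream();
        for (int i = full * BLOCK_SIZE; i < deltas.length; i++) {
            writeVInt(tail, deltas[i]);
        }
        return size + tail.size();
    }

    public static void main(String[] args) {
        int[] deltas = new int[300];
        Arrays.fill(deltas, 1);
        System.out.println("full blocks: " + fullBlocks(deltas.length));
        System.out.println("vbyte tail:  " + vbyteTail(deltas.length));
        System.out.println("bytes:       " + encodedSizeInBytes(deltas));
    }
}
```

Under these assumptions, a term with 300 postings gets two full blocks and a 44-value vbyte tail, and any term with fewer than 128 postings is vbyte only. That is why a benchmark over this format always exercises both encodings.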