On 16 February 2013 11:45, Robert Muir <rcm...@gmail.com> wrote:
> But forcing that wouldn't be testing the 4.1 index format, it would be
> something else (something not interesting).

Do you mind if I have my own share of knowledge, and my own idea of what makes an interesting benchmark? :)

You didn't answer, but the subtext *seems* to be that counts are no longer interleaved. Again, is that the case?

Forcing a count is an essential test of index efficiency, as you need counts to do scoring. Testing with a scorer is not a good idea because the scorer's CPU usage is difficult to control across different implementations. So the only way of testing a non-interleaved index against an interleaved index (or comparing the speed of count access against a non-interleaved index) is to force a count read without any other activity.

The 4.1 index format mixes FOR with variable byte because it's more efficient than using FOR everywhere. This means FOR blocks in this format are always size 128. The remainder is encoded as vbyte.

So essentially you code every block of 128 postings using FOR, but fall back to VByte for the tail (<128). For low-frequency terms, this means just VByte. Right?
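
To make sure we are talking about the same layout, here is a minimal sketch of my reading of it. This is not Lucene's actual on-disk format: the one-byte width header, the little-endian packing and the helper names (`vbyte_encode`, `for_pack`, `encode_gaps`) are all my own assumptions for illustration. Only the block size of 128 and the VByte tail come from the description above.

```python
BLOCK_SIZE = 128  # block size stated in this thread; everything else is assumed


def vbyte_encode(n: int) -> bytes:
    """Variable-byte code: 7 payload bits per byte, high bit set on all but the last byte."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def for_pack(block: list[int]) -> bytes:
    """Frame-of-reference style packing: every value in the block uses the same bit width."""
    width = max(1, max(block).bit_length())
    acc = 0
    for i, value in enumerate(block):
        acc |= value << (i * width)
    nbytes = (width * len(block) + 7) // 8
    # Hypothetical header: one byte holding the bit width.
    return bytes([width]) + acc.to_bytes(nbytes, "little")


def encode_gaps(gaps: list[int]) -> list[tuple[str, bytes]]:
    """Encode full blocks of 128 gaps with FOR and the remaining tail (<128) with VByte."""
    chunks = []
    full = len(gaps) - len(gaps) % BLOCK_SIZE
    for start in range(0, full, BLOCK_SIZE):
        chunks.append(("FOR", for_pack(gaps[start:start + BLOCK_SIZE])))
    tail = gaps[full:]
    if tail:
        chunks.append(("VBYTE", b"".join(vbyte_encode(g) for g in tail)))
    return chunks


if __name__ == "__main__":
    # 300 postings: two FOR blocks of 128, then a 44-entry VByte tail.
    print([kind for kind, _ in encode_gaps([1] * 300)])
    # A low-frequency term (3 postings) never reaches a FOR block.
    print([kind for kind, _ in encode_gaps([5, 3, 2])])
```

Under these assumptions, a term with fewer than 128 postings yields a single VByte chunk, which is the case I am asking about. The sketch says nothing about where the counts live, which is still my open question.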