Hi
   I tried to integrate PForDelta into Lucene 2.9 but ran into a problem.
   I use the implementation from
http://code.google.com/p/integer-array-compress-kit/
   It implements a basic PForDelta algorithm and an improved one (called
NewPForDelta; it had many bugs, which I have fixed).
   But compared with VInt and S9, it is very slow when decoding only a
small number of integer arrays.
   For example, I decoded int[256] arrays whose values are randomly
generated between 0 and 100. When decoding just one array, PFor (or
NewPFor) is very slow. When it decodes many arrays in a row, such as
10000, it is faster than S9 and VInt.
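
For reference, the test input described above can be sketched as follows. This is only an illustration: the class VIntRoundTrip and its methods are hypothetical names, not part of Lucene or the compress kit, and the VInt layout shown (7 data bits per byte, low-order bits first, high bit set on every byte except the last) is the one Lucene documents for its VInt type.

```java
// Hypothetical helper for illustration; not code from Lucene or the compress kit.
final class VIntRoundTrip {

    /** Builds a test array of `length` values drawn uniformly from 0..maxInclusive. */
    static int[] randomArray(int length, int maxInclusive, long seed) {
        java.util.Random rnd = new java.util.Random(seed);
        int[] values = new int[length];
        for (int i = 0; i < length; i++) {
            values[i] = rnd.nextInt(maxInclusive + 1);
        }
        return values;
    }

    /** Encodes each int as a VInt: 7 bits per byte, high bit marks "more bytes follow". */
    static byte[] encode(int[] values) {
        java.io.ByteArrayOutputStream out = new java.io.ByteArrayOutputStream();
        for (int v : values) {
            while ((v & ~0x7F) != 0) {
                out.write((v & 0x7F) | 0x80);
                v >>>= 7;
            }
            out.write(v);
        }
        return out.toByteArray();
    }

    /** Decodes `count` VInts from the start of `bytes`. */
    static int[] decode(byte[] bytes, int count) {
        int[] result = new int[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            byte b = bytes[pos++];
            int v = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = bytes[pos++];
                v |= (b & 0x7F) << shift;
            }
            result[i] = v;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] data = randomArray(256, 100, 42L);
        byte[] encoded = encode(data);
        boolean ok = java.util.Arrays.equals(data, decode(encoded, data.length));
        System.out.println(encoded.length + " bytes, round trip ok: " + ok);
    }
}
```

Since every value in this test is below 128, each one takes a single VInt byte, so a 256-element array encodes to 256 bytes.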
   Another strange phenomenon is that when I call the PFor decoder twice,
the second call is faster. If I call PFor first and then NewPFor, NewPFor
is faster; if I reverse the call sequence, the decoder called second is
again the faster one.
   For example:
                ct.testNewPFDCodes(list);
                ct.testPFor(list);
                ct.testVInt(list);
                ct.testS9(list);

NewPFD decode: 3614705
PForDelta decode: 17320
VINT decode: 16483
S9 decode: 19835
When I call them in the following sequence:

                ct.testPFor(list);
                ct.testNewPFDCodes(list);
                ct.testVInt(list);
                ct.testS9(list);

PForDelta decode: 3212140
NewPFD decode: 19556
VINT decode: 16762
S9 decode: 16483
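
One possible explanation, not confirmed by anything measured here, is that whichever decoder runs first also pays for JVM warm-up (class loading and interpreted execution before JIT compilation). A minimal sketch of a harness that runs each decoder untimed before measuring it is shown below. The names WarmupBench, Decoder and averageNanos are hypothetical, and the placeholder decoder in main only clones its input; a real test would plug in the PFor, NewPFD, VInt and S9 decoders.

```java
// Hypothetical benchmark helper for illustration; not part of the compress kit.
final class WarmupBench {

    /** Stand-in for any of the decoders under test. Must return a non-empty array. */
    interface Decoder {
        int[] decode(int[] encoded);
    }

    /** Runs warmupRuns untimed calls, then returns the average nanoseconds over timedRuns calls. */
    static long averageNanos(Decoder decoder, int[] encoded, int warmupRuns, int timedRuns) {
        long sink = 0; // accumulate results so the decode calls are not optimized away
        for (int i = 0; i < warmupRuns; i++) {
            sink += decoder.decode(encoded)[0];
        }
        long start = System.nanoTime();
        for (int i = 0; i < timedRuns; i++) {
            sink += decoder.decode(encoded)[0];
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // keeps sink observable to the JIT
        }
        return elapsed / timedRuns;
    }

    public static void main(String[] args) {
        int[] data = new int[256];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 101;
        }
        Decoder placeholder = encoded -> encoded.clone(); // trivial decoder for the demo
        System.out.println("avg ns/decode: " + averageNanos(placeholder, data, 20_000, 20_000));
    }
}
```

If the gap between the first and second decoder disappears once every decoder gets the same warm-up, the call order was measuring the JVM rather than the algorithms.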

   My implementation groups docIDs and termDocFreqs into blocks of 128
integers. When SegmentTermDocs's next method (or read/readNoTf) is
called, it decodes a block and saves it to a cache.
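
The block-cache idea above can be sketched roughly as follows. This is not the actual SegmentTermDocs patch: BlockCachedPostings and BlockDecoder are hypothetical names, the decoder is an in-memory stand-in for a PForDelta decoder reading from the index, and delta decoding of docIDs and the handling of frequencies are left out.

```java
// Hypothetical sketch of a block cache; not the real SegmentTermDocs code.
final class BlockCachedPostings {
    static final int BLOCK_SIZE = 128;

    /** Stand-in for a PForDelta block decoder. */
    interface BlockDecoder {
        /** Fills dest with block number blockIndex; returns the number of values written, 0 at the end. */
        int decodeBlock(int blockIndex, int[] dest);
    }

    private final BlockDecoder decoder;
    private final int[] cache = new int[BLOCK_SIZE];
    private int cached = 0;    // number of valid values currently in the cache
    private int pos = 0;       // next unread position in the cache
    private int nextBlock = 0; // index of the next block to decode
    private int current;

    BlockCachedPostings(BlockDecoder decoder) {
        this.decoder = decoder;
    }

    /** Advances to the next value, decoding a whole block only when the cache is exhausted. */
    boolean next() {
        if (pos == cached) {
            cached = decoder.decodeBlock(nextBlock++, cache);
            pos = 0;
            if (cached == 0) {
                return false;
            }
        }
        current = cache[pos++];
        return true;
    }

    int doc() {
        return current;
    }

    public static void main(String[] args) {
        final int[] source = new int[300];
        for (int i = 0; i < source.length; i++) {
            source[i] = i;
        }
        BlockDecoder inMemory = (blockIndex, dest) -> {
            int start = blockIndex * BLOCK_SIZE;
            int n = Math.max(0, Math.min(BLOCK_SIZE, source.length - start));
            if (n > 0) {
                System.arraycopy(source, start, dest, 0, n);
            }
            return n;
        };
        BlockCachedPostings postings = new BlockCachedPostings(inMemory);
        int count = 0;
        while (postings.next()) {
            count++;
        }
        System.out.println("values read: " + count);
    }
}
```

With this layout, a caller that only needs a few postings still pays for decoding a full block of 128, which is one reason a per-block decoder can look slow when only a small number of arrays are decoded.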

