Hi,

I tried to integrate PForDelta into Lucene 2.9 but ran into a problem. I use the implementation from http://code.google.com/p/integer-array-compress-kit/. It implements a basic PForDelta algorithm and an improved one (called NewPForDelta; it had many bugs, which I have fixed).

Compared with VInt and S9, its speed is very slow when decoding only a small number of integer arrays. For example, when I decode int[256] arrays whose values are randomly generated between 0 and 100, PFor (or NewPFor) is very slow if I decode just one array. When it continuously decodes many arrays, such as 10000, it is faster than S9 and VInt.

Another strange phenomenon is that when I call the PFor decoder twice, the second call is faster. Or if I call PFor first and then NewPFor, NewPFor is faster. If I reverse the call sequence, the decoder called second is again the faster one. For example, when I call them in this order:

ct.testNewPFDCodes(list); ct.testPFor(list); ct.testVInt(list); ct.testS9(list);
NewPFD decode: 3614705
PForDelta decode: 17320
VINT decode: 16483
S9 decode: 19835

When I call them in the following order instead:

ct.testPFor(list); ct.testNewPFDCodes(list); ct.testVInt(list); ct.testS9(list);

PForDelta decode: 3212140
NewPFD decode: 19556
VINT decode: 16762
S9 decode: 16483

My implementation groups docIDs and termDocFreqs into blocks of 128 integers. When SegmentTermDocs's next method (or readNoTf) is called, it decodes a block and saves it to a cache.
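
To make the block-and-cache scheme concrete, here is a rough sketch of the idea in Java. It is not the actual patch: BlockCacheSketch, BlockDecoder, CachedBlockReader, and plainDecoder are hypothetical names invented for illustration and are not part of Lucene or the integer-array-compress-kit, and the decoder here just copies uncompressed ints so the sketch runs on its own. A real PForDelta, NewPFD, S9, or VInt decoder would sit behind the same interface.

```java
/** Hypothetical sketch only; none of these names are Lucene or compress-kit APIs. */
public class BlockCacheSketch {
    /** Block size assumed from the description above: 128 integers per block. */
    static final int BLOCK_SIZE = 128;

    /** Stand-in for a PForDelta / NewPFD / S9 / VInt block decoder. */
    interface BlockDecoder {
        /** Decodes block number blockIndex into out; returns how many values are valid. */
        int decodeBlock(int blockIndex, int[] out);
    }

    /** Hands out values one at a time, decoding a whole block only when the cache runs dry. */
    static final class CachedBlockReader {
        private final BlockDecoder decoder;
        private final int blockCount;
        private final int[] cache = new int[BLOCK_SIZE];
        private int nextBlock = 0; // index of the next block to decode
        private int cached = 0;    // number of valid values currently in the cache
        private int pos = 0;       // read position inside the cache
        private int current = 0;   // value returned by value()
        int decodeCalls = 0;       // counts block decodes, for inspection

        CachedBlockReader(BlockDecoder decoder, int blockCount) {
            this.decoder = decoder;
            this.blockCount = blockCount;
        }

        /** Advances to the next value; returns false once all blocks are consumed. */
        boolean next() {
            while (pos >= cached) {
                if (nextBlock >= blockCount) {
                    return false;
                }
                cached = decoder.decodeBlock(nextBlock++, cache);
                decodeCalls++;
                pos = 0;
            }
            current = cache[pos++];
            return true;
        }

        int value() {
            return current;
        }
    }

    /** Trivial "decoder" over uncompressed data, only here to make the sketch runnable. */
    static BlockDecoder plainDecoder(final int[] data) {
        return new BlockDecoder() {
            public int decodeBlock(int blockIndex, int[] out) {
                int start = blockIndex * BLOCK_SIZE;
                int n = Math.min(BLOCK_SIZE, data.length - start);
                System.arraycopy(data, start, out, 0, n);
                return n;
            }
        };
    }

    /** Number of blocks needed to hold length values. */
    static int blockCountFor(int length) {
        return (length + BLOCK_SIZE - 1) / BLOCK_SIZE;
    }
}
```

With this shape, each call to next() touches the decoder only once per 128 values, so the per-block decode cost is what the timings above would be measuring.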