Nate,

This is great. I have been looking for more up to date work on this. I'm
currently working on (Zhang 2008) and (Yan 2009). This paper and code looks
to be a goldmine. You just made my day.

Nate your always welcome to join the Hangout this and other papers may be
more interesting for you. I would be down for a few follow papers working
up to (Lemire 2012). An even more challenging than understanding the papers
is how to best incorporate it into Lucy as efficiently as we can.


Thanks,
-Dan


On Sat, Jan 12, 2013 at 10:36 AM, Nathan Kurz <[email protected]> wrote:

> Not sure if I'll be able to participate, but I'd highly recommend that
> anyone reading the Zukowski paper (which is a fine introduction)
> immediately continue on to Daniel Lemire's 2012 "Decoding billions of
> integers per second through vectorization".
>
> http://arxiv.org/pdf/1209.2137v3.pdf
>
> Lemire has written a survey of the current PFOR family of algorithms,
> showing a couple generations of improvements that have happened in the
> 6 years since Zukowski, and proposes another round of improvements on
> top of these.  It has good summaries of the other algorithms, and
> CLEAR WORKING CODE FOR EACH OF THEM!
>
> https://github.com/lemire/FastPFor
>
> The software includes benchmarking so that the different
> algorithms can easily be tested on different data sets.  It would also
> be also to add algorithms to be compared.  His is fully-vectorized
> almost branch-free implementation with improved
> exception handling provides approximately 4x faster encoding and 2x
> faster decoding than the Zukowski's original.   And this implementation is
> already under the Apache license.
>
> --nate
>
>
>
>
> On Fri, Jan 11, 2013 at 1:56 PM, Dan Markham <[email protected]> wrote:
> > The questions are on the Wiki!
> >
> >
> > The Lucy Book Club is taking a break from our book-in-progress this week
> to read a paper on integer compression techniques. One of the algorithms
> described in the paper is PFOR-DELTA (Patched Frame-Of-Reference with delta
> encoding), which is particularly suitable for inverted lists.
> >
> >
> > http://wiki.apache.org/lucy/LucyBookClub
> >
> > Enjoy,
> >
> > -Dan
>

Reply via email to