lz4 is at least 2x faster than Snappy with comparable compression. BLOCK_ENCODING makes sense only if keys are roughly the same size as values (time-series type of data), since it compresses only keys.
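
To make the quoted numbers easier to compare, here is a minimal sketch in Python that expresses each on-disk size from Nick's table (quoted below) as a fraction of the unmodified baseline. The sizes are copied as reported; the `relative_size` helper is a hypothetical name used only for illustration, and nothing here is HBase API.

```python
# On-disk sizes in bytes, copied from the table in the quoted message.
sizes = {
    "none": 1108553612,
    "compression:SNAPPY": 427335534,
    "compression:LZO": 270422088,
    "compression:GZ": 152899297,
    "codec:PREFIX": 1993910969,
    "codec:DIFF": 1960970083,
    "codec:FAST_DIFF": 1061374722,
    "codec:PREFIX_TREE": 1066586604,
}

baseline = sizes["none"]


def relative_size(modifier: str) -> float:
    """Return the size for `modifier` as a fraction of the 'none' baseline."""
    return sizes[modifier] / baseline


if __name__ == "__main__":
    # Print smallest first so the best on-disk result is at the top.
    for modifier in sorted(sizes, key=sizes.get):
        print(f"{modifier:<20} {sizes[modifier]:>12} {relative_size(modifier):6.2f}")
```

A value above 1.0 means the data took more space than the baseline, which is the case for PREFIX and DIFF in this particular run.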

On Wed, Sep 11, 2013 at 1:22 PM, Elliott Clark <[email protected]> wrote:
> To make things even more interesting I've been testing lz4 recently
> and it's been doing very well on my ycsb runs. So there's another
> option to add.
>
> On Wed, Sep 11, 2013 at 12:10 PM, Nick Dimiduk <[email protected]> wrote:
> > Do we have a consolidated resource with information and recommendations
> > about use of the above? For instance, I ran a simple test using
> > PerformanceEvaluation, examining just the size of data on disk for 1G of
> > input data. The matrix below has some surprising results:
> >
> > +--------------------+--------------+
> > | MODIFIER           | SIZE (bytes) |
> > +--------------------+--------------+
> > | none               |   1108553612 |
> > +--------------------+--------------+
> > | compression:SNAPPY |    427335534 |
> > +--------------------+--------------+
> > | compression:LZO    |    270422088 |
> > +--------------------+--------------+
> > | compression:GZ     |    152899297 |
> > +--------------------+--------------+
> > | codec:PREFIX       |   1993910969 |
> > +--------------------+--------------+
> > | codec:DIFF         |   1960970083 |
> > +--------------------+--------------+
> > | codec:FAST_DIFF    |   1061374722 |
> > +--------------------+--------------+
> > | codec:PREFIX_TREE  |   1066586604 |
> > +--------------------+--------------+
> >
> > Where does a wayward soul look for guidance on which combination of the
> > above to choose for their application?
> >
> > Thanks,
> > Nick
>
