Yeah, the next steps are to measure decompression speed and to do a more
thorough comparison between brotli and zstd compression levels. This initial
set of data is just to confirm that the codecs work well on data produced by
Parquet, because a significant number of the columns are dictionary-encoded
before the generic codec is applied. Tables three and four are the cases
that exercise this the most, and both zstd and brotli do really well on
them.
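
For anyone who wants to try this kind of measurement, here is a rough sketch
of timing decompression on dictionary-encoded data. It is only an
illustration, not the benchmark behind the spreadsheet: it uses the Python
standard-library codecs (zlib, bz2, lzma) as stand-ins because zstd and
brotli bindings are not assumed to be installed, and the helper names
(dictionary_encode, benchmark, run) and the synthetic 16-category column are
made up for the example.

```python
import bz2
import lzma
import random
import time
import zlib


def dictionary_encode(values):
    """Replace each value with a one-byte code, loosely imitating the
    dictionary encoding Parquet applies before the generic codec.
    Assumes fewer than 256 distinct values."""
    dictionary = {}
    codes = bytearray()
    for value in values:
        # First occurrence of a value gets the next free code.
        code = dictionary.setdefault(value, len(dictionary))
        codes.append(code)
    # Dict keys keep insertion order, so position i is the value for code i.
    return bytes(codes), list(dictionary)


def benchmark(name, compress, decompress, payload, repeats=5):
    """Return the compression ratio and mean decompression time for one codec."""
    compressed = compress(payload)
    start = time.perf_counter()
    for _ in range(repeats):
        restored = decompress(compressed)
    elapsed = (time.perf_counter() - start) / repeats
    assert restored == payload  # sanity check: the round trip is lossless
    return {
        "codec": name,
        "ratio": len(payload) / len(compressed),
        "decompress_seconds": elapsed,
    }


# Stand-in codecs from the standard library (not zstd or brotli).
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def run(num_values=100_000, seed=0):
    """Build a synthetic low-cardinality column and benchmark every codec on it."""
    rng = random.Random(seed)
    categories = [f"category-{i}" for i in range(16)]
    values = [rng.choice(categories) for _ in range(num_values)]
    codes, _dictionary = dictionary_encode(values)
    return [
        benchmark(name, compress, decompress, codes)
        for name, (compress, decompress) in CODECS.items()
    ]


if __name__ == "__main__":
    for row in run():
        print(f"{row['codec']:5s} ratio={row['ratio']:.2f} "
              f"decompress={row['decompress_seconds'] * 1e3:.3f} ms")
```

Swapping in real zstd or brotli bindings would only mean adding their
compress and decompress functions to CODECS; the numbers from a synthetic
column like this say nothing about the actual tables.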

On Thu, Sep 28, 2017 at 3:51 PM, Tim Armstrong <[email protected]>
wrote:

> Thanks for all the work you've done on benchmarking here, seems like it
> could be a big improvement. I can't seem to find decompression numbers in
> your spreadsheet. I think those should be where some of these newer codecs
> really shine. E.g. zstd's own numbers look really impressive:
> http://facebook.github.io/zstd/
>



-- 
Ryan Blue
Software Engineer
Netflix
