Nice! We should do this. Last time we upgraded, it was an easy perf win: we snappy-compress Parquet by default and spend significant time decompressing in scans.
I filed a JIRA: https://issues.cloudera.org/browse/IMPALA-4846

Is anyone interested in picking this up? It would require adding a new version of snappy to the native-toolchain, then bumping the version in Impala. Good way to learn about how we handle third-party dependencies.

- Tim

On Tue, Jan 31, 2017 at 10:28 AM, Todd Lipcon <[email protected]> wrote:
> I seem to recall that Impala uses snappy as the default codec for a lot of
> compression/decompression. May be worth upgrading to the latest release
> which claims a 20% improvement in decompression performance:
>
> https://github.com/google/snappy/blob/master/NEWS
>
> (just submitted a code review to Kudu to do the same, though we use LZ4
> more than snappy)
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
