[ https://issues.apache.org/jira/browse/IMPALA-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317481#comment-17317481 ]

ASF subversion and git services commented on IMPALA-9997:
---------------------------------------------------------

Commit d7cc510c95c4850190ca02ae1397aef95cde3d98 in impala's branch refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d7cc510 ]

IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions

This updates several compression libraries to their latest versions:
 - Bzip2 1.0.8
 - LZ4 1.9.3
 - Snappy 1.1.8
 - Zlib 1.2.11
 - ZStd 1.4.9
Several of these claim minor performance improvements.

Testing:
 - Ran release exhaustive job and debug core job
 - Ran TPC-H scale 42 with Parquet/Snappy and Parquet/ZSTD.
   (ZSTD tests ran with default compression level.)
   Parquet/Snappy was unchanged. Parquet/ZSTD improved:

+----------+------------------------+---------+------------+------------+----------------+
| Workload | File Format            | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+------------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / zstd / block | 8.50    | -2.10%     | 5.46       | -2.63%         |
+----------+------------------------+---------+------------+------------+----------------+
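
For readers unfamiliar with the Avg and GeoMean columns, the following is a minimal sketch of how such summary deltas can be computed from per-query runtimes. It assumes a delta is the relative change against the baseline, (new - old) / old; the commit message does not say how the report derives its figures. The function names and the runtimes are hypothetical and are not data from the TPC-H run above.

```python
import math


def summarize(times):
    """Return (arithmetic mean, geometric mean) of per-query runtimes in seconds."""
    avg = sum(times) / len(times)
    # Geometric mean via logs: exp(mean(log(t))). Requires all t > 0.
    geomean = math.exp(sum(math.log(t) for t in times) / len(times))
    return avg, geomean


def pct_delta(new, old):
    """Relative change of new vs. old in percent (negative means faster)."""
    return (new - old) / old * 100.0


# Hypothetical per-query runtimes in seconds; NOT measurements from Impala.
baseline = [2.0, 4.0, 8.0, 20.0]
upgraded = [1.9, 3.9, 7.8, 19.8]

base_avg, base_geo = summarize(baseline)
new_avg, new_geo = summarize(upgraded)

print(f"Delta(Avg):     {pct_delta(new_avg, base_avg):+.2f}%")
print(f"Delta(GeoMean): {pct_delta(new_geo, base_geo):+.2f}%")
```

The geometric mean weights each query's relative speedup equally, so it is less dominated by the longest-running queries than the arithmetic mean. That is one reason the two deltas in a report can differ.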

Change-Id: I858f82f773023bd0aea14543f18bd74071758468
Reviewed-on: http://gerrit.cloudera.org:8080/17254
Reviewed-by: Joe McDonnell <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Update to a newer version of LZ4
> --------------------------------
>
>                 Key: IMPALA-9997
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9997
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 4.0
>            Reporter: Joe McDonnell
>            Priority: Major
>              Labels: native-toolchain
>
> Impala currently uses LZ4 version 1.7.5. The LZ4 project lists several 
> performance improvements in later versions:
>  
> {noformat}
> v1.9.0
> perf: large decompression speed improvement on x86/x64 (up to +20%) by @djwatson
> ...
> v1.8.3
> perf: minor decompression speed improvement (~+2%) with gcc
> ...
> v1.8.2
> perf: *much* faster dictionary compression on small files, by @felixhandte
> perf: improved decompression speed and binary size, by Alexey Tourbin (@svpv)
> perf: slightly faster HC compression and decompression speed
> perf: very small compression ratio improvement
> ...
> v1.8.1
> perf : faster and stronger ultra modes (levels 10+)
> perf : slightly faster compression and decompression speed
> perf : fix bad degenerative case, reported by @c-morgenstern
> ...{noformat}
> [https://github.com/lz4/lz4/blob/dev/NEWS]
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

