[ https://issues.apache.org/jira/browse/IMPALA-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317481#comment-17317481 ]
ASF subversion and git services commented on IMPALA-9997:
---------------------------------------------------------
Commit d7cc510c95c4850190ca02ae1397aef95cde3d98 in impala's branch
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d7cc510 ]
IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions
This updates several compression libraries to their latest versions:
- Bzip2 1.0.8
- LZ4 1.9.3
- Snappy 1.1.8
- Zlib 1.2.11
- ZStd 1.4.9
Several of these claim minor performance improvements.
Testing:
- Ran release exhaustive job and debug core job
- Ran TPC-H scale 42 with Parquet/Snappy and Parquet/ZSTD.
  (ZSTD tests ran with default compression level.)
  Parquet/Snappy was unchanged. Parquet/ZSTD improved:
+----------+------------------------+---------+------------+------------+----------------+
| Workload | File Format            | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+------------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / zstd / block | 8.50    | -2.10%     | 5.46       | -2.63%         |
+----------+------------------------+---------+------------+------------+----------------+
Change-Id: I858f82f773023bd0aea14543f18bd74071758468
Reviewed-on: http://gerrit.cloudera.org:8080/17254
Reviewed-by: Joe McDonnell <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Update to a newer version of LZ4
> --------------------------------
>
> Key: IMPALA-9997
> URL: https://issues.apache.org/jira/browse/IMPALA-9997
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 4.0
> Reporter: Joe McDonnell
> Priority: Major
> Labels: native-toolchain
>
> Impala currently uses LZ4 version 1.7.5. The LZ4 project lists several
> performance improvements in later versions:
>
> {noformat}
> v1.9.0
> perf: large decompression speed improvement on x86/x64 (up to +20%) by @djwatson
> ...
> v1.8.3
> perf: minor decompression speed improvement (~+2%) with gcc
> ...
> v1.8.2
> perf: *much* faster dictionary compression on small files, by @felixhandte
> perf: improved decompression speed and binary size, by Alexey Tourbin (@svpv)
> perf: slightly faster HC compression and decompression speed
> perf: very small compression ratio improvement
> ...
> v1.8.1
> perf : faster and stronger ultra modes (levels 10+)
> perf : slightly faster compression and decompression speed
> perf : fix bad degenerative case, reported by @c-morgenstern
> ...{noformat}
> [https://github.com/lz4/lz4/blob/dev/NEWS]
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)