[ https://issues.apache.org/jira/browse/IMPALA-11603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17718355#comment-17718355 ]
ASF subversion and git services commented on IMPALA-11603:
----------------------------------------------------------
Commit 14698c8b99b80db7e6fd99900e32b6742bef1662 in impala's branch
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=14698c8b9 ]
IMPALA-11603: Build against Cloudflare ZLIB by default
Cloudflare Zlib is a fork of the Zlib codebase that
has been optimized to take advantage of CPU SIMD
instructions and other platform-specific optimizations.
It has the same license as regular Zlib. Amazon has
touted this as a major speedup over regular Zlib:
https://aws.amazon.com/blogs/opensource/improving-zlib-cloudflare-and-comparing-performance-with-other-zlib-forks/
This adds the IMPALA_USE_CLOUDFLARE_ZLIB environment
variable, which controls whether Impala is built against
Cloudflare Zlib. It defaults to true. If it is set to
any other value, Impala builds against regular Zlib.
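The selection rule above can be sketched as follows. This only
illustrates the behavior described in the commit message and is not
taken from Impala's build scripts: the function name zlib_flavor is
hypothetical, and the exact, case-sensitive comparison against "true"
is an assumption.

```python
import os


def zlib_flavor(environ=None):
    """Return which Zlib a build would pick under the rule described above.

    Hypothetical helper for illustration; not part of Impala's build system.
    """
    env = os.environ if environ is None else environ
    # An unset variable falls back to the default, stated as "true".
    value = env.get("IMPALA_USE_CLOUDFLARE_ZLIB", "true")
    # Assumption: only the exact string "true" selects Cloudflare Zlib;
    # any other value selects regular Zlib.
    return "cloudflare" if value == "true" else "regular"
```

For example, zlib_flavor({"IMPALA_USE_CLOUDFLARE_ZLIB": "false"})
selects regular Zlib, while an empty environment selects Cloudflare Zlib.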
Cloudflare Zlib shows a clear performance benefit
over regular Zlib on the TPC-H ORC/deflate benchmark:
+----------+-------------------+---------+------------+------------+----------------+
| Workload | File Format       | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-------------------+---------+------------+------------+----------------+
| TPCH(42) | orc / def / block | 4.18    | -6.43%     | 3.29       | -6.74%         |
+----------+-------------------+---------+------------+------------+----------------+
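For a rough local sanity check of deflate speed, a micro-benchmark
along these lines could be used. It is a sketch under stated
assumptions, not the TPC-H run reported above: Python's zlib module
exercises whichever Zlib the interpreter is linked against, so it
measures that library rather than Impala, and the function name and
parameters are hypothetical.

```python
import time
import zlib


def time_deflate_round_trip(payload, level=6, repeats=5):
    """Return (best compress seconds, best decompress seconds, compressed size).

    Hypothetical helper; timings depend on the Zlib linked into Python.
    """
    best_compress = float("inf")
    best_decompress = float("inf")
    compressed = b""
    for _ in range(repeats):
        start = time.perf_counter()
        compressed = zlib.compress(payload, level)
        middle = time.perf_counter()
        restored = zlib.decompress(compressed)
        end = time.perf_counter()
        # Guard against a broken round trip before trusting the timings.
        if restored != payload:
            raise ValueError("round trip did not reproduce the input")
        best_compress = min(best_compress, middle - start)
        best_decompress = min(best_decompress, end - middle)
    return best_compress, best_decompress, len(compressed)
```

Taking the best of several repeats reduces noise from other processes.
Comparing two Zlib builds this way requires running the same payload
under interpreters linked against each library.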
Testing:
- Ran GVO tests and exhaustive release tests
Change-Id: I82c480890726da0fa5bdc2a646022554eec181f4
Reviewed-on: http://gerrit.cloudera.org:8080/19207
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Michael Smith <[email protected]>
Reviewed-by: Wenzhe Zhou <[email protected]>
> Investigate using cloudflare's zlib library
> -------------------------------------------
>
> Key: IMPALA-11603
> URL: https://issues.apache.org/jira/browse/IMPALA-11603
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 4.2.0
> Reporter: Joe McDonnell
> Priority: Major
>
> Amazon recommends the use of cloudflare's zlib implementation at
> [https://github.com/cloudflare/zlib]
> In a blog post, they claim pretty large performance boosts over the regular
> zlib implementation:
> [https://aws.amazon.com/blogs/opensource/improving-zlib-cloudflare-and-comparing-performance-with-other-zlib-forks/]
> {noformat}
> On Arm:
> Compression performance: ~90 percent faster than zlib-madler (original
> zlib).
> Decompression performance: ~52 percent faster than zlib-madler.
> On x86:
> Compression performance: ~113 percent faster than zlib-madler.
> Decompression performance: ~44 percent faster than zlib-madler.{noformat}
> The blog post is a year and a half old, so things may have changed since
> then, but it seems interesting. Amazon's guidebooks still recommend it.