Martijn Visser created FLINK-40054:
--------------------------------------
Summary: Update aircompressor to 2.0.3
Key: FLINK-40054
URL: https://issues.apache.org/jira/browse/FLINK-40054
Project: Flink
Issue Type: Technical Debt
Components: Runtime / Network
Reporter: Martijn Visser
Flink bundles io.airlift:aircompressor 0.27 in flink-runtime (shaded and
relocated), where it provides the LZO and ZSTD codecs for network shuffle
buffer compression (taskmanager.network.compression.codec). The LZ4 codec uses
lz4-java and is unaffected.
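
To make the scope concrete, here is a minimal sketch of which library backs each value of taskmanager.network.compression.codec, as described above. The class ShuffleCodecBackends and its backendFor helper are hypothetical names used only for illustration; they are not part of Flink's API.

```java
import java.util.Locale;

// Hypothetical helper, not Flink code: maps a codec name to the library
// that implements it according to the description in this ticket.
final class ShuffleCodecBackends {

    enum Backend { AIRCOMPRESSOR, LZ4_JAVA }

    static Backend backendFor(String codec) {
        switch (codec.trim().toUpperCase(Locale.ROOT)) {
            case "LZO":
            case "ZSTD":
                // Both codecs come from the bundled io.airlift:aircompressor.
                return Backend.AIRCOMPRESSOR;
            case "LZ4":
                // LZ4 uses lz4-java and is not touched by this upgrade.
                return Backend.LZ4_JAVA;
            default:
                throw new IllegalArgumentException("Unknown codec: " + codec);
        }
    }

    public static void main(String[] args) {
        for (String codec : new String[] {"LZ4", "LZO", "ZSTD"}) {
            System.out.println(codec + " -> " + backendFor(codec));
        }
    }
}
```

Only jobs configured with LZO or ZSTD exercise aircompressor code at all, which is why the upgrade does not change behavior for the LZ4 codec.
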
aircompressor 2.0.3 is the latest release of the Java-8-compatible maintenance
line (branch release-2.x). It is 0.27 plus backported fixes for CVE-2025-67721,
an uninitialized-memory data leak in the Snappy and LZ4 decompressors when the
match offset is zero. Flink's code path only uses aircompressor's LZO and ZSTD
implementations, so the vulnerable decompressors are bundled but not invoked;
the upgrade is dependency hygiene and removes security-scanner findings. The
packages (io.airlift.compress.*), API and bytecode level (Java 8) are
unchanged, making this a drop-in replacement.
Why not aircompressor-v3: the actively developed line was renamed to the
io.airlift:aircompressor-v3 artifact with packages under
io.airlift.compress.v3. Its latest release (3.6) is compiled to Java 25
bytecode (earlier 3.x releases required Java 22), while Flink still compiles at
source level Java 11 and supports Java 11, 17 and 21 at runtime. The package
rename would additionally require code changes. v3 is therefore not an option
until Flink's minimum supported JVM reaches the v3 bytecode level (Java 22 for
earlier 3.x releases, Java 25 for 3.6).
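
The bytecode argument can be checked mechanically. The sketch below assumes the standard class-file layout (magic number 0xCAFEBABE, then minor and major version) and the usual rule that the major version equals the Java feature release plus 44 (Java 8 is 52, Java 11 is 55). The class name ClassFileVersionCheck and its helpers are hypothetical and only illustrate the reasoning; this is not how Flink's build verifies dependencies.

```java
import java.nio.ByteBuffer;

// Illustrative sketch: decide whether classes compiled for a given Java
// release can be loaded by the JVMs that Flink supports at runtime.
final class ClassFileVersionCheck {

    /** Class-file major version for a Java feature release (8 -> 52, 11 -> 55). */
    static int majorVersionFor(int javaRelease) {
        return javaRelease + 44;
    }

    /** Reads the major version from the first 8 bytes of a .class file. */
    static int readMajorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("truncated class file header");
        }
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                 // minor version, ignored here
        return buf.getShort() & 0xFFFF; // major version as unsigned 16-bit
    }

    /** A JVM loads a class only if the class's major version is not newer than its own. */
    static boolean loadableOn(int classMajor, int runtimeJavaRelease) {
        return classMajor <= majorVersionFor(runtimeJavaRelease);
    }

    public static void main(String[] args) {
        int[] flinkRuntimes = {11, 17, 21};
        // Bytecode targets mentioned in this ticket: 2.0.3, earlier 3.x, 3.6.
        int[] dependencyTargets = {8, 22, 25};
        for (int target : dependencyTargets) {
            for (int runtime : flinkRuntimes) {
                System.out.println("Java " + target + " bytecode on Java " + runtime
                        + " runtime: " + loadableOn(majorVersionFor(target), runtime));
            }
        }
    }
}
```

Under these assumptions, Java 8 bytecode loads on every supported runtime, while Java 22 and Java 25 bytecode loads on none of them.
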
Alternatives considered:
* Replacing the ZSTD codec with com.github.luben:zstd-jni (native libzstd via
JNI, as used by Kafka/Spark/Parquet) would likely be faster, but swapping pure
Java for native libraries in the shuffle hot path is a performance-motivated
change that needs benchmarks and its own ticket.
* The LZO codec has no alternative: native liblzo2 and hadoop-lzo are
GPL-licensed, and aircompressor's LZO is the only Apache-2.0-licensed Java
implementation.