hudi-agent commented on code in PR #18818:
URL: https://github.com/apache/hudi/pull/18818#discussion_r3288466908
##########
hudi-common/src/main/java/org/apache/hudi/common/util/collection/BitCaskDiskMap.java:
##########
@@ -408,29 +409,57 @@ private static class CompressionHandler implements Serializable {
private final ByteArrayOutputStream compressBaos;
private final ByteArrayOutputStream decompressBaos;
private final byte[] decompressIntermediateBuffer;
+ // Reusable zlib engines. Each CompressionHandler is held in a ThreadLocal,
+ // so a single Deflater/Inflater pair per worker thread is sufficient and
+ // avoids per-call construction. On JDK 8 every new Deflater/Inflater
+ // registers a finalizer; under concurrent disk-map traffic the Finalizer
+ // thread cannot drain the queue, pinning native zlib handles in old gen
+ // and driving the JVM into a GC death spiral.
+ private transient Deflater deflater;
+ private transient Inflater inflater;
CompressionHandler() {
compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
decompressIntermediateBuffer = new byte[DECOMPRESS_INTERMEDIATE_BUFFER_SIZE];
}
+ // Lazy accessors so the handler stays usable after Java deserialization
+ // (transient fields come back null).
+ private Deflater deflater() {
Review Comment:
🤖 nit: because `deflater()` / `inflater()` share names with their backing
fields, callers end up using `def` and `inf` as local variable names
(line 445 / 458). `def` is a keyword in Groovy, Scala, and Python, which
can cause a mental stumble. Could you rename the accessors to `getDeflater()` /
`getInflater()`? That would let callers write `Deflater deflater =
getDeflater()` cleanly.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
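
For illustration, here is a minimal sketch of what the suggested naming could look like. It is not the code from the PR: the class name `LazyZlibHandler`, the `compress` / `decompress` method shapes, the 1024-byte scratch buffer, and the choice to rethrow `DataFormatException` as `IllegalStateException` are all assumptions made to keep the example self-contained. Only the transient fields, the lazy accessors, and the `getDeflater()` / `getInflater()` names come from the discussion above.

```java
import java.io.ByteArrayOutputStream;
import java.io.Serializable;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical stand-in for CompressionHandler; names and method shapes are assumptions.
class LazyZlibHandler implements Serializable {
  private static final long serialVersionUID = 1L;

  // Transient: zlib engines are not serializable and come back null after deserialization.
  private transient Deflater deflater;
  private transient Inflater inflater;

  // Lazy accessors using the get-prefixed names proposed in the review.
  private Deflater getDeflater() {
    if (deflater == null) {
      deflater = new Deflater(Deflater.BEST_COMPRESSION);
    }
    return deflater;
  }

  private Inflater getInflater() {
    if (inflater == null) {
      inflater = new Inflater();
    }
    return inflater;
  }

  byte[] compress(byte[] input) {
    // The local intentionally shadows the field; it reads cleanly at the call site.
    Deflater deflater = getDeflater();
    deflater.reset(); // reuse the engine instead of constructing a new one per call
    deflater.setInput(input);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    while (!deflater.finished()) {
      int n = deflater.deflate(buffer);
      out.write(buffer, 0, n);
    }
    return out.toByteArray();
  }

  byte[] decompress(byte[] input) {
    Inflater inflater = getInflater();
    inflater.reset();
    inflater.setInput(input);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    try {
      while (!inflater.finished()) {
        int n = inflater.inflate(buffer);
        if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
          // Guard against spinning forever on truncated input.
          throw new IllegalStateException("truncated or unsupported zlib input");
        }
        out.write(buffer, 0, n);
      }
    } catch (DataFormatException e) {
      throw new IllegalStateException("invalid zlib input", e);
    }
    return out.toByteArray();
  }
}
```

With this shape, call sites can use full words for their locals (`Deflater deflater = getDeflater()`), at the cost of the local shadowing the field inside that method.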