hudi-agent commented on code in PR #18818:
URL: https://github.com/apache/hudi/pull/18818#discussion_r3288466908


##########
hudi-common/src/main/java/org/apache/hudi/common/util/collection/BitCaskDiskMap.java:
##########
@@ -408,29 +409,57 @@ private static class CompressionHandler implements Serializable {
     private final ByteArrayOutputStream compressBaos;
     private final ByteArrayOutputStream decompressBaos;
     private final byte[] decompressIntermediateBuffer;
+    // Reusable zlib engines. Each CompressionHandler is held in a ThreadLocal,
+    // so a single Deflater/Inflater pair per worker thread is sufficient and
+    // avoids per-call construction. On JDK 8 every new Deflater/Inflater
+    // registers a finalizer; under concurrent disk-map traffic the Finalizer
+    // thread cannot drain the queue, pinning native zlib handles in old gen
+    // and driving the JVM into a GC death spiral.
+    private transient Deflater deflater;
+    private transient Inflater inflater;
 
     CompressionHandler() {
       compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
       decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
       decompressIntermediateBuffer = new byte[DECOMPRESS_INTERMEDIATE_BUFFER_SIZE];
     }
 
+    // Lazy accessors so the handler stays usable after Java deserialization
+    // (transient fields come back null).
+    private Deflater deflater() {

Review Comment:
   🤖 nit: because the `deflater()` / `inflater()` accessors share names with 
their backing fields, callers end up reaching for `def` and `inf` as local 
variable names (line 445 / 458). `def` is a keyword in Groovy, Scala, and 
Python, which can cause a mental stumble for readers. Could you rename the 
accessors to `getDeflater()` / `getInflater()`? Callers could then write 
`Deflater deflater = getDeflater()` cleanly.
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
