iemejia opened a new pull request, #3579:
URL: https://github.com/apache/parquet-java/pull/3579

   ## Summary
   
   Make the build forward-compatible with JDK 21+ (tested up to JDK 25).
   
   ### Hadoop upgrade (3.3.0 -> 3.4.3)
   
   Hadoop 3.3.x calls `javax.security.auth.Subject.getSubject()`, which throws 
`UnsupportedOperationException` by default on JDK 23 and unconditionally from 
JDK 24 on (JEP 486). Hadoop 3.4.x uses `Subject.current()` instead, restoring 
compatibility.
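   
   The API difference can be illustrated with a minimal sketch. This is not 
Hadoop's code: `SubjectCompat` and `currentSubject()` are hypothetical names, 
and reflection is used only so the example compiles and runs on both older and 
newer JDKs. It prefers `Subject.current()` (available since JDK 18) and falls 
back to `Subject.getSubject(AccessControlContext)` where `current()` is missing.

```java
import java.lang.reflect.Method;
import javax.security.auth.Subject;

/** Hypothetical helper, for illustration only; not taken from Hadoop. */
final class SubjectCompat {

  private SubjectCompat() {}

  /** Returns the Subject associated with the caller, or null if there is none. */
  static Subject currentSubject() {
    try {
      Method current;
      try {
        // Subject.current() was added in JDK 18.
        current = Subject.class.getMethod("current");
      } catch (NoSuchMethodException olderJdk) {
        current = null;
      }
      if (current != null) {
        return (Subject) current.invoke(null);
      }
      // Fallback for JDKs without Subject.current():
      // Subject.getSubject(AccessController.getContext()), looked up
      // reflectively so no deprecated API is referenced at compile time.
      Class<?> controller = Class.forName("java.security.AccessController");
      Class<?> contextType = Class.forName("java.security.AccessControlContext");
      Object context = controller.getMethod("getContext").invoke(null);
      Method getSubject = Subject.class.getMethod("getSubject", contextType);
      return (Subject) getSubject.invoke(null, context);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Subject lookup failed", e);
    }
  }
}
```

   Code that calls `getSubject()` directly has no such fallback, which is why 
upgrading the dependency is the practical fix here.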
   
   ### Spotless upgrade (2.46.1 -> 3.5.1)
   
   Spotless 2.46.1 calls `com.sun.tools.javac.util.Log` methods that were 
removed in JDK 25, causing `NoSuchMethodError` during formatting. Spotless 
3.5.1 does not hit this error. The minor whitespace changes to comment 
indentation in switch/case blocks come from the new formatter version.
   
   ### Fix ByteBuffer leak in vectored I/O reads
   
   Hadoop's `readVectored()` API accepts an `IntFunction<ByteBuffer>` for 
allocation but has no corresponding release callback. When `ChecksumFileSystem` 
is in the path (the default for `LocalFileSystem`), Hadoop allocates buffers 
through the caller's allocator for internal checksum verification, then 
abandons them without release. The caller never sees these buffers -- only 
sliced views of the verified data are returned through the futures.
   
   This caused `TrackingByteBufferAllocator` (used in tests) to throw 
`LeakedByteBufferException` for 45+ tests using the vectored I/O path:
   - `TestRecordLevelFilters` (15 tests)
   - `TestColumnIndexFiltering` (24 tests)
   - `TestParquetReader` (6+ tests)
   
   **Fix:** Wrap the allocator in a capturing decorator that tracks every 
buffer allocated during `readVectored()`, then registers them all for release 
via `ByteBufferReleaser`. A `try-finally` ensures buffers are registered even 
if a read future times out or fails.
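   
   The idea can be sketched as follows. This is a simplified illustration, not 
the code in this PR: `CapturingAllocator`, `VectoredReadSketch`, and 
`readWithHiddenScratch` are hypothetical names, the read is a synchronous 
stand-in for `readVectored()`, and the `registerForRelease` callback stands in 
for handing buffers to `ByteBufferReleaser`.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntFunction;

/** Hypothetical decorator: records every buffer the wrapped allocator hands out. */
final class CapturingAllocator implements IntFunction<ByteBuffer> {
  private final IntFunction<ByteBuffer> delegate;
  private final List<ByteBuffer> captured = new ArrayList<>();

  CapturingAllocator(IntFunction<ByteBuffer> delegate) {
    this.delegate = delegate;
  }

  @Override
  public synchronized ByteBuffer apply(int capacity) {
    ByteBuffer buffer = delegate.apply(capacity);
    captured.add(buffer); // remember it even if the caller never sees it again
    return buffer;
  }

  /** Passes each captured buffer to the callback once, then forgets it. */
  synchronized void registerAll(Consumer<ByteBuffer> registerForRelease) {
    for (ByteBuffer buffer : captured) {
      registerForRelease.accept(buffer);
    }
    captured.clear();
  }
}

final class VectoredReadSketch {
  private VectoredReadSketch() {}

  /**
   * Stand-in for a reader that allocates an internal scratch buffer
   * (think: data plus checksum bytes) and returns only a slice of the data.
   */
  static ByteBuffer readWithHiddenScratch(IntFunction<ByteBuffer> allocator, int length) {
    ByteBuffer scratch = allocator.apply(length + 8); // never returned to the caller
    ByteBuffer data = allocator.apply(length);
    return data.slice();
  }

  /** Reads through a capturing allocator; the finally block runs on failure too. */
  static int readAndRegister(
      IntFunction<ByteBuffer> allocator, Consumer<ByteBuffer> registerForRelease, int length) {
    CapturingAllocator capturing = new CapturingAllocator(allocator);
    try {
      return readWithHiddenScratch(capturing, length).remaining();
    } finally {
      capturing.registerAll(registerForRelease);
    }
  }
}
```

   Because registration happens in `finally`, buffers allocated before an 
exception are still handed to the callback, including the scratch buffer the 
caller never receives.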
   
   This is a workaround for 
[HADOOP-19901](https://issues.apache.org/jira/browse/HADOOP-19901), which 
tracks the upstream bug.
   
   ### Testing
   
   - Full build passes: 4,988 tests, 0 failures, 0 errors (JDK 25, Hadoop 3.4.3)
   - Validated on JDK 17, 21, and 25
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

