Re: [PR] GH-3356: Add buffers allocated by vectored IO for releasing [parquet-java]

via GitHub Thu, 13 Nov 2025 22:47:56 -0800


gszadovszky commented on code in PR #3357:
URL: https://github.com/apache/parquet-java/pull/3357#discussion_r2525997787



##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java:
##########
@@ -1354,7 +1382,7 @@ private void readVectored(List<ConsecutivePartList> 
allParts, ChunkListBuilder b
     }
     LOG.debug("Reading {} bytes of data with vectored IO in {} ranges", 
totalSize, ranges.size());
     // Request a vectored read;
-    f.readVectored(ranges, options.getAllocator());
+    f.readVectored(ranges, new ReleasingAllocator(options.getAllocator(), 
builder.releaser));

Review Comment:
   I'm not sure I get the concept of this PR. `ByteBufferAllocator` actually 
has the contract of releasing the `ByteBuffer` allocated by it. The only thing 
we need to do is to invoke this method at the right time when the related 
buffer is not needed anymore.
   
   The `ByteBufferReleaser` concept came into the scope only to easily postpone 
the release invocation to the time we really can release the related 
`ByteBuffer`s. (By using `BytesInput` we may pass the related buffers around 
and it is not always clear when to release them.)
   
   @annimesh2809, I would suggest you to implement a unit test to reproduce the 
issue first. You may use `TrackingByteBufferAllocator` to fail if any allocated 
buffer is not released during the execution. You may find examples of its usage 
among the unit tests. If you find the issue, you'll need to ensure that the 
related allocated buffers are get back to their allocator to release them. You 
may use the existing patterns we already have or invent new ones if necessary.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] GH-3356: Add buffers allocated by vectored IO for releasing [parquet-java]

Reply via email to