Re: [PR] GH-3356: Add buffers allocated by vectored IO for releasing [parquet-java]

via GitHub Tue, 18 Nov 2025 00:28:02 -0800


gszadovszky commented on code in PR #3357:
URL: https://github.com/apache/parquet-java/pull/3357#discussion_r2536826920



##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java:
##########
@@ -1354,7 +1382,7 @@ private void readVectored(List<ConsecutivePartList> allParts, ChunkListBuilder b
     }
     LOG.debug("Reading {} bytes of data with vectored IO in {} ranges", totalSize, ranges.size());
     // Request a vectored read;
-    f.readVectored(ranges, options.getAllocator());
+    f.readVectored(ranges, new ReleasingAllocator(options.getAllocator(), builder.releaser));

Review Comment:
   Thanks for the context, @annimesh2809.
   Why do you need to track the allocated buffers for later release, instead of simply passing the `allocate` and `release` methods of the `ByteBufferAllocator` instance to the Hadoop API through the implementations of `SeekableInputStream.readVectored`? I assume the Hadoop code would release the allocated buffers as soon as they are no longer needed.
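
The tracking approach being asked about can be sketched roughly as below. This is a minimal, self-contained illustration and not the code from the PR: `Allocator`, `CountingAllocator`, `TrackingAllocator`, and `read` are hypothetical stand-ins, and `read` only models a reader that hands the filled buffers back to its caller. Whether the Hadoop implementations release buffers themselves is the open question in this thread and is not modeled here.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

public class VectoredReleaseSketch {

  /** Hypothetical stand-in for an allocator with allocate/release methods. */
  public interface Allocator {
    ByteBuffer allocate(int size);

    void release(ByteBuffer buffer);
  }

  /** Counts buffers that were allocated but not yet released. */
  public static final class CountingAllocator implements Allocator {
    public int outstanding = 0;

    @Override
    public ByteBuffer allocate(int size) {
      outstanding++;
      return ByteBuffer.allocate(size);
    }

    @Override
    public void release(ByteBuffer buffer) {
      outstanding--;
    }
  }

  /** Remembers every allocated buffer so the owner can release it later. */
  public static final class TrackingAllocator implements Allocator {
    private final Allocator delegate;
    private final List<ByteBuffer> tracked = new ArrayList<>();

    public TrackingAllocator(Allocator delegate) {
      this.delegate = delegate;
    }

    @Override
    public ByteBuffer allocate(int size) {
      ByteBuffer buffer = delegate.allocate(size);
      tracked.add(buffer);
      return buffer;
    }

    @Override
    public void release(ByteBuffer buffer) {
      // Identity comparison: ByteBuffer.equals compares contents, not references.
      if (tracked.removeIf(b -> b == buffer)) {
        delegate.release(buffer);
      }
    }

    /** Releases everything still tracked, for example when the reader is closed. */
    public void releaseAll() {
      for (ByteBuffer buffer : tracked) {
        delegate.release(buffer);
      }
      tracked.clear();
    }
  }

  /** Hypothetical vectored read: allocates one buffer per range and returns them. */
  public static List<ByteBuffer> read(int[] sizes, IntFunction<ByteBuffer> allocate) {
    List<ByteBuffer> results = new ArrayList<>();
    for (int size : sizes) {
      results.add(allocate.apply(size));
    }
    return results;
  }

  public static void main(String[] args) {
    CountingAllocator counting = new CountingAllocator();
    TrackingAllocator tracking = new TrackingAllocator(counting);

    read(new int[] {4, 8, 16}, tracking::allocate);
    System.out.println("outstanding after read: " + counting.outstanding);

    tracking.releaseAll();
    System.out.println("outstanding after releaseAll: " + counting.outstanding);
  }
}
```

In this sketch the reader never sees a release callback, so the only party that can return the buffers to the underlying allocator is whoever holds the `TrackingAllocator`. Passing `release` through to the reader would only remove the need for tracking if the reader is guaranteed to call it for every buffer it allocates.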


