Kontinuation commented on code in PR #5735:
URL: https://github.com/apache/iceberg/pull/5735#discussion_r967570968
##########
arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorizedArrowReader.java:
##########
@@ -210,6 +210,9 @@ public VectorHolder read(VectorHolder reuse, int numValsToRead) {
   }
   private void allocateFieldVector(boolean dictionaryEncodedVector) {
+    if (vec != null) {
+      vec.close();
+    }
Review Comment:
This fixes a memory leak that occurs when reading Parquet files whose
row groups interleave plain and dictionary-encoded pages.
`VectorizedArrowReader.read` allocates a new Arrow vector whenever the
page encoding changes within the row group currently being read. It did
not close the previously allocated vector before allocating the new one,
so the old vector's memory was never released.
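
To illustrate the pattern outside of Iceberg, here is a minimal,
dependency-free sketch. `TrackingAllocator`, `TrackedVector`, and `Reader`
are hypothetical stand-ins, not Arrow or Iceberg classes; they only count
live allocations so the effect of closing the previous vector before
reallocating is visible. The number of encoding changes (3) is an arbitrary
assumption.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CloseBeforeReallocate {

  /** Hypothetical stand-in for a buffer allocator: counts live vectors. */
  static final class TrackingAllocator {
    private final AtomicInteger live = new AtomicInteger();

    TrackedVector allocate() {
      live.incrementAndGet();
      return new TrackedVector(this);
    }

    int liveCount() {
      return live.get();
    }
  }

  /** Hypothetical stand-in for a field vector that must be closed. */
  static final class TrackedVector implements AutoCloseable {
    private final TrackingAllocator owner;
    private boolean closed;

    TrackedVector(TrackingAllocator owner) {
      this.owner = owner;
    }

    @Override
    public void close() {
      // Idempotent: only the first close releases the allocation.
      if (!closed) {
        closed = true;
        owner.live.decrementAndGet();
      }
    }
  }

  /** Hypothetical reader that reallocates its vector on each encoding change. */
  static final class Reader {
    private final TrackingAllocator allocator;
    private final boolean closePrevious;
    private TrackedVector vec;

    Reader(TrackingAllocator allocator, boolean closePrevious) {
      this.allocator = allocator;
      this.closePrevious = closePrevious;
    }

    void allocateFieldVector() {
      // The guard from the diff: release the old vector before replacing it.
      if (closePrevious && vec != null) {
        vec.close();
      }
      vec = allocator.allocate();
    }
  }

  /** Returns how many vectors are still live after the given number of reallocations. */
  static int liveAfterEncodingChanges(boolean closePrevious, int changes) {
    TrackingAllocator allocator = new TrackingAllocator();
    Reader reader = new Reader(allocator, closePrevious);
    for (int i = 0; i < changes; i++) {
      reader.allocateFieldVector();
    }
    return allocator.liveCount();
  }

  public static void main(String[] args) {
    System.out.println("without guard: " + liveAfterEncodingChanges(false, 3)); // 3
    System.out.println("with guard:    " + liveAfterEncodingChanges(true, 3));  // 1
  }
}
```

Without the guard, every reallocation strands the previous vector, so the
live count grows with the number of encoding changes. With the guard, only
the current vector remains live, and it is released when the reader itself
is closed.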
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]