[GitHub] [arrow] westonpace commented on issue #37630: [C++] Potential memory leak in Parquet reading with Dataset

via GitHub Fri, 08 Sep 2023 10:24:22 -0700


westonpace commented on issue #37630:
URL: https://github.com/apache/arrow/issues/37630#issuecomment-1712000301


   Note that FileFragment in the datasets API caches the Parquet metadata
(there is currently no option to disable this).  So if you are scanning many
files, you will see memory grow over the lifetime of the scan as more and more
metadata is cached.  I would expect that a second scan would not grow the
memory further.
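
To illustrate the expected memory pattern, here is a minimal toy model in plain Python. It is only a sketch: `ToyFragment` and `scan` are hypothetical names made up for this illustration, not Arrow's API or implementation, and the 1000-byte metadata size is an arbitrary assumption.

```python
# Toy model only: these names are hypothetical and are not part of Arrow.
class ToyFragment:
    """Stands in for a file fragment that keeps its metadata after first use."""

    def __init__(self, path, metadata_size):
        self.path = path
        self._metadata_size = metadata_size
        self._cached_metadata = None  # filled on first access, never evicted

    def ensure_metadata(self):
        if self._cached_metadata is None:
            # Pretend to parse the file footer; the bytes stand in for metadata.
            self._cached_metadata = bytes(self._metadata_size)
        return self._cached_metadata

    def cached_bytes(self):
        return 0 if self._cached_metadata is None else len(self._cached_metadata)


def scan(fragments):
    """Visit each fragment once; record total cached bytes after each file."""
    held = []
    for fragment in fragments:
        fragment.ensure_metadata()
        held.append(sum(f.cached_bytes() for f in fragments))
    return held


fragments = [ToyFragment(f"part-{i}.parquet", 1000) for i in range(4)]
first_scan = scan(fragments)   # grows as each file's metadata is cached
second_scan = scan(fragments)  # stays flat: everything is already cached
print(first_scan)
print(second_scan)
```

In this model the cached total rises steadily during the first scan and stays constant during the second, which is the pattern that distinguishes a bounded cache from a true leak.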


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
