[GitHub] [arrow] icexelloss commented on issue #37630: [C++] Potential memory leak in Parquet reading with Dataset

via GitHub Fri, 08 Sep 2023 10:46:40 -0700


icexelloss commented on issue #37630:
URL: https://github.com/apache/arrow/issues/37630#issuecomment-1712025049


   > Note that FileFragment in the datasets API caches the parquet metadata
   > (with no option to disable this at the moment). So if you are scanning many
   > files you will see memory grow over the lifetime of the scan as more and more
   > metadatas are cached. I would expect a second scan would not grow the memory.
   
   Thanks @westonpace, can you give a pointer to where that is happening?
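
   For illustration, the behaviour described in the quote can be modelled with a small toy sketch. This is not the Arrow implementation or API: `ToyFragment`, `_load_metadata`, and `cached_count` are hypothetical names, and the fake footer is an assumption made only to show the pattern. Retained metadata grows with the number of files touched during the first scan, then stays flat on a second scan.

```python
# Toy model only: these classes are hypothetical and are NOT the Arrow API.
# They mimic a fragment that keeps file metadata after the first read.

class ToyFragment:
    """Stands in for a dataset file fragment that caches footer metadata."""

    def __init__(self, path):
        self.path = path
        self._metadata = None  # populated on first scan, then retained

    def _load_metadata(self):
        # Pretend to parse a Parquet footer (assumed shape, for illustration).
        return {"path": self.path, "row_groups": list(range(100))}

    def scan(self):
        # The cache is never dropped, mirroring "no option to disable".
        if self._metadata is None:
            self._metadata = self._load_metadata()
        return len(self._metadata["row_groups"])


def cached_count(fragments):
    """Number of fragments currently holding cached metadata."""
    return sum(1 for f in fragments if f._metadata is not None)


fragments = [ToyFragment(f"part-{i}.parquet") for i in range(1000)]

before = cached_count(fragments)
for f in fragments:          # first scan: the cache fills file by file
    f.scan()
after_first = cached_count(fragments)
for f in fragments:          # second scan: nothing new is cached
    f.scan()
after_second = cached_count(fragments)

print(before, after_first, after_second)  # prints: 0 1000 1000
```

   In this model the retained objects live as long as the fragments themselves, so growth of this kind would look like a leak during one long scan even though it is bounded by the number of files.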


