timothydijamco commented on issue #45287: URL: https://github.com/apache/arrow/issues/45287#issuecomment-2619986163
Adding `physical_schema_.reset()` to the `ClearCachedMetadata()` method (from #45330) seems to reduce memory usage a bit further:

- Without `ClearCachedMetadata()`: 9.58GiB
- With `ClearCachedMetadata()` enabled but without `physical_schema_.reset()`: 2.81GiB (30% of original memory usage)
- With `ClearCachedMetadata()` enabled and with `physical_schema_.reset()`: 1.73GiB (18% of original memory usage)

(This is using a C++ repro that scans a dataset with 250 files, 10k columns, and 200-character-long column names, with one scan. I will share the new C++ code I used for generating this test data and getting these results tomorrow, after I clean it up a bit, in case it's useful.)
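For readers unfamiliar with the pattern, here is a minimal sketch of the idea in the comment above: a fragment-like object caches both file metadata and a physical schema behind `std::shared_ptr` members, and clearing only one of them leaves the other alive. `FragmentSketch`, `SchemaSketch`, `FileMetadataSketch`, `EnsureLoaded`, and the accessor names are hypothetical stand-ins, not Arrow's actual classes or the code from #45330; only `ClearCachedMetadata()` and `physical_schema_` are names taken from the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are NOT Arrow's real types.
struct SchemaSketch {
  std::vector<std::string> field_names;
};
struct FileMetadataSketch {
  std::vector<std::string> column_paths;
};

class FragmentSketch {
 public:
  // Populate both caches, roughly as a scan might on first access (assumption).
  void EnsureLoaded(int num_columns, int name_length) {
    assert(num_columns >= 0 && name_length >= 0);
    const auto n = static_cast<std::size_t>(num_columns);
    const std::string name(static_cast<std::size_t>(name_length), 'c');
    if (!metadata_) {
      metadata_ = std::make_shared<FileMetadataSketch>();
      metadata_->column_paths.assign(n, name);
    }
    if (!physical_schema_) {
      physical_schema_ = std::make_shared<SchemaSketch>();
      physical_schema_->field_names.assign(n, name);
    }
  }

  // Drop every cached object. With wide schemas and long column names,
  // the schema copy can be a large share of the retained memory.
  void ClearCachedMetadata() {
    metadata_.reset();
    physical_schema_.reset();  // the extra reset discussed in the comment
  }

  // Non-owning handle so callers can check whether the schema was freed.
  std::weak_ptr<SchemaSketch> physical_schema_observer() const {
    return physical_schema_;
  }

  bool has_cached_state() const {
    return metadata_ != nullptr || physical_schema_ != nullptr;
  }

 private:
  std::shared_ptr<FileMetadataSketch> metadata_;
  std::shared_ptr<SchemaSketch> physical_schema_;
};
```

In this sketch the schema is only freed if nothing else holds a reference to it; if another owner keeps the `shared_ptr` alive, `reset()` on the member alone will not release the memory.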
