alamb commented on issue #899: URL: https://github.com/apache/arrow-datafusion/issues/899#issuecomment-902231127
I was kind of imagining we would have to do something like manually registering memory allocations. The `malloc_size_of` trait is a cool idea. While it would likely be crazy complicated to do this for all allocations, I think all the built-in DataFusion operators use most of their memory in intermediate RecordBatches and potentially a single large structure (e.g. the hash tables in hash_join and hash_aggregate). If we captured these large sources I think that would get us most of the value.
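
To make the "manually register the big allocations" idea concrete, here is a rough sketch using only the standard library. Everything in it is hypothetical and for illustration only: `EstimateSize`, `MemoryTracker`, `Batch`, and `JoinTable` are made-up names, not DataFusion, Arrow, or `malloc_size_of` APIs, and the size estimates count element lengths rather than true allocator usage.

```rust
use std::collections::HashMap;
use std::mem::size_of;

/// Hypothetical trait: a structure reports a rough estimate of its heap usage.
trait EstimateSize {
    fn estimated_bytes(&self) -> usize;
}

/// Stand-in for an intermediate batch of columnar data (NOT Arrow's RecordBatch).
struct Batch {
    columns: Vec<Vec<i64>>,
}

impl EstimateSize for Batch {
    fn estimated_bytes(&self) -> usize {
        // Count only the column values; ignores Vec headers and spare capacity.
        self.columns.iter().map(|c| c.len() * size_of::<i64>()).sum()
    }
}

/// Stand-in for the single large structure of an operator, e.g. a join hash table.
struct JoinTable {
    map: HashMap<u64, Vec<usize>>,
}

impl EstimateSize for JoinTable {
    fn estimated_bytes(&self) -> usize {
        // Per-entry key + Vec header, plus the row indices stored in each Vec.
        let entries = self.map.len() * (size_of::<u64>() + size_of::<Vec<usize>>());
        let rows: usize = self.map.values().map(|v| v.len() * size_of::<usize>()).sum();
        entries + rows
    }
}

/// Hypothetical tracker that operators call by hand when they allocate or free.
#[derive(Default)]
struct MemoryTracker {
    used: usize,
    peak: usize,
}

impl MemoryTracker {
    fn grow(&mut self, bytes: usize) {
        self.used += bytes;
        self.peak = self.peak.max(self.used);
    }

    fn shrink(&mut self, bytes: usize) {
        self.used = self.used.saturating_sub(bytes);
    }
}

/// Registers one batch and one hash table, then releases the batch.
/// Returns (batch bytes, bytes still registered, peak bytes registered).
fn track_example() -> (usize, usize, usize) {
    let mut tracker = MemoryTracker::default();

    let batch = Batch { columns: vec![vec![0i64; 1000], vec![0i64; 1000]] };
    let batch_bytes = batch.estimated_bytes();
    tracker.grow(batch_bytes);

    let mut table = JoinTable { map: HashMap::new() };
    for row in 0..1000usize {
        table.map.entry((row % 10) as u64).or_default().push(row);
    }
    tracker.grow(table.estimated_bytes());

    // The batch is no longer needed once its rows are indexed in the table.
    drop(batch);
    tracker.shrink(batch_bytes);

    (batch_bytes, tracker.used, tracker.peak)
}
```

Calling `track_example()` returns the batch estimate, the bytes still registered after the batch is released (the hash table alone), and the peak. Only two kinds of structure are instrumented here, which is the point of the suggestion: covering the batches and the one big table per operator should account for most of the memory without touching every allocation.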
