Samyak2 commented on issue #22526: URL: https://github.com/apache/datafusion/issues/22526#issuecomment-4621718342
I have another reproducer and explanation of this issue here, in case it helps: https://github.com/apache/datafusion/issues/22757

The solution we used internally for this was the alternate idea of Option 2:

> Option 2. Fix memory accounting
>
> Another idea:
> extend this function to work across multiple batches and deduplicate the arrow buffers

I think this will be easier to do and would also be a more general solution: any operator that emits record batches with shared underlying buffers would be handled.

I can open a draft PR doing this in hash join.
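
To illustrate the accounting idea, here is a minimal sketch of cross-batch deduplication. It deliberately uses stand-in types (`Buffer` and `Batch` are hypothetical, built on `Arc<Vec<u8>>`) rather than the real Arrow or DataFusion APIs, and it is not the internal implementation mentioned above. The only point is that a buffer shared by several batches should be counted once, keyed by the identity of its allocation.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow buffer: a shared, reference-counted allocation.
type Buffer = Arc<Vec<u8>>;

/// Hypothetical stand-in for a record batch: just the buffers backing its columns.
struct Batch {
    buffers: Vec<Buffer>,
}

/// Naive accounting: every batch is charged for every buffer it references,
/// so a buffer shared by N batches is counted N times.
fn naive_size(batches: &[Batch]) -> usize {
    batches
        .iter()
        .flat_map(|batch| batch.buffers.iter())
        .map(|buf| buf.len())
        .sum()
}

/// Deduplicated accounting: each distinct allocation is counted once,
/// no matter how many batches (or columns) point at it.
fn deduplicated_size(batches: &[Batch]) -> usize {
    // The Arc's pointer identifies the underlying allocation.
    let mut seen: HashSet<*const Vec<u8>> = HashSet::new();
    let mut total = 0;
    for batch in batches {
        for buf in &batch.buffers {
            // `insert` returns true only the first time this allocation is seen.
            if seen.insert(Arc::as_ptr(buf)) {
                total += buf.len();
            }
        }
    }
    total
}

/// Two batches that share one 1024-byte buffer; the second also owns 16 bytes.
fn example_batches() -> Vec<Batch> {
    let shared: Buffer = Arc::new(vec![0u8; 1024]);
    let own: Buffer = Arc::new(vec![0u8; 16]);
    vec![
        Batch { buffers: vec![Arc::clone(&shared)] },
        Batch { buffers: vec![shared, own] },
    ]
}
```

With `example_batches()`, the naive sum charges the shared buffer twice, while the deduplicated sum charges it once. Because the `seen` set lives outside the per-batch loop, the same approach would cover any operator whose output batches share buffers, not only hash join.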
