alamb commented on issue #3941: URL: https://github.com/apache/arrow-datafusion/issues/3941#issuecomment-1293604612
I think we should consider reducing our hash implementation repetition prior to implementing spilling #2723.

In general I think @yjshen's algorithm for grouping in limited memory is 💯

I think we can implement it in stages, however. The first stage would track the current memory used and return an error when the limit is exceeded. Then, in the second stage, rather than erroring, we can implement the externalized / spilling strategy.
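To make the first stage concrete, here is a minimal sketch of a grouping loop that tracks memory and errors once a limit is exceeded. It is not DataFusion code: `MemoryBudget`, `ResourcesExhausted`, `group_cost`, and `count_by_key` are hypothetical names, and the per-group cost is a rough assumed estimate (key bytes plus one `u64` accumulator), not a real measurement of hash table memory.

```rust
use std::collections::HashMap;

/// Hypothetical error returned when a reservation would exceed the limit.
#[derive(Debug, PartialEq)]
struct ResourcesExhausted {
    requested: usize,
    used: usize,
    limit: usize,
}

/// Hypothetical tracker for the memory used by one grouping operator.
struct MemoryBudget {
    used: usize,
    limit: usize,
}

impl MemoryBudget {
    fn new(limit: usize) -> Self {
        MemoryBudget { used: 0, limit }
    }

    /// Reserve `bytes` more, or fail without changing `used`.
    fn try_grow(&mut self, bytes: usize) -> Result<(), ResourcesExhausted> {
        let new_used = self.used.saturating_add(bytes);
        if new_used > self.limit {
            return Err(ResourcesExhausted {
                requested: bytes,
                used: self.used,
                limit: self.limit,
            });
        }
        self.used = new_used;
        Ok(())
    }
}

/// Assumed cost of one new group: key bytes plus one u64 accumulator.
fn group_cost(key: &str) -> usize {
    key.len() + std::mem::size_of::<u64>()
}

/// Stage one: COUNT(*) GROUP BY key, erroring when the budget is exceeded.
fn count_by_key(
    keys: &[&str],
    budget: &mut MemoryBudget,
) -> Result<HashMap<String, u64>, ResourcesExhausted> {
    let mut groups: HashMap<String, u64> = HashMap::new();
    for &key in keys {
        // Existing groups update in place and need no new reservation.
        if let Some(count) = groups.get_mut(key) {
            *count += 1;
            continue;
        }
        // A new group must be reserved first; stage two would spill here
        // instead of propagating the error.
        budget.try_grow(group_cost(key))?;
        groups.insert(key.to_string(), 1);
    }
    Ok(groups)
}
```

In this sketch the only place that can fail is the `try_grow` call for a new group, so a second stage could replace that single error path with a spill of the current groups without touching the rest of the loop.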
