comphead commented on issue #9359: URL: https://github.com/apache/arrow-datafusion/issues/9359#issuecomment-2032482285
After reading some of the code and the issues already opened on this topic, it is probably possible to summarize what is needed, at least for a POC:
- Try to use the external memory sorter to sort the streamed and buffered batches for the sort phases.
- MemoryReservation is already in the SMJ implementation, and the merge phase calls try_grow on the buffered side, so a naive approach is to spill the buffered batches and read them back from disk in smaller chunks. We can experiment with that.
- Run a test query and profile memory to see other places where spilling could be useful.
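
To make the naive approach in the second point concrete, here is a rough, std-only sketch. It does not use DataFusion's actual types: `ToyReservation`, `BufferedSide`, `push_batch` and `for_each_chunk` are hypothetical names invented for illustration, rows are plain `i64` values instead of record batches, and the spill file format is just little-endian bytes. The only idea borrowed from the comment is that a failed `try_grow` triggers a spill and that the data is later read back in smaller chunks.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, BufReader, BufWriter, Read, Write};
use std::path::PathBuf;

/// Toy stand-in for a memory reservation (hypothetical, NOT DataFusion's type).
struct ToyReservation {
    used: usize,
    limit: usize,
}

impl ToyReservation {
    /// Reserve `bytes` more, or fail if that would exceed the limit.
    fn try_grow(&mut self, bytes: usize) -> Result<(), String> {
        if self.used + bytes > self.limit {
            return Err("memory limit exceeded".to_string());
        }
        self.used += bytes;
        Ok(())
    }

    /// Release everything that was reserved.
    fn free(&mut self) {
        self.used = 0;
    }
}

/// Buffered side of the join: rows kept in memory plus an append-only spill file.
struct BufferedSide {
    reservation: ToyReservation,
    in_memory: Vec<i64>,
    spill_path: PathBuf,
    spilled_rows: usize,
}

impl BufferedSide {
    fn new(limit_bytes: usize, spill_path: PathBuf) -> Self {
        // Start from a clean spill file in case one is left over.
        let _ = std::fs::remove_file(&spill_path);
        Self {
            reservation: ToyReservation { used: 0, limit: limit_bytes },
            in_memory: Vec::new(),
            spill_path,
            spilled_rows: 0,
        }
    }

    /// Buffer a batch; if the reservation cannot grow, spill instead.
    fn push_batch(&mut self, batch: &[i64]) -> io::Result<()> {
        let bytes = batch.len() * std::mem::size_of::<i64>();
        if self.reservation.try_grow(bytes).is_ok() {
            self.in_memory.extend_from_slice(batch);
            return Ok(());
        }
        // Spill the rows held so far, then the new batch, preserving order.
        let file = OpenOptions::new().create(true).append(true).open(&self.spill_path)?;
        let mut writer = BufWriter::new(file);
        let rows = self.in_memory.len() + batch.len();
        for v in self.in_memory.drain(..).chain(batch.iter().copied()) {
            writer.write_all(&v.to_le_bytes())?;
        }
        writer.flush()?;
        self.spilled_rows += rows;
        self.reservation.free();
        Ok(())
    }

    /// Visit all rows in insertion order, at most `chunk_rows` rows at a time.
    fn for_each_chunk(&self, chunk_rows: usize, mut f: impl FnMut(&[i64])) -> io::Result<()> {
        let chunk_rows = chunk_rows.max(1);
        let mut chunk = Vec::with_capacity(chunk_rows);
        if self.spilled_rows > 0 {
            let mut reader = BufReader::new(File::open(&self.spill_path)?);
            let mut buf = [0u8; 8];
            for _ in 0..self.spilled_rows {
                reader.read_exact(&mut buf)?;
                chunk.push(i64::from_le_bytes(buf));
                if chunk.len() == chunk_rows {
                    f(&chunk);
                    chunk.clear();
                }
            }
        }
        if !chunk.is_empty() {
            f(&chunk);
        }
        // Rows still in memory were pushed after the last spill, so they come last.
        for c in self.in_memory.chunks(chunk_rows) {
            f(c);
        }
        Ok(())
    }
}
```

With a limit of 32 bytes, pushing `[1, 2, 3]` fits in memory, pushing `[4, 5, 6]` fails the reservation and spills all six rows, and a later `[7, 8]` stays in memory; `for_each_chunk(4, ...)` then yields the rows in their original order without ever holding more than four at once. A real implementation would presumably spill record batches rather than raw integers and would need to decide which buffered batches are still needed by the current join key, which this sketch ignores.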
