alamb commented on issue #6937:
URL: https://github.com/apache/arrow-datafusion/issues/6937#issuecomment-1634271850
> it also seems hash partitioning is not really fast ATM)

Yes -- here are some ideas to improve things:
1. Reuse the hash values (already computed for the partial group bys)
2. Avoid the extra copy with `CoalesceBatches` -- today we `take` in repartition (which copies the data), and then `CoalesceBatches` concatenates the batches again with another copy
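To make idea 1 concrete, here is a minimal, std-only sketch of hashing each key once and then reusing those hashes to choose output partitions. It is only an illustration: `hash_keys` and `partition_indices` are hypothetical names, not DataFusion APIs, and the real operators work on Arrow arrays rather than string slices.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash every key exactly once. Assumption: this stands in for the hashes
/// that the partial group by has already computed.
fn hash_keys(keys: &[&str]) -> Vec<u64> {
    keys.iter()
        .map(|k| {
            let mut hasher = DefaultHasher::new();
            k.hash(&mut hasher);
            hasher.finish()
        })
        .collect()
}

/// Assign row indices to output partitions from precomputed hashes, so the
/// keys are not hashed a second time during repartitioning.
fn partition_indices(hashes: &[u64], num_partitions: usize) -> Vec<Vec<usize>> {
    assert!(num_partitions > 0, "need at least one output partition");
    let mut partitions: Vec<Vec<usize>> = vec![Vec::new(); num_partitions];
    for (row, hash) in hashes.iter().enumerate() {
        let target = (*hash % num_partitions as u64) as usize;
        partitions[target].push(row);
    }
    partitions
}
```

Calling `partition_indices(&hash_keys(&keys), n)` sends rows with equal keys to the same partition while hashing each key only once. Whether the same hash values can safely serve both the group-by hash table and the partitioner is a design question this sketch does not settle.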
