2010YOUY01 commented on PR #16734: URL: https://github.com/apache/datafusion/pull/16734#issuecomment-3066735438
I have an alternative idea that might be simpler to implement. We already implemented MemTable repartitioning in https://github.com/apache/datafusion/pull/15409. It does not help with this issue because it currently only handles the case where we have 1 partition with many small batches, which it redistributes evenly. It won't slice a huge batch into smaller ones. See https://github.com/apache/datafusion/blob/4dd78255f081083adb79c948613986f119c4821a/datafusion/datasource/src/memory.rs#L556-L558

Maybe we can extend the existing solution to achieve this goal 🤔
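To make the suggestion concrete, here is a rough sketch of the slicing step, not the actual code in `memory.rs`. It only computes the `(offset, length)` ranges that a huge batch would be cut into, and then spreads them over partitions. The names `slice_ranges` and `distribute` and the round-robin assignment are assumptions made for illustration; in a real change each range would presumably be passed to Arrow's `RecordBatch::slice(offset, length)`, which is zero-copy.

```rust
/// Hypothetical helper (not a DataFusion API): split a batch of `num_rows`
/// rows into `(offset, length)` ranges holding at most `max_rows` rows each.
fn slice_ranges(num_rows: usize, max_rows: usize) -> Vec<(usize, usize)> {
    assert!(max_rows > 0, "max_rows must be positive");
    // Ceiling division gives the number of slices we will produce.
    let mut ranges = Vec::with_capacity((num_rows + max_rows - 1) / max_rows);
    let mut offset = 0;
    while offset < num_rows {
        // The last slice may be shorter than `max_rows`.
        let len = max_rows.min(num_rows - offset);
        ranges.push((offset, len));
        offset += len;
    }
    ranges
}

/// Hypothetical helper: assign items to `partitions` output partitions in
/// round-robin order. The existing repartition logic may balance differently.
fn distribute<T>(items: Vec<T>, partitions: usize) -> Vec<Vec<T>> {
    assert!(partitions > 0, "need at least one partition");
    let mut out: Vec<Vec<T>> = (0..partitions).map(|_| Vec::new()).collect();
    for (i, item) in items.into_iter().enumerate() {
        out[i % partitions].push(item);
    }
    out
}
```

With this shape, the existing "many small batches" path could stay as it is, and a huge batch would first be turned into small slices before being redistributed.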