alamb commented on PR #15610: URL: https://github.com/apache/datafusion/pull/15610#issuecomment-2807165186
> Benchmark results: (I think there is no significant regression for an extra round of re-spill, if it's running on a machine with fast SSDs)

It seems to me that there is a 30% regression in performance compared to main when there is enough memory, right?

> #### Result
> Main (1.2G):
> Q7 avg time: 8680.47 ms
>
> PR (1.2G):
> Q7 avg time: 11808.71 ms

But this PR is significantly better in that it can complete with only 500M of memory.

Is there any way to regain the performance (maybe by choosing how many merge phases to do based on available memory rather than a fixed size)?
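To make the idea of picking the number of merge phases from available memory concrete, here is a minimal sketch. It is not DataFusion code: `MergePlan`, `plan_merge`, the per-stream buffer size, and the spill-file count are all hypothetical and chosen only for illustration. It assumes each open spill stream needs a roughly fixed amount of buffer memory, derives the merge fan-in from the budget, and counts how many passes that fan-in implies.

```rust
/// Hypothetical plan for a multi-pass spill merge (illustrative only,
/// not part of DataFusion's API).
#[derive(Debug, PartialEq, Eq)]
pub struct MergePlan {
    /// How many sorted runs are merged at once.
    pub fan_in: usize,
    /// Total merge passes; 1 means a single final merge with no re-spill.
    pub passes: usize,
}

/// Choose the merge fan-in from the memory budget instead of a fixed size.
///
/// Assumption: every open spill stream needs about `bytes_per_stream`
/// of buffer memory while it is being merged.
pub fn plan_merge(
    num_spill_files: usize,
    memory_budget_bytes: usize,
    bytes_per_stream: usize,
) -> MergePlan {
    // Zero or one sorted run needs no merging at all.
    if num_spill_files <= 1 {
        return MergePlan { fan_in: num_spill_files, passes: 0 };
    }

    // Number of streams the budget can hold open at the same time.
    let affordable = memory_budget_bytes / bytes_per_stream.max(1);

    // A merge needs at least two inputs and never more than the run count.
    let fan_in = affordable.clamp(2, num_spill_files);

    // Each pass merges groups of `fan_in` runs into one run.
    let mut runs = num_spill_files;
    let mut passes = 0;
    while runs > 1 {
        runs = (runs + fan_in - 1) / fan_in; // ceiling division
        passes += 1;
    }

    MergePlan { fan_in, passes }
}
```

Under these assumed numbers (64 spill files, 10 MiB per stream), a 1200 MiB budget yields a fan-in of 64 and a single pass, so no extra re-spill round is paid when memory is plentiful, while a 500 MiB budget yields a fan-in of 50 and two passes. Whether this would recover the Q7 time reported above is not something the sketch can show; it only illustrates the shape of the suggestion.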