berkaysynnada commented on PR #11983: URL: https://github.com/apache/datafusion/pull/11983#issuecomment-2291031872
The issue described in https://github.com/apache/datafusion/issues/9792 was resolved by https://github.com/apache/datafusion/pull/11652. That fix addresses the case where a `Limit -> ... -> CoalesceBatches` pattern forces the query to wait for the coalescer buffer to fill. The approach was to push the limit (fetch + skip) down into `CoalesceBatches` and to remove the limit once it was no longer needed.

https://github.com/apache/datafusion/pull/12003 appears to address additional corner cases: it pushes limits as far down the execution plan as possible and removes any redundant limits.

These recent improvements seem to already achieve the objective you're aiming for, without the need to define a constant threshold. I think there is no difference between using a limit without coalescing and using a coalesce that can handle limits internally.

I am curious about your thoughts. Do you still see a need for additional optimization? If so, could you provide an example scenario or a test case that would help us discuss this further?
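
To make the "coalesce that can handle limits internally" idea concrete, here is a minimal, self-contained sketch. It is not DataFusion's actual `CoalesceBatches` implementation: `LimitedCoalescer`, `Batch`, and all method names are hypothetical, a batch is modeled as a plain `Vec<i64>`, and only `fetch` is handled (`skip` is left out). It only illustrates why a fetch-aware coalescer does not need to wait for its buffer to fill.

```rust
/// Toy stand-in for a record batch: just a vector of row values.
/// (Assumption for illustration; DataFusion uses Arrow `RecordBatch`.)
type Batch = Vec<i64>;

/// Hypothetical limit-aware coalescer, not DataFusion's real operator.
struct LimitedCoalescer {
    /// Emit an output batch once this many rows are buffered.
    target_batch_size: usize,
    /// Optional maximum number of rows to produce in total.
    fetch: Option<usize>,
    /// Rows buffered but not yet emitted.
    buffer: Vec<i64>,
    /// Rows accepted so far (buffered plus already emitted).
    rows_seen: usize,
}

impl LimitedCoalescer {
    fn new(target_batch_size: usize, fetch: Option<usize>) -> Self {
        Self { target_batch_size, fetch, buffer: Vec::new(), rows_seen: 0 }
    }

    /// True once the fetch limit (if any) has been satisfied.
    fn limit_reached(&self) -> bool {
        self.fetch.map_or(false, |f| self.rows_seen >= f)
    }

    /// Accept an input batch. Returns an output batch when either the
    /// buffer reached the target size or the fetch limit was hit.
    fn push(&mut self, batch: Batch) -> Option<Batch> {
        if self.limit_reached() {
            // Nothing more is needed; the caller can stop polling its input.
            return None;
        }
        // Take only as many rows as the limit still allows.
        let remaining = self.fetch.map_or(batch.len(), |f| f - self.rows_seen);
        let take = batch.len().min(remaining);
        self.buffer.extend_from_slice(&batch[..take]);
        self.rows_seen += take;

        if self.buffer.len() >= self.target_batch_size || self.limit_reached() {
            // Flush early when the limit is hit instead of waiting for a full buffer.
            Some(std::mem::take(&mut self.buffer))
        } else {
            None
        }
    }

    /// Flush whatever is left when the input is exhausted.
    fn finish(&mut self) -> Option<Batch> {
        if self.buffer.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.buffer))
        }
    }
}
```

With `LimitedCoalescer::new(8192, Some(5))`, pushing a 3-row batch and then a 4-row batch yields a 5-row output on the second push, even though the buffer is far below the target size. Without a fetch, the same coalescer keeps buffering until `target_batch_size` is reached or `finish` is called.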