berkaysynnada commented on PR #11983:
URL: https://github.com/apache/datafusion/pull/11983#issuecomment-2291031872

   The issue explained in https://github.com/apache/datafusion/issues/9792 was 
resolved with the implementation of 
https://github.com/apache/datafusion/pull/11652. This fix handles the problem 
related to waiting for the coalescer buffer to fill when a `Limit -> ... -> 
CoalesceBatches` pattern exists. The approach was to push down the limit (fetch 
+ skip) into `CoalesceBatches` and eliminate the limit when it was no longer 
needed.
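
   To illustrate the idea, here is a minimal, self-contained sketch. It is not the actual DataFusion implementation: `ToyCoalescer`, its fields, and its methods are hypothetical names, and a "batch" is just a `Vec<i32>` of rows. It only shows why a coalescer that knows the fetch value can flush as soon as the limit is satisfied instead of waiting for its buffer to fill.

```rust
/// Toy stand-in for a batch coalescer (hypothetical, not DataFusion's API).
struct ToyCoalescer {
    /// Rows to accumulate before emitting an output batch.
    target_batch_size: usize,
    /// Limit pushed down into the coalescer, if any.
    fetch: Option<usize>,
    /// Rows buffered but not yet emitted.
    buffered: Vec<i32>,
    /// Total rows accepted so far (never exceeds `fetch`).
    accepted_rows: usize,
}

impl ToyCoalescer {
    fn new(target_batch_size: usize, fetch: Option<usize>) -> Self {
        Self { target_batch_size, fetch, buffered: Vec::new(), accepted_rows: 0 }
    }

    /// True once the pushed-down limit has been satisfied.
    fn is_done(&self) -> bool {
        self.fetch.map_or(false, |f| self.accepted_rows >= f)
    }

    /// Buffers an input batch and returns an output batch when one is ready.
    fn push(&mut self, batch: Vec<i32>) -> Option<Vec<i32>> {
        if self.is_done() {
            return None;
        }
        // Accept only as many rows as the limit still allows.
        let take = match self.fetch {
            Some(f) => batch.len().min(f - self.accepted_rows),
            None => batch.len(),
        };
        self.buffered.extend_from_slice(&batch[..take]);
        self.accepted_rows += take;

        // Flush when the buffer is full, or early when the limit is reached.
        if self.is_done() || self.buffered.len() >= self.target_batch_size {
            Some(std::mem::take(&mut self.buffered))
        } else {
            None
        }
    }
}
```

   With `ToyCoalescer::new(100, Some(3))`, pushing `vec![1, 2]` and then `vec![3, 4]` emits `[1, 2, 3]` on the second push, even though the buffer is far below the target of 100 rows. With `fetch` set to `None`, the same two pushes emit nothing, which is the waiting behavior described in the issue.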
   
   With https://github.com/apache/datafusion/pull/12003, it appears that 
additional corner cases are being addressed. It further refines the process by 
pushing limits as far down the execution plan as possible and removing any 
redundant limits.
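
   As a rough sketch of what pushing a limit down and removing redundant limits means, assume a toy plan tree with three node types. `ToyPlan` and `push_down_limits` are hypothetical names for illustration and do not reflect the real optimizer rule, which handles many more operators as well as skip. The sketch only covers a limit sitting directly on top of another limit or a coalescer.

```rust
/// Toy execution plan (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum ToyPlan {
    Scan,
    Coalesce { fetch: Option<usize>, input: Box<ToyPlan> },
    Limit { fetch: usize, input: Box<ToyPlan> },
}

/// Pushes limits into coalescers and merges stacked limits, bottom-up.
fn push_down_limits(plan: ToyPlan) -> ToyPlan {
    match plan {
        ToyPlan::Limit { fetch, input } => match push_down_limits(*input) {
            // Two stacked limits: only the tighter one matters.
            ToyPlan::Limit { fetch: inner, input } => ToyPlan::Limit {
                fetch: fetch.min(inner),
                input,
            },
            // The coalescer absorbs the limit, so the Limit node is dropped.
            ToyPlan::Coalesce { fetch: inner, input } => ToyPlan::Coalesce {
                fetch: Some(inner.map_or(fetch, |f| f.min(fetch))),
                input,
            },
            // Nothing below can absorb the limit: keep it in place.
            other => ToyPlan::Limit { fetch, input: Box::new(other) },
        },
        ToyPlan::Coalesce { fetch, input } => ToyPlan::Coalesce {
            fetch,
            input: Box::new(push_down_limits(*input)),
        },
        ToyPlan::Scan => ToyPlan::Scan,
    }
}
```

   In this toy model, `Limit(10) -> Limit(3) -> Coalesce -> Scan` collapses to a single `Coalesce` node with `fetch: Some(3)` over `Scan`, with both limit nodes removed.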
   
   It seems that these recent improvements already address the objective you're 
aiming for, without the need to define a constant threshold. I think there is 
no difference between using a limit without coalescing and using a coalescer 
that can handle limits internally.
   
   I am curious about your thoughts. Do you still see a need for additional 
optimization? If so, could you provide an example scenario or a test case that 
would help us discuss this further?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

