Re: [PR] feat: optimize CoalesceBatches in limit [datafusion]

via GitHub Thu, 15 Aug 2024 03:34:38 -0700


acking-you commented on PR #11983:
URL: https://github.com/apache/datafusion/pull/11983#issuecomment-2291044316


   > The issue explained in #9792 was resolved with the implementation of 
#11652. This fix handles the problem related to waiting for the coalescer 
buffer to fill when a `Limit -> ... -> CoalesceBatches` pattern exists. The 
approach was to push down the limit (fetch + skip) into `CoalesceBatches` and 
eliminate the limit when it was no longer needed.
   > 
   > With #12003, it appears that additional corner cases are being addressed. 
It further refines the process by pushing limits as far down the execution plan 
as possible and removing any redundant limits.
   > 
   > It seems that these recent improvements already address the objective 
you're aiming for, without the need to define a constant threshold. I think 
there is no difference between using a limit without coalescing and using a 
coalesce that can internally handle limits.
   > 
   > I am curious about your thoughts. Do you still see a need for additional 
optimization? If so, could you provide an example scenario or a test case that 
would help us discuss this further?
   
   Thanks for providing the background on this optimization. I looked into the 
issues you mentioned and it seems they've been resolved exactly as I hoped. 
Great job! I'll reference the information you compiled in [my 
issue](https://github.com/apache/datafusion/issues/11980).
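
   For readers following along, here is a minimal sketch of the idea described 
in the quoted comment: a coalescer that is given a fetch limit can emit its 
buffer as soon as the limit is reached, rather than waiting for the buffer to 
fill. This is not DataFusion's actual implementation. `Batch`, 
`LimitedCoalescer`, and `push` are hypothetical names, a batch is modelled as 
a plain `Vec<i32>`, and skip handling is left out.

```rust
/// Hypothetical stand-in for a record batch: just a vector of row values.
type Batch = Vec<i32>;

/// Toy coalescer (illustrative only, not the DataFusion API).
/// It buffers small input batches until `target_batch_size` rows are
/// available, and stops early once `fetch` rows have been seen in total.
struct LimitedCoalescer {
    target_batch_size: usize,
    fetch: Option<usize>,
    buffer: Batch,
    rows_seen: usize,
}

impl LimitedCoalescer {
    fn new(target_batch_size: usize, fetch: Option<usize>) -> Self {
        Self {
            target_batch_size,
            fetch,
            buffer: Vec::new(),
            rows_seen: 0,
        }
    }

    /// Push one input batch.
    /// Returns `(output, done)`: `output` is `Some(batch)` when a coalesced
    /// batch is ready, and `done` is `true` once the fetch limit is reached.
    fn push(&mut self, mut batch: Batch) -> (Option<Batch>, bool) {
        if let Some(fetch) = self.fetch {
            // `rows_seen` never exceeds `fetch`, so this cannot underflow.
            let remaining = fetch - self.rows_seen;
            if batch.len() >= remaining {
                // Limit reached: keep only the rows still needed and flush
                // immediately instead of waiting for the buffer to fill.
                batch.truncate(remaining);
                self.rows_seen += batch.len();
                self.buffer.extend(batch);
                return (Some(std::mem::take(&mut self.buffer)), true);
            }
        }
        self.rows_seen += batch.len();
        self.buffer.extend(batch);
        if self.buffer.len() >= self.target_batch_size {
            // Normal coalescing path: the buffer is large enough to emit.
            (Some(std::mem::take(&mut self.buffer)), false)
        } else {
            (None, false)
        }
    }
}
```

   With a target batch size of 8192 and a fetch of 5, pushing a 3-row batch 
and then a 4-row batch yields a single 5-row output on the second push, with 
`done` set. Without the fetch, the same input would stay buffered until 8192 
rows had arrived.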


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


