Re: [PR] perf: Implement physical execution of uncorrelated scalar subqueries [datafusion]

via GitHub Thu, 02 Apr 2026 06:42:54 -0700


neilconway commented on PR #21240:
URL: https://github.com/apache/datafusion/pull/21240#issuecomment-4177985136


   > But I think this is not what we want ideally - we want to run as few 
independent pipelines as possible, and get (data) parallelism from the 
individual pipelines rather than executing all at the same time.
   
   I don't disagree 😊 But for the purposes of this PR, we will regress 
performance on some benchmark queries if we don't do some additional work to 
get the same degree of overlapping that the cross-join path gets today. Is that 
something we're okay with?
   
   I don't think the additional complexity to overlap subquery evaluation with 
main query evaluation is too bad (via `WaitForSubqueryExec`), but if we're 
going to land morsel-driven parallelism soon-ish (🎉🎉🎉), maybe that will solve 
this problem in a cleaner / more general way and we can keep the subquery eval 
stuff simpler. Let me know what you think @Dandandan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

