[PR] perf: Implement physical execution of uncorrelated scalar subqueries [datafusion]

via GitHub Sun, 29 Mar 2026 11:57:35 -0700


neilconway opened a new pull request, #21240:
URL: https://github.com/apache/datafusion/pull/21240


   ## Which issue does this PR close?
   
   - Closes #3781.
   - Closes #18181.
   
   ## Rationale for this change
   
   Previously, DataFusion evaluated uncorrelated scalar subqueries by 
transforming them into joins. This has two shortcomings:
   
   1. Scalar subqueries that return > 1 row were allowed, producing incorrect 
query results. Such queries should instead result in a runtime error.
   2. Performance. Evaluating scalar subqueries as a join requires going 
through the join machinery. More importantly, it means that UDFs that have 
special-cases for scalar inputs cannot use those code paths for scalar 
subqueries, which often results in significantly slower query execution.
   
   This PR introduces physical execution of uncorrelated scalar subqueries:
   
   * Uncorrelated subqueries are left in the plan by the logical planner, not 
rewritten into joins
   * The physical planner collects uncorrelated scalar subqueries, and plans 
them recursively (supporting nested subqueries). We add a `ScalarSubqueryExec` 
plan node to the top of any physical plan with uncorrelated subqueries: it has 
n+1 children, n subqueries and its "main" input, which is the rest of the query 
plan. The subquery expression is replaced with a `ScalarSubqueryExpr`.
   * `ScalarSubqueryExec` manages the execution of the subqueries and stashes 
the result in a shared "results container", which is an 
`Arc<Vec<OnceLock<ScalarValue>>>`. At present, subquery evaluation is done 
sequentially and not overlapped with evaluation of the parent query.
   * When `ScalarSubqueryExpr` is evaluated, it fetches the result of the 
subquery from the result container.
   
   This architecture makes it easy to avoid the two shortcomings described 
above. Performance seems roughly unchanged (benchmarks added in this PR), but 
in situations like #18181, we can now leverage scalar fast-paths; in the case 
of #18181, this improves performance from ~800 ms to ~30 ms.
   
   ## What changes are included in this PR?
   
   * Benchmarks
   * Modify subquery rewriter to not transform subqueries -> joins
   * Collect and plan uncorrelated scalar subqueries in the physical planner, 
and wire up `ScalarSubqueryExpr`
   * Support for proto serialization/deserialization using 
`PhysicalProtoConverterExtension` to wire up `ScalarSubqueryExpr` correctly
   * Add various SLT tests and update expected plan shapes for some tests
   
   ## Are these changes tested?
   
   Yes.
   
   ## Are there any user-facing changes?
   
   Not really. Scalar subqueries that returned > 1 row will now be rejected 
instead of producing incorrect query results.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[PR] perf: Implement physical execution of uncorrelated scalar subqueries [datafusion]

Reply via email to