[GitHub] [spark] allisonwang-db opened a new pull request #32111: [SPARK-28379][SQL] Allow non-aggregated single row correlated scalar subquery

GitBox Fri, 09 Apr 2021 11:23:32 -0700


allisonwang-db opened a new pull request #32111:
URL: https://github.com/apache/spark/pull/32111



   ### What changes were proposed in this pull request?
   This PR allows a non-aggregated correlated scalar subquery when the 
subquery's maximum number of output rows is at most 1. Correlated scalar 
subqueries currently need to be aggregated because they are decorrelated and 
rewritten as LEFT OUTER joins. If the correlated scalar subquery produces more 
than one output row for an outer row, the join duplicates that outer row and 
the rewrite yields wrong results. 
   
   But this constraint can be relaxed when the maximum number of output rows 
of the subquery plan is less than or equal to 1. 
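   
   To illustrate why the row bound matters, here is a minimal pure-Python sketch 
of a LEFT OUTER join. It is not Spark code: `left_outer_join` and the sample 
tables are hypothetical and only mimic the shape of the rewrite described above.

```python
from collections import defaultdict

def left_outer_join(outer, inner, key):
    """Pair each outer row with every inner row sharing `key`; pad with None if unmatched."""
    by_key = defaultdict(list)
    for row in inner:
        by_key[row[key]].append(row)
    result = []
    for o in outer:
        # An outer row with no match still produces one output row (value None).
        matches = by_key.get(o[key]) or [None]
        for m in matches:
            result.append((o["id"], m["v"] if m is not None else None))
    return result

# Hypothetical outer table and two candidate subquery outputs.
outer = [{"id": 1, "k": "a"}, {"id": 2, "k": "b"}, {"id": 3, "k": "c"}]
single = [{"k": "a", "v": 10}, {"k": "b", "v": 20}]                    # at most one row per key
multi = [{"k": "a", "v": 10}, {"k": "a", "v": 11}, {"k": "b", "v": 20}]  # two rows for key "a"

single_result = left_outer_join(outer, single, "k")
multi_result = left_outer_join(outer, multi, "k")

print(single_result)  # one output row per outer row
print(multi_result)   # outer row 1 appears twice
```

   With at most one matching row, the join keeps exactly one output row per 
outer row, which is what a scalar subquery requires. With two matches for key 
`a`, outer row 1 is duplicated, so the rewritten plan would return an extra row.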
   
   ### Why are the changes needed?
   To relax a constraint in CheckAnalysis for the correlated scalar subquery.
   
   ### Does this PR introduce _any_ user-facing change?
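   Yes. A non-aggregated correlated scalar subquery whose maximum number of 
output rows is at most 1 is now allowed instead of being rejected in 
CheckAnalysis.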
   
   ### How was this patch tested?
   Unit tests
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



