dtenedor commented on code in PR #38052:
URL: https://github.com/apache/spark/pull/38052#discussion_r985959293


##########
sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala:
##########
@@ -4513,6 +4513,16 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
       }
     }
   }
+
+  test("SPARK-40618: Regression test for merging subquery bug with nested subqueries") {
+    // This test contains a subquery expression with another subquery expression nested inside.
+    // It acts as a regression test to ensure that the MergeScalarSubqueries rule does not attempt
+    // to merge them together.
+    withTable("t") {
+      sql("create table t(col int) using csv")
+      checkAnswer(sql("select(select sum((select sum(col) from t)) from t)"), Row(null))

Review Comment:
   Sounds good, done.



##########
sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala:
##########
@@ -2157,7 +2157,7 @@ class SubquerySuite extends QueryTest
     }
   }
 
-  test("Merge non-correlated scalar subqueries from different parent plans") {

Review Comment:
   @peter-toth Yeah, the bug is in `tryMergePlans`: it tries to merge a 
subquery plan with another plan that contains the original subquery plan 
inside, e.g. the regression test `select(select sum((select sum(col) from t)) 
from t)`. I tried to handle that case by keeping sets of visited plans and 
checking them before merging, but it got complex since some 
`ScalarSubqueryReference`s were already converted.
   
   We should probably merge this PR to fix the planning errors while keeping 
most of the optimizations from this rule in the short term. Then we can follow 
up to restore the remaining optimization. I added a TODO for that; with the 
regression test present, it should be safe to proceed from there.
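
   To make the containment problem above concrete, here is a minimal sketch of 
the kind of check involved. It uses a toy plan model, not Spark's `LogicalPlan` 
or the actual `tryMergePlans` code; `Plan`, `Scan`, `Agg`, `containsNested` and 
`safeToMerge` are all hypothetical names made up for illustration.

```scala
// Toy model only: these are NOT Spark's LogicalPlan / ScalarSubquery classes.
sealed trait Plan
case class Scan(table: String) extends Plan
// An aggregate over `child`; `nested` holds subquery plans embedded in its expression.
case class Agg(expr: String, nested: Seq[Plan], child: Plan) extends Plan

object NestedSubqueryCheck {
  // True if `inner` appears strictly below `outer`, either as a child plan
  // or as a subquery nested inside one of its expressions.
  def containsNested(outer: Plan, inner: Plan): Boolean = outer match {
    case Scan(_) => false
    case Agg(_, nested, child) =>
      (nested :+ child).exists(p => p == inner || containsNested(p, inner))
  }

  // Hypothetical guard: only consider merging when neither plan is nested in the other.
  def safeToMerge(a: Plan, b: Plan): Boolean =
    !containsNested(a, b) && !containsNested(b, a)

  def main(args: Array[String]): Unit = {
    // Mirrors the shape of: select sum((select sum(col) from t)) from t
    val inner = Agg("sum(col)", Nil, Scan("t"))
    val outer = Agg("sum(<scalar subquery>)", Seq(inner), Scan("t"))
    val other = Agg("max(col)", Nil, Scan("t"))
    println(safeToMerge(outer, inner))
    println(safeToMerge(inner, other))
  }
}
```

   In this toy model the nested pair is rejected and the two independent 
aggregates are accepted. The real rule is harder because, as noted, some 
subqueries have already been rewritten into references by the time the check 
would run.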



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

