cloud-fan commented on code in PR #34929:
URL: https://github.com/apache/spark/pull/34929#discussion_r852309186


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CTESubstitution.scala:
##########
@@ -211,8 +212,33 @@ object CTESubstitution extends Rule[LogicalPlan] {
       } else {
         // A CTE definition might contain an inner CTE that has a higher 
priority, so traverse and
         // substitute CTE defined in `relation` first.
+        // NOTE: we must call `traverseAndSubstituteCTE` before 
`substituteCTE`, as the relations
+        // in the inner CTE have higher priority over the relations in the 
outer CTE when resolving
+        // inner CTE relations. For example:
+        // WITH t1 AS (SELECT 1)
+        // t2 AS (
+        //   WITH t1 AS (SELECT 2)
+        //   WITH t3 AS (SELECT * FROM t1)
+        // )
+        // t3 should resolve the t1 to `SELECT 2` instead of `SELECT 1`.
         traverseAndSubstituteCTE(relation, isCommand, cteDefs)._1
       }
+
+      if (cteDefs.length > lastCTEDefCount) {

Review Comment:
   This PR will be merged to 3.2 because of the fixes for the correctness bugs 
and performance regressions. It doesn't really improve the performance compared 
to Spark 3.1.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to