This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 89666d44a39 [SPARK-41912][SQL] Subquery should not validate CTE
89666d44a39 is described below

commit 89666d44a39c48df841a0102ff6f54eaeb4c6140
Author: Rui Wang <[email protected]>
AuthorDate: Fri Jan 6 11:30:48 2023 +0800

    [SPARK-41912][SQL] Subquery should not validate CTE
    
    ### What changes were proposed in this pull request?
    
    The commit https://github.com/apache/spark/pull/38029 intended to do the right thing: it checks CTEs more aggressively, even when a CTE is not used, which is fine. However, it exposes an existing issue in which a subquery validates its own plan. If the subquery references a CTE that is defined outside of the subquery, that check fails because the CTE cannot be found (e.g. key not found).
    
    The sequence is:
    
    1. The commit checks more, so in the repro example every CTE is now checked (previously only CTEs that were used were checked).
    2. One of the CTEs that is now checked in the example contains a subquery.
    3. The subquery references another CTE that is defined outside of the subquery.
    4. The subquery validates itself and fails because that CTE is not found.
    
    This PR fixes the issue by removing the subquery's self-validation in the CTE case.
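    
    The chain of events above can be modelled in a few lines of Scala. This is a toy sketch, not Catalyst code: `Plan`, `Leaf`, `CteRef`, `WithCte`, `Subquery`, `ToyCheck` and `ToyExample` are hypothetical stand-ins, and it assumes the entry-point check resolves CTE references only against the definitions found in the plan it is handed.
    
    ```scala
    // Toy plan nodes (hypothetical, not Catalyst classes).
    sealed trait Plan
    case class Leaf(name: String) extends Plan
    case class CteRef(id: Long) extends Plan                           // reference to a CTE definition
    case class WithCte(defs: Map[Long, Plan], body: Plan) extends Plan // WITH <defs> <body>
    case class Subquery(inner: Plan) extends Plan                      // plan of a subquery expression

    object ToyCheck {
      private def children(p: Plan): Seq[Plan] = p match {
        case WithCte(defs, body) => defs.values.toSeq :+ body
        case Subquery(inner)     => Seq(inner)
        case _                   => Nil
      }

      // Ids of all CTE definitions reachable from p, including inside subqueries.
      private def collectDefs(p: Plan): Set[Long] = {
        val own = p match {
          case WithCte(defs, _) => defs.keySet
          case _                => Set.empty[Long]
        }
        own ++ children(p).flatMap(collectDefs)
      }

      // Ids of all CTE references reachable from p, including inside subqueries.
      private def collectRefs(p: Plan): Set[Long] = {
        val own = p match {
          case CteRef(id) => Set(id)
          case _          => Set.empty[Long]
        }
        own ++ children(p).flatMap(collectRefs)
      }

      // Entry point: every CTE reference must resolve within the plan given here.
      def check(plan: Plan, selfValidate: Boolean): Unit = {
        val defs = collectDefs(plan)
        collectRefs(plan).foreach { id =>
          if (!defs.contains(id)) throw new NoSuchElementException(s"key not found: $id")
        }
        check0(plan, selfValidate)
      }

      // Node-level pass. With selfValidate = true a subquery restarts from the
      // entry point and loses sight of CTEs defined outside of it.
      def check0(plan: Plan, selfValidate: Boolean): Unit = plan match {
        case Subquery(inner) =>
          if (selfValidate) check(inner, selfValidate) else check0(inner, selfValidate)
        case other =>
          children(other).foreach(check0(_, selfValidate))
      }
    }

    object ToyExample {
      // cte1 is a leaf; cte2 contains a subquery that reads cte1; the body reads cte1.
      val plan: Plan = WithCte(
        Map(1L -> Leaf("SELECT 1 col1"), 2L -> Subquery(CteRef(1L))),
        CteRef(1L))
    }
    ```
    
    In this model, `ToyCheck.check(ToyExample.plan, selfValidate = true)` throws when the subquery inside the second CTE is validated on its own, while `selfValidate = false` lets the same plan through because the reference was already resolved from the top-level plan.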
    
    ### Why are the changes needed?
    
    This fixes a regression where
    ```
        val df = sql("""
                       |    WITH
                       |    cte1 as (SELECT 1 col1),
                       |    cte2 as (SELECT (SELECT MAX(col1) FROM cte1))
                       |    SELECT * FROM cte1
                       |""".stripMargin
        )
        checkAnswer(df, Row(1) :: Nil)
    ```
    
    can no longer pass the analyzer.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No
    
    ### How was this patch tested?
    
    Unit test (added to SubquerySuite).
    
    Closes #39414 from amaliujia/fix_subquery_validate.
    
    Authored-by: Rui Wang <[email protected]>
    Signed-off-by: Wenchen Fan <[email protected]>
---
 .../apache/spark/sql/catalyst/analysis/CheckAnalysis.scala    |  2 +-
 .../src/test/scala/org/apache/spark/sql/SubquerySuite.scala   | 11 +++++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
index 8309186d566..4dc0bf98a54 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
@@ -923,7 +923,7 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
     }
 
     // Validate the subquery plan.
-    checkAnalysis(expr.plan)
+    checkAnalysis0(expr.plan)
 
     // Check if there is outer attribute that cannot be found from the plan.
     checkOuterReference(plan, expr)
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala
index 3d4a629f7a9..86a0c4d1799 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala
@@ -1019,6 +1019,17 @@ class SubquerySuite extends QueryTest
     }
   }
 
+  test("SPARK-41912: Subquery does not validate CTE") {
+    val df = sql("""
+                   |    WITH
+                   |    cte1 as (SELECT 1 col1),
+                   |    cte2 as (SELECT (SELECT MAX(col1) FROM cte1))
+                   |    SELECT * FROM cte1
+                   |""".stripMargin
+    )
+    checkAnswer(df, Row(1) :: Nil)
+  }
+
   test("SPARK-21835: Join in correlated subquery should be duplicateResolved: case 1") {
     withTable("t1") {
       withTempPath { path =>

