(spark) branch master updated: [SPARK-45892][SQL] Refactor optimizer plan validation to decouple `validateSchemaOutput` and `validateExprIdUniqueness`

wenchen Mon, 13 Nov 2023 13:29:59 -0800

This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 7cea52c96f5b [SPARK-45892][SQL] Refactor optimizer plan validation to 
decouple `validateSchemaOutput` and `validateExprIdUniqueness`
7cea52c96f5b is described below

commit 7cea52c96f5be1bc565a033bfd77370ab5527a35
Author: Xi Liang <xi.li...@databricks.com>
AuthorDate: Tue Nov 14 05:28:07 2023 +0800

    [SPARK-45892][SQL] Refactor optimizer plan validation to decouple 
`validateSchemaOutput` and `validateExprIdUniqueness`
    
    ### What changes were proposed in this pull request?
    
    Currently, the expressionIDUniquenessValidation is [closely coupled with 
outputSchemaValidation](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala#L403C7-L411C8).
    
    This PR refactors the code to improve readability and maintainability.
    
    ### Why are the changes needed?
    
    Improve code readability and maintainability.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    Existing tests.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No
    
    Closes #43761 from xil-db/SPARK-45892-validation-refactor.
    
    Authored-by: Xi Liang <xi.li...@databricks.com>
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
---
 .../sql/catalyst/plans/logical/LogicalPlan.scala    | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
index ae3029b279da..cce385e8d9d1 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
@@ -381,6 +381,15 @@ object LogicalPlanIntegrity {
     }.flatten
   }
 
+  def validateSchemaOutput(previousPlan: LogicalPlan, currentPlan: 
LogicalPlan): Option[String] = {
+    if (!DataTypeUtils.equalsIgnoreNullability(previousPlan.schema, 
currentPlan.schema)) {
+      Some(s"The plan output schema has changed from 
${previousPlan.schema.sql} to " +
+        currentPlan.schema.sql + s". The previous plan: 
${previousPlan.treeString}\nThe new " +
+        "plan:\n" + currentPlan.treeString)
+    } else {
+      None
+    }
+  }
 
   /**
    * Validate the structural integrity of an optimized plan.
@@ -400,17 +409,11 @@ object LogicalPlanIntegrity {
     } else if 
(currentPlan.exists(PlanHelper.specialExpressionsInUnsupportedOperator(_).nonEmpty))
 {
       Some("Special expressions are placed in the wrong plan: " + 
currentPlan.treeString)
     } else {
-      LogicalPlanIntegrity.validateExprIdUniqueness(currentPlan).orElse {
-        if (!DataTypeUtils.equalsIgnoreNullability(previousPlan.schema, 
currentPlan.schema)) {
-          Some(s"The plan output schema has changed from 
${previousPlan.schema.sql} to " +
-            currentPlan.schema.sql + s". The previous plan: 
${previousPlan.treeString}\nThe new " +
-            "plan:\n" + currentPlan.treeString)
-        } else {
-          None
-        }
-      }
+      None
     }
     validation = validation
+      .orElse(LogicalPlanIntegrity.validateExprIdUniqueness(currentPlan))
+      .orElse(LogicalPlanIntegrity.validateSchemaOutput(previousPlan, 
currentPlan))
       .orElse(LogicalPlanIntegrity.validateNoDanglingReferences(currentPlan))
       .orElse(LogicalPlanIntegrity.validateGroupByTypes(currentPlan))
       .orElse(LogicalPlanIntegrity.validateAggregateExpressions(currentPlan))


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

(spark) branch master updated: [SPARK-45892][SQL] Refactor optimizer plan validation to decouple `validateSchemaOutput` and `validateExprIdUniqueness`

Reply via email to