YannisSismanis commented on code in PR #41763:
URL: https://github.com/apache/spark/pull/41763#discussion_r1294991949


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala:
##########
@@ -324,6 +325,124 @@ object LogicalPlanIntegrity {
       LogicalPlanIntegrity.hasUniqueExprIdsForOutput(plan))
   }
 
+  /**
+   * This method validates there are no dangling attribute references.
+   * Returns an error message if the check does not pass, or None if it passes.
+   */
+  def validateNoDanglingReferences(plan: LogicalPlan): Option[String] = {

Review Comment:
   just using missingInput fails one of the tests in this PR:
   "Optimizer per rule validation catches dangling references *** FAILED *** (497 milliseconds)
   [info]   "[PLAN_VALIDATION_FAILED_RULE_IN_BATCH] Rule org.apache.spark.sql.catalyst.optimizer.OptimizerSuite$DanglingReference$1 in batch test generated an invalid plan: Aliases debug1#2, debug2#3 are dangling in the references for plan:
   [info]    !Project [(debug1#2 + debug2#3) AS attr#4]
   [info]   +- Project [10 AS attr#1]
   [info]      +- OneRowRelation
   [info]
   [info]   Previous schema: attr#1
   [info]   Previous plan: Project [10 AS attr#1]
   [info]   +- OneRowRelation
   [info]   " did not contain "debug1, debug2 are dangling" (OptimizerSuite.scala:103)"
   I guess one could make it work with missingInput by checking both ways (i.e. all the dependencies are satisfied and all the provided columns are used), but it would be more verbose.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

