HyukjinKwon commented on a change in pull request #34883:
URL: https://github.com/apache/spark/pull/34883#discussion_r769160607



##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
##########
@@ -222,6 +222,34 @@ abstract class Expression extends TreeNode[Expression] {
    */
   def childrenResolved: Boolean = children.forall(_.resolved)
 
+  // Expression canonicalization is done in 2 phases:
+  //   1. Recursively canonicalize each node in the expression tree. This does 
not change the tree
+  //      structure and is more like "node-local" canonicalization.
+  //   2. Find adjacent commutative operators in the expression tree, reorder 
them to get a
+  //      static order and remove cosmetic variations. This may change the 
tree structure
+  //      dramatically and is more like a "global" canonicalization.
+  //
+  // The first phase is done by `preCanonicalized`. It's a `lazy val` which 
recursively calls
+  // `preCanonicalized` on the children. This means that almost every node in 
the expression tree
+  // will instantiate the `preCanonicalized` variable, which is good for 
performance as you can
+  // reuse the canonicalization result of the children when you construct a 
new expression node.
+  //
+  // The second phase is done by `canonicalized`, which simply calls 
`Canonicalize` and is kind of
+  // the actual "user-facing API" of expression canonicalization. Only the 
root node of the
+  // expression tree will instantiate the `canonicalized` variable. This is 
different from
+  // `preCanonicalized`, because `canonicalized` does "global" 
canonicalization and most of the time
+  // you cannot reuse the canonicalization result of the children.
+
+  /**
+   * An internal lazy val to implement expression canonicalization. It should 
only be called in
+   * `canonicalized`, or in subclass's `preCanonicalized` when the subclass 
overrides this lazy val
+   * to provide custom canonicalization logic.
+   */
+  lazy val preCanonicalized: Expression = {

Review comment:
       Shall we explicitly make it `protected` since this isn't supposed to 
call anywhere outside but only within `canonicalized`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to