danny0405 commented on a change in pull request #2016:
URL: https://github.com/apache/calcite/pull/2016#discussion_r439692310



##########
File path: core/src/main/java/org/apache/calcite/plan/Digest.java
##########
@@ -0,0 +1,269 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.plan;
+
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.plan.volcano.RelSubset;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.hint.Hintable;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.util.Pair;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Objects;
+
+/**
+ * A short description of relational expression's type, inputs, and
+ * other properties. The digest uniquely identifies the node; another node
+ * is equivalent if and only if it has the same value.
+ *
+ * <p>Row type is part of the digest for the rare occasion that similar
+ * expressions have different types, e.g. variants of
+ * {@code Project(child=rel#1, a=null)} where a is a null INTEGER or a
+ * null VARCHAR(10). Row type is represented as fieldTypes only, so {@code 
RelNode}
+ * that differ with field names only are treated equal.
+ * For instance, {@code Project(input=rel#1,empid=$0)} and {@code 
Project(input=rel#1,deptno=$0)}
+ * are equal.
+ *
+ * <p>Computed by {@code org.apache.calcite.rel.AbstractRelNode#computeDigest},
+ * assigned by {@link org.apache.calcite.rel.AbstractRelNode#onRegister},
+ * returned by {@link org.apache.calcite.rel.AbstractRelNode#getDigest()}.
+ */
+public class Digest implements Comparable<Digest> {
+
+  //~ Instance fields --------------------------------------------------------
+
+  final int hashCode;
+  final List<Pair<String, Object>> items;
+  private final RelNode rel;
+
+  // Used for debugging, computed lazily.
+  private String digest = null;
+
+  //~ Constructors -----------------------------------------------------------
+
+  /**
+   * Creates a digest with given rel and properties.
+   *
+   * @param rel   The rel
+   * @param items The properties, e.g. the inputs, the type, the traits and so 
on
+   */
+  private Digest(RelNode rel, List<Pair<String, Object>> items) {
+    this.rel = rel;
+    this.items = normalizeContents(items);
+    this.hashCode = computeIdentity(rel, this.items);
+  }
+
+  /**
+   * Creates a digest with given rel, the digest is computed as simple,
+   * see {@link #simpleRelDigest(RelNode)}.
+   */
+  private Digest(RelNode rel) {
+    this(rel, simpleRelDigest(rel));
+  }
+
+  /** Creates a digest with given rel and string format digest. */
+  private Digest(RelNode rel, String digest) {
+    this.rel = rel;
+    this.items = Collections.emptyList();
+    this.digest = digest;
+    this.hashCode = this.digest.hashCode();
+  }
+
+  /** Returns the identity of this digest which is used to speedup hashCode 
and equals. */
+  private static int computeIdentity(RelNode rel, List<Pair<String, Object>> 
contents) {
+    return Objects.hash(collect(rel, contents, false));
+  }
+
+  /**
+   * Collects the items used for {@link #hashCode} and {@link #equals}.
+   *
+   * <p>Generally, the items used for hashCode and equals should be the same. 
The exception
+   * is the row type of the relational expression: the row type is needed 
because during
+   * planning, new equivalent rels may be produced with changed fields 
nullability
+   * (i.e. most of them comes from the rex simplify or constant reduction).
+   * This expects to be rare case, so the hashcode is computed without row type
+   * but when it conflicts, we compare with the row type involved(sans field 
names).
+   *
+   * @param rel      The rel to compute digest
+   * @param contents The rel properties should be considered in digest
+   * @param withType Whether to involve the row type
+   */
+  private static Object[] collect(
+      RelNode rel,
+      List<Pair<String, Object>> contents,
+      boolean withType) {
+    List<Object> hashCodeItems = new ArrayList<>();
+    // The type name.
+    hashCodeItems.add(rel.getRelTypeName());
+    // The traits.
+    hashCodeItems.addAll(rel.getTraitSet());
+    // The hints.
+    if (rel instanceof Hintable) {
+      hashCodeItems.addAll(((Hintable) rel).getHints());
+    }
+    if (withType) {
+      // The row type sans field names.
+      RelDataType relType = rel.getRowType();
+      if (relType.isStruct()) {
+        hashCodeItems.addAll(Pair.right(relType.getFieldList()));
+      } else {
+        // Make a decision here because
+        // some downstream projects have custom rel type which has no explicit 
fields.
+        hashCodeItems.add(relType);
+      }
+    }
+    // The rel node contents(e.g. the inputs or exprs).
+    hashCodeItems.addAll(contents);
+    return hashCodeItems.toArray();
+  }
+
+  /** Normalizes the rel node properties, currently, just to replace the
+   * {@link RelNode} with a simple string format digest. **/
+  private static List<Pair<String, Object>> normalizeContents(
+      List<Pair<String, Object>> items) {
+    List<Pair<String, Object>> normalized = new ArrayList<>();
+    for (Pair<String, Object> item : items) {
+      if (item.right instanceof RelNode) {
+        RelNode input = (RelNode) item.right;
+        normalized.add(Pair.of(item.left, simpleRelDigest(input)));
+      } else {
+        normalized.add(item);
+      }
+    }
+    return normalized;
+  }
+
+  /**
+   * Returns a simple string format digest.
+   *
+   * <p>There are three kinds of nodes we need to handle:
+   *
+   * <ul>
+   * <li>RelSubset: composition of class name, set id and traits;</li>
+   * <li>HepRelVertex: composition of class name and current rel id;</li>
+   * <li>Normal rel: composition of class name and id.</li>
+   * </ul>
+   *
+   * @param rel The rel
+   */
+  private static String simpleRelDigest(RelNode rel) {
+    StringBuilder digest = new StringBuilder(rel.getRelTypeName());
+    digest.append('#');
+    if (rel instanceof RelSubset) {

Review comment:
       For current impl, the digest **is** handled by each `rel` node. Split 
the code everywhere is hard to maintain, let the `Digest` take care how the 
digest should be generated but the invoker is still each node.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to