alamb commented on code in PR #5421:
URL: https://github.com/apache/arrow-datafusion/pull/5421#discussion_r1120754049


##########
datafusion/common/src/dfschema.rs:
##########
@@ -496,6 +497,15 @@ impl From<DFSchema> for SchemaRef {
     }
 }
 
+// Hashing refers to a subset of fields considered in PartialEq.
+#[allow(clippy::derive_hash_xor_eq)]
+impl Hash for DFSchema {
+    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
+        self.fields.hash(state);
+        self.metadata.len().hash(state); // HashMap is not hashable

Review Comment:
   I agree it is ok to just use the metadata's length to hash as it satisfies 
the EQ constraint
   
   https://doc.rust-lang.org/std/hash/trait.Hash.html#hash-and-eq



##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -1515,8 +1515,30 @@ pub struct TableScan {
     pub fetch: Option<usize>,
 }
 
+impl PartialEq for TableScan {
+    fn eq(&self, other: &Self) -> bool {
+        self.table_name == other.table_name
+            && self.projection == other.projection
+            && self.projected_schema == other.projected_schema
+            && self.filters == other.filters
+            && self.fetch == other.fetch
+    }
+}
+
+impl Eq for TableScan {}
+
+impl Hash for TableScan {
+    fn hash<H: Hasher>(&self, state: &mut H) {
+        self.table_name.hash(state);
+        self.projection.hash(state);

Review Comment:
   Hash is ok that it doesn't also include `source` I think



##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -1685,8 +1725,33 @@ pub struct Extension {
     pub node: Arc<dyn UserDefinedLogicalNode>,
 }
 
+struct ExtensionExplainDisplay<'a> {
+    extension: &'a Extension,
+}
+
+impl Display for ExtensionExplainDisplay<'_> {
+    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
+        self.extension.node.fmt_for_explain(f)
+    }
+}
+
+impl PartialEq for Extension {
+    fn eq(&self, other: &Self) -> bool {
+        format!("{}", ExtensionExplainDisplay { extension: self })
+            == format!("{}", ExtensionExplainDisplay { extension: other })
+    }
+}
+
+impl Eq for Extension {}
+
+impl Hash for Extension {

Review Comment:
   I am not sure about using the textual output for equality comparison as 
someone who has implemented Extension may have a constant output, for example. 
I think a safer (though backwards incompatible change) would be to make 
`UserDefinedLogicalNode` also be `Hash` and `PartialEq`
   
   Like
   
   ```rust
   trait UserDefinedLogicalNode: Hash + PartialEq {
   ...
   }
   ```



##########
datafusion/optimizer/src/optimizer.rs:
##########
@@ -326,13 +327,12 @@ impl Optimizer {
             // TODO this is an expensive way to see if the optimizer did 
anything and
             // it would be better to change the OptimizerRule trait to return 
an Option
             // instead
-            let new_plan_str = format!("{}", new_plan.display_indent());
-            if plan_str == new_plan_str {
+            if old_plan.as_ref() == &new_plan {

Review Comment:
   👍  nice
   
   



##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -1515,8 +1515,30 @@ pub struct TableScan {
     pub fetch: Option<usize>,
 }
 
+impl PartialEq for TableScan {
+    fn eq(&self, other: &Self) -> bool {
+        self.table_name == other.table_name

Review Comment:
   I think this also needs to check that the `source` is equal as well -- I 
think we can do so via 
https://doc.rust-lang.org/std/sync/struct.Arc.html#method.ptr_eq



##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -1846,7 +1911,7 @@ impl Join {
 }
 
 /// Subquery
-#[derive(Clone)]
+#[derive(Clone, PartialEq, Eq, Hash)]

Review Comment:
   🎉 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to