stuhood commented on code in PR #22590:
URL: https://github.com/apache/datafusion/pull/22590#discussion_r3320078713


##########
datafusion/physical-expr/src/partitioning.rs:
##########
@@ -279,6 +279,54 @@ impl RangePartitioning {
         self.split_points.len() + 1
     }
 
+    /// Returns true when `self` and `other` describe the same range partition
+    /// map.
+    ///
+    /// Single-partition range partitionings are always compatible. Otherwise,
+    /// the two partitionings must have identical split points and equivalent
+    /// ordering expressions with the same sort options.
+    pub fn compatible_with(
+        &self,
+        other: &Self,
+        eq_properties: &EquivalenceProperties,
+    ) -> bool {
+        if self.partition_count() == 1 && other.partition_count() == 1 {
+            return true;
+        }
+
+        if self.split_points != other.split_points

Review Comment:
   `impl PartialEq for Vec` checks the length before comparing the elements, so:
   
   ```suggestion
           if self.split_points != other.split_points
   ```



##########
datafusion/physical-expr/src/partitioning.rs:
##########
@@ -411,6 +459,37 @@ fn compare_scalar_values_for_sort(
     }
 }
 
+fn equivalent_exprs(

Review Comment:
   I'm surprised that this only exists here... it seems like something which 
would exist elsewhere in the codebase?
   
   If this is a common pattern, it might be possible to create a derive macro 
for effectively: `PartialEqWithEquivalenceProperties`... then `compatible_with` 
could defer to a derived implementation.



##########
datafusion/physical-expr/src/partitioning.rs:
##########
@@ -442,6 +521,42 @@ impl Partitioning {
         }
     }
 
+    /// Returns true when `self` and `other` describe compatible partition 
maps.
+    ///
+    /// Compatible partition maps can be used for partition-local behavior: if
+    /// this returns true, partition `i` from both partitionings can be treated
+    /// as covering the same partition domain. This is stricter than
+    /// [`Self::satisfaction`], which only answers whether this partitioning 
can
+    /// satisfy a required distribution.

Review Comment:
   `satisfaction` definitely seems related: it feels like `allow_subset=false` 
would be roughly equivalent to this method?
   
   I don't fully understand the motivation/history behind the distinction 
between `Distribution` and `Partitioning`, but it looks like it will result in 
a bit of redundancy.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to