alamb commented on code in PR #17442:
URL: https://github.com/apache/datafusion/pull/17442#discussion_r2325732445


##########
datafusion/physical-expr-common/src/sort_expr.rs:
##########
@@ -414,27 +427,28 @@ impl LexOrdering {
         self.exprs.truncate(len);
         true
     }
+}
 
-    /// Constructs a new `LexOrdering` from the given sort requirements w/o
-    /// enforcing non-degeneracy. This function is used internally and is not
-    /// meant (or safe) for external use.
-    fn construct(exprs: impl IntoIterator<Item = PhysicalSortExpr>) -> (bool, 
Self) {
-        let mut set = HashSet::new();
-        let exprs = exprs
-            .into_iter()
-            .filter_map(|s| set.insert(Arc::clone(&s.expr)).then_some(s))
-            .collect();
-        (!set.is_empty(), Self { exprs, set })
+impl PartialEq for LexOrdering {
+    fn eq(&self, other: &Self) -> bool {
+        let Self {
+            exprs,
+            set: _, // derived from `exprs`

Review Comment:
   👍 



##########
datafusion/physical-expr-common/src/sort_expr.rs:
##########
@@ -367,8 +367,21 @@ impl LexOrdering {
     /// Creates a new [`LexOrdering`] from the given vector of sort 
expressions.
     /// If the vector is empty, returns `None`.
     pub fn new(exprs: impl IntoIterator<Item = PhysicalSortExpr>) -> 
Option<Self> {
-        let (non_empty, ordering) = Self::construct(exprs);
-        non_empty.then_some(ordering)
+        let exprs = exprs.into_iter();

Review Comment:
   I am not sure this is better / faster than what was previously used to 
comute exprs and populate the hashset:
   
   ```rust
       let exprs = exprs
               .into_iter()
               .filter_map(|s| set.insert(Arc::clone(&s.expr)).then_some(s))
               .collect();
   ```
   
   Specifically, I think this will now potentially allocate more memory than 
required (it has space for all items, but we may filter out some)
   
   I personally suggest avoiding the Vec::with_capacity call unless we are sure 
it is better
   



##########
datafusion/physical-expr-common/src/sort_expr.rs:
##########
@@ -414,27 +427,28 @@ impl LexOrdering {
         self.exprs.truncate(len);
         true
     }
+}
 
-    /// Constructs a new `LexOrdering` from the given sort requirements w/o
-    /// enforcing non-degeneracy. This function is used internally and is not
-    /// meant (or safe) for external use.
-    fn construct(exprs: impl IntoIterator<Item = PhysicalSortExpr>) -> (bool, 
Self) {
-        let mut set = HashSet::new();
-        let exprs = exprs
-            .into_iter()
-            .filter_map(|s| set.insert(Arc::clone(&s.expr)).then_some(s))
-            .collect();
-        (!set.is_empty(), Self { exprs, set })
+impl PartialEq for LexOrdering {
+    fn eq(&self, other: &Self) -> bool {
+        let Self {
+            exprs,
+            set: _, // derived from `exprs`
+        } = self;
+        // PartialEq must be consistent with PartialOrd
+        exprs == &other.exprs
     }
 }
-
+impl Eq for LexOrdering {}
 impl PartialOrd for LexOrdering {
     /// There is a partial ordering among `LexOrdering` objects. For example, 
the
     /// ordering `[a ASC]` is coarser (less) than ordering `[a ASC, b ASC]`.
     /// If two orderings do not share a prefix, they are incomparable.
     fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
-        self.iter()
-            .zip(other.iter())
+        // PartialEq must be consistent with PartialOrd

Review Comment:
   I think this results in the same speed as the previous code, as `iter()` 
returns `exprs.iter()`
   
   
https://github.com/apache/datafusion/blob/cc43f5c107bf10daa88be2c58af3042f8aef9108/datafusion/physical-expr-common/src/sort_expr.rs#L494-L493



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to