alamb commented on code in PR #16185:
URL: https://github.com/apache/datafusion/pull/16185#discussion_r2110078689


##########
datafusion/physical-expr/src/equivalence/class.rs:
##########
@@ -422,6 +423,60 @@ impl EquivalenceGroup {
         self.bridge_classes()
     }
 
+    /// Returns a set of all adjacent expression pairs in this equivalence 
group.
+    ///
+    /// For each equivalence class, this function creates pairs of expressions 
that are adjacent
+    /// to each other. For example, if we have an equivalence class [a, b, c], 
it will generate
+    /// pairs (a,b), (a,c), (b,c) and their reverse pairs (b,a), (c,a), (c,b).
+    ///
+    /// This is used by the `intersect` function to find common expression 
pairs between
+    /// two equivalence groups.
+    #[allow(clippy::type_complexity)]
+    fn adjacent_set(&self) -> HashSet<(&Arc<dyn PhysicalExpr>, &Arc<dyn 
PhysicalExpr>)> {
+        let mut set = HashSet::new();

Review Comment:
   this algorithm is `O(N^2)` in the number of items in the equivalence class, 
right? 
   
   We have hit other really long planning times when using such algorithms, for 
example
   - https://github.com/apache/datafusion/issues/13748
   
   Is there some way to make this more efficient? Maybe by creating an 2x 
boolean array to represent the intersections?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to