alamb commented on code in PR #16185: URL: https://github.com/apache/datafusion/pull/16185#discussion_r2110078689
########## datafusion/physical-expr/src/equivalence/class.rs: ########## @@ -422,6 +423,60 @@ impl EquivalenceGroup { self.bridge_classes() } + /// Returns a set of all adjacent expression pairs in this equivalence group. + /// + /// For each equivalence class, this function creates pairs of expressions that are adjacent + /// to each other. For example, if we have an equivalence class [a, b, c], it will generate + /// pairs (a,b), (a,c), (b,c) and their reverse pairs (b,a), (c,a), (c,b). + /// + /// This is used by the `intersect` function to find common expression pairs between + /// two equivalence groups. + #[allow(clippy::type_complexity)] + fn adjacent_set(&self) -> HashSet<(&Arc<dyn PhysicalExpr>, &Arc<dyn PhysicalExpr>)> { + let mut set = HashSet::new(); Review Comment: this algorithm is `O(N^2)` in the number of items in the equivalence class, right? We have hit other really long planning times when using such algorithms, for example - https://github.com/apache/datafusion/issues/13748 Is there some way to make this more efficient? Maybe by creating an 2x boolean array to represent the intersections? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org