crepererum commented on code in PR #5419:
URL: https://github.com/apache/arrow-datafusion/pull/5419#discussion_r1121397443
##########
datafusion/physical-expr/src/utils.rs:
##########
@@ -235,6 +239,80 @@ pub fn ordering_satisfy_concrete<F: FnOnce() -> EquivalenceProperties>(
}
}
+/// Extract referenced [`Column`]s within a [`PhysicalExpr`].
+///
+/// This works recursively.
+pub fn get_phys_expr_columns(pred: &Arc<dyn PhysicalExpr>) -> HashSet<Column> {
+ let mut rewriter = ColumnCollector::default();
Review Comment:
@ozankabak the issue w/ `Vec` is that you get `O(n^2)` complexity in the
number of used columns. In InfluxDB IOx we sometimes have schemas w/ over 200
columns, and I'm somewhat worried that such a simple oversight quickly turns
into a performance bug.
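
A minimal sketch of the concern, assuming the `Vec` variant would deduplicate via a linear `contains` check. The `Column` struct and the `collect_vec` / `collect_set` helpers are hypothetical stand-ins for illustration, not the DataFusion types or the code in this PR:

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a column reference (name plus index).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Column {
    name: String,
    index: usize,
}

/// Deduplicate into a `Vec`: every `contains` call scans the vector,
/// so n distinct columns cost O(n^2) comparisons overall.
fn collect_vec(refs: &[Column]) -> Vec<Column> {
    let mut out = Vec::new();
    for c in refs {
        if !out.contains(c) {
            out.push(c.clone());
        }
    }
    out
}

/// Deduplicate into a `HashSet`: each insert is O(1) on average,
/// so the whole pass is O(n) on average.
fn collect_set(refs: &[Column]) -> HashSet<Column> {
    refs.iter().cloned().collect()
}
```

Both helpers return the same set of columns; only the cost of the membership check differs, which is what starts to matter once a schema has a few hundred columns.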