Copilot commented on code in PR #19448:
URL: https://github.com/apache/datafusion/pull/19448#discussion_r2638587412
##########
datafusion/optimizer/src/optimize_projections/required_indices.rs:
##########
@@ -224,3 +242,30 @@ impl RequiredIndices {
         self
     }
 }
+
+fn collect_outer_ref_exprs(
+    indices: &mut Vec<usize>,
+    input_schema: &DFSchemaRef,
+    exprs: &[Expr],
+) {
+    exprs.iter().for_each(|outer_expr| {
+        outer_expr
+            .apply(|expr| {
+                match expr {
+                    Expr::Column(col) | Expr::OuterReferenceColumn(_, col) => {
+                        push_column_index(indices, input_schema, col);
+                    }
+                    _ => {}
+                }
+                Ok(TreeNodeRecursion::Continue)
+            })
+            // traversal above is infallible
+            .expect("outer reference traversal should not fail");
+    });
+}
Review Comment:
`collect_outer_ref_exprs` does not handle subqueries nested inside outer
reference expressions. If an expression in the `outer_ref_columns` array
contains a `ScalarSubquery`, `Exists`, or `InSubquery`, the function does not
recurse into that subquery's `outer_ref_columns`.
The old implementation (the removed `outer_columns` function) handled this
case by calling itself recursively on subquery expressions. The new
implementation should match that behavior by also handling
`Expr::ScalarSubquery`, `Expr::Exists`, and `Expr::InSubquery` in the match
statement, similar to the `add_expr` method above (lines 122-136).
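A rough, self-contained sketch of the recursion being asked for, using simplified stand-in types rather than DataFusion's real ones. The `Expr` and `Subquery` definitions, the `&[&str]` schema, and the `collect_from_expr` helper are all assumptions made for illustration; the actual fix would add the extra match arms inside the existing `apply` closure and use the real `Expr` variants.

```rust
// Hypothetical, simplified stand-ins for DataFusion's `Expr` and `Subquery`.
// The real variants carry more data (e.g. qualified columns, data types).
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    OuterReferenceColumn(String),
    BinaryExpr(Box<Expr>, Box<Expr>),
    ScalarSubquery(Subquery),
    Exists(Subquery),
    InSubquery(Box<Expr>, Subquery),
}

#[derive(Debug, Clone)]
struct Subquery {
    // Expressions that refer to columns of the enclosing (outer) query.
    outer_ref_columns: Vec<Expr>,
}

// Toy version of `push_column_index`: the schema is just a slice of names,
// and columns that are not found in it are silently skipped.
fn push_column_index(indices: &mut Vec<usize>, schema: &[&str], name: &str) {
    if let Some(idx) = schema.iter().position(|field| *field == name) {
        indices.push(idx);
    }
}

fn collect_outer_ref_exprs(indices: &mut Vec<usize>, schema: &[&str], exprs: &[Expr]) {
    for expr in exprs {
        collect_from_expr(indices, schema, expr);
    }
}

// Hypothetical helper: walks one expression and, unlike the version in the
// diff, also descends into the `outer_ref_columns` of nested subqueries.
fn collect_from_expr(indices: &mut Vec<usize>, schema: &[&str], expr: &Expr) {
    match expr {
        Expr::Column(name) | Expr::OuterReferenceColumn(name) => {
            push_column_index(indices, schema, name);
        }
        Expr::BinaryExpr(left, right) => {
            collect_from_expr(indices, schema, left);
            collect_from_expr(indices, schema, right);
        }
        Expr::ScalarSubquery(subquery) | Expr::Exists(subquery) => {
            collect_outer_ref_exprs(indices, schema, &subquery.outer_ref_columns);
        }
        Expr::InSubquery(needle, subquery) => {
            // The left-hand expression of `IN` can itself reference columns.
            collect_from_expr(indices, schema, needle);
            collect_outer_ref_exprs(indices, schema, &subquery.outer_ref_columns);
        }
    }
}
```

In this toy model, an outer reference that only appears inside a nested subquery's `outer_ref_columns` is still collected, which is the behavior the removed `outer_columns` function is described as having.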
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]