adriangb opened a new pull request, #20362: URL: https://github.com/apache/datafusion/pull/20362
## Summary

- Replace `ordered_column_indices_from_projection` with `resolve_sort_column_projection`, which only requires sort-column positions to resolve to simple `Column` expressions, rather than failing the entire projection if any expression is complex
- Evaluate each ordering independently in `get_projected_output_ordering`: orderings on simple column refs get validated with min/max statistics even when other projection expressions are complex (e.g. `a + 1`)
- For orderings where a sort column is itself a complex expression, fall back to the single-file-group check

**Problem:** After projection pushdown, complex expressions in `ProjectionExprs` are common (e.g. `SELECT a + 1 AS x, b, c FROM t ORDER BY b`). The old `ordered_column_indices_from_projection` was all-or-nothing: it failed on `BinaryExpr(a+1)` at index 0 and returned `None` for the entire projection, even though the ordering on `b` (index 1) maps cleanly to a simple `Column`. With multi-file groups, this caused valid orderings to be unnecessarily dropped.

## Test plan

- [x] `cargo test -p datafusion-datasource` (97 tests pass)
- [x] `cargo test -p datafusion-sqllogictest --test sqllogictests -- parquet_sorted_statistics` (passes)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
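For readers of this notification, the all-or-nothing versus per-sort-column behaviour described in the Summary and Problem sections can be illustrated with a small standalone model. This is only a sketch, not the PR's code: `ProjExpr`, `all_or_nothing`, and `resolve_sort_columns` are hypothetical names invented here, and the real functions operate on DataFusion physical expressions with signatures that are not shown in this message.

```rust
/// Hypothetical stand-in for a projection expression; the actual PR works
/// with DataFusion physical expressions, which are not modelled here.
#[derive(Debug, Clone, PartialEq)]
enum ProjExpr {
    /// Plain reference to a column of the underlying file schema, by index.
    Column(usize),
    /// Any other expression, e.g. `a + 1`.
    Complex(String),
}

/// Models the old behaviour: a single complex expression anywhere in the
/// projection makes the whole resolution fail with `None`.
fn all_or_nothing(projection: &[ProjExpr]) -> Option<Vec<usize>> {
    projection
        .iter()
        .map(|expr| match expr {
            ProjExpr::Column(idx) => Some(*idx),
            ProjExpr::Complex(_) => None,
        })
        .collect()
}

/// Models the new behaviour: only the output positions used by the ordering
/// must be simple columns. Returns the file-schema indices of those columns,
/// or `None` if a sort position is out of range or is a complex expression.
fn resolve_sort_columns(
    projection: &[ProjExpr],
    sort_positions: &[usize],
) -> Option<Vec<usize>> {
    sort_positions
        .iter()
        .map(|&pos| match projection.get(pos)? {
            ProjExpr::Column(idx) => Some(*idx),
            ProjExpr::Complex(_) => None,
        })
        .collect()
}
```

Under this model, the projection for `SELECT a + 1 AS x, b, c FROM t ORDER BY b` is `[Complex("a + 1"), Column(1), Column(2)]` with the sort column at output position 1. `all_or_nothing` rejects it outright, while `resolve_sort_columns` resolves position 1 to file column 1, so the ordering could still be checked against min/max statistics. An ordering on `x` (position 0) yields `None`, which corresponds to the fallback to the single-file-group check.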
