adriangb opened a new pull request, #20362:
URL: https://github.com/apache/datafusion/pull/20362

   ## Summary
   
   - Replace `ordered_column_indices_from_projection` with `resolve_sort_column_projection`, which only requires the sort-column positions to resolve to simple `Column` expressions, rather than failing the entire projection if any expression is complex
   - Evaluate each ordering independently in `get_projected_output_ordering`: orderings on simple column refs are validated with min/max statistics even when other projection expressions are complex (e.g. `a + 1`)
   - For orderings where a sort column is itself a complex expression, fall back to the single-file-group check
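
The per-ordering decision described in the bullets above can be sketched roughly as follows. This is a simplified model, not the code in this PR: `MinMax`, `files_non_overlapping`, and `keep_ordering` are hypothetical names, the statistics are reduced to one `i64` range per file, and the non-overlap rule is an assumption about what "validated with min/max statistics" means for an ascending, single-column ordering.

```rust
/// Hypothetical per-file (min, max) statistics for a single sort column.
type MinMax = (i64, i64);

/// Assumed validation rule: within a file group, each file's max must not
/// exceed the next file's min, so reading the files in sequence preserves
/// an ascending ordering.
fn files_non_overlapping(files: &[MinMax]) -> bool {
    files.windows(2).all(|w| w[0].1 <= w[1].0)
}

/// Decide whether one ordering can be kept.
/// - If the sort column resolves to a simple column, validate every file
///   group against the min/max statistics.
/// - Otherwise fall back to the single-file-group check: the ordering is
///   kept only if no group contains more than one file.
fn keep_ordering(sort_col_is_simple: bool, file_groups: &[Vec<MinMax>]) -> bool {
    if sort_col_is_simple {
        file_groups.iter().all(|g| files_non_overlapping(g))
    } else {
        file_groups.iter().all(|g| g.len() <= 1)
    }
}
```

Under these assumptions, a group holding two files with ranges `(0, 10)` and `(10, 20)` keeps an ordering on a simple column, while the same group drops an ordering whose sort column is a complex expression.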
   
   **Problem:** After projection pushdown, complex expressions in `ProjectionExprs` are common (e.g. `SELECT a + 1 AS x, b, c FROM t ORDER BY b`). The old `ordered_column_indices_from_projection` was all-or-nothing: it failed on `BinaryExpr(a+1)` at index 0 and returned `None` for the entire projection, even though the ordering on `b` (index 1) maps cleanly to a simple `Column`. With multi-file groups, this caused valid orderings to be unnecessarily dropped.
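
To illustrate the difference, here is a minimal sketch of the two resolution strategies. It does not use DataFusion's real types or signatures: `ProjExpr`, `all_or_nothing`, and `resolve_sort_columns` are hypothetical stand-ins, and the projection is modelled as a plain slice in which each entry is either a simple column reference or an opaque complex expression.

```rust
/// Hypothetical, simplified stand-in for a projection expression.
#[derive(Debug, Clone, PartialEq)]
enum ProjExpr {
    /// Plain reference to the source column at this index.
    Column(usize),
    /// Any other expression (e.g. `a + 1`); the payload is only a label.
    Complex(&'static str),
}

/// Old behaviour as described above: every projection expression must be a
/// simple column, otherwise the whole projection resolves to `None`.
fn all_or_nothing(projection: &[ProjExpr]) -> Option<Vec<usize>> {
    projection
        .iter()
        .map(|e| match e {
            ProjExpr::Column(i) => Some(*i),
            ProjExpr::Complex(_) => None,
        })
        .collect()
}

/// New behaviour as described above: only the output positions used by the
/// ordering must resolve to simple columns; other positions are ignored.
fn resolve_sort_columns(
    projection: &[ProjExpr],
    sort_positions: &[usize],
) -> Option<Vec<usize>> {
    sort_positions
        .iter()
        .map(|&pos| match projection.get(pos)? {
            ProjExpr::Column(i) => Some(*i),
            ProjExpr::Complex(_) => None,
        })
        .collect()
}
```

For the query in the example, the projection would be modelled as `[Complex("a + 1"), Column(1), Column(2)]` with the ordering on output position 1. In this sketch `all_or_nothing` returns `None`, while `resolve_sort_columns` with `&[1]` returns `Some(vec![1])`; an ordering on position 0 still returns `None`, which corresponds to the fallback case.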
   
   ## Test plan
   
   - [x] `cargo test -p datafusion-datasource` (97 tests pass)
   - [x] `cargo test -p datafusion-sqllogictest --test sqllogictests -- parquet_sorted_statistics` (passes)
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

