krinart opened a new pull request, #20658:
URL: https://github.com/apache/datafusion/pull/20658
## Which issue does this PR close?
Restore the `already_projected()` guard in the Sort case of
`select_to_sql_recursively`. Without it, queries with ORDER BY expressions
generate invalid SQL when the Sort node follows an outer Projection.
## Problem ##
Two regressions in the DuckDB federation unparser after upgrading to
DataFusion 52:
1. clickbench q25: Queries like `SELECT "SearchPhrase" ... ORDER BY
to_timestamp("EventTime") LIMIT 10` produce ORDER BY outside the subquery,
referencing table names ("hits") that are out of scope.
2. tpcds q36: ORDER BY with `GROUPING()` expressions loses the
`derived_sort` subquery alias, placing sort references outside their valid
scope.
## Root Cause ##
DF52's optimizer merges `Limit` into `Sort` as a fetch parameter, changing
the plan shape from `Projection → Limit → Sort → ...` to `Projection →
Sort(fetch) → ...`. The Sort case previously had a guard that wrapped the sort
into a `derived_sort` subquery when `already_projected()` was true. This guard
was removed in DF52, causing ORDER BY and LIMIT to be placed on the outer query
where inner table references are invalid.
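
The plan-shape change can be sketched with a toy model. This is only an illustration: the `Plan` enum and `merge_limit_into_sort` below are hypothetical names, not DataFusion's `LogicalPlan` or its optimizer rules, and the rewrite is assumed to behave as described above.

```rust
// Toy plan nodes for illustration only; NOT DataFusion's `LogicalPlan`.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(&'static str),
    Sort { fetch: Option<usize>, input: Box<Plan> },
    Limit { fetch: usize, input: Box<Plan> },
    Projection(Box<Plan>),
}

// Hypothetical rewrite: fold a Limit that sits directly above a Sort
// into that Sort's `fetch`, so the Limit node disappears from the plan.
fn merge_limit_into_sort(plan: Plan) -> Plan {
    match plan {
        Plan::Limit { fetch, input } => match merge_limit_into_sort(*input) {
            // Keep the smaller bound if the Sort already carried a fetch.
            Plan::Sort { fetch: inner, input } => Plan::Sort {
                fetch: Some(inner.map_or(fetch, |f| f.min(fetch))),
                input,
            },
            other => Plan::Limit { fetch, input: Box::new(other) },
        },
        Plan::Sort { fetch, input } => Plan::Sort {
            fetch,
            input: Box::new(merge_limit_into_sort(*input)),
        },
        Plan::Projection(input) => Plan::Projection(Box::new(merge_limit_into_sort(*input))),
        scan => scan,
    }
}
```

Applied to `Projection → Limit(10) → Sort → Scan`, this toy rewrite yields `Projection → Sort(fetch = 10) → Scan`, which is the shape the Sort arm of the unparser now sees directly under a Projection.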
## Fix ##
Re-add the guard at the top of the Sort match arm in
`select_to_sql_recursively`:
```rust
if select.already_projected() {
    return self.derive_with_dialect_alias("derived_sort", plan, relation, false, vec![]);
}
```
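
To see where the clauses land with and without the guard, here is a minimal, self-contained sketch. Everything in it (`Node`, `Select`, `unparse`, `render`) is a hypothetical stand-in and not the unparser's real types or API. It models only the placement of ORDER BY and LIMIT; it does not reproduce the out-of-scope table references from the real regressions.

```rust
// Toy plan nodes for illustration only; NOT DataFusion's `LogicalPlan`.
enum Node {
    Scan(&'static str),
    Sort { expr: &'static str, fetch: Option<usize>, input: Box<Node> },
    Projection { cols: &'static str, input: Box<Node> },
}

// Toy stand-in for the unparser's SELECT builder state.
#[derive(Default)]
struct Select {
    cols: Option<String>,
    from: String,
    order_by: Option<String>,
    limit: Option<usize>,
}

impl Select {
    // True once a Projection has already filled in the select list.
    fn already_projected(&self) -> bool {
        self.cols.is_some()
    }

    fn to_sql(&self) -> String {
        let cols = self.cols.as_deref().unwrap_or("*");
        let mut sql = format!("SELECT {} FROM {}", cols, self.from);
        if let Some(o) = &self.order_by {
            sql.push_str(&format!(" ORDER BY {o}"));
        }
        if let Some(n) = self.limit {
            sql.push_str(&format!(" LIMIT {n}"));
        }
        sql
    }
}

// Walk the plan top-down, filling in `select`. `guard` toggles the check.
fn unparse(node: &Node, select: &mut Select, guard: bool) {
    match node {
        Node::Projection { cols, input } => {
            select.cols = Some(cols.to_string());
            unparse(input, select, guard);
        }
        Node::Sort { expr, fetch, input } => {
            if guard && select.already_projected() {
                // Render the whole Sort subtree as a derived table so that
                // ORDER BY and LIMIT stay inside the subquery.
                let mut inner = Select::default();
                unparse(node, &mut inner, guard);
                select.from = format!("({}) AS derived_sort", inner.to_sql());
                return;
            }
            select.order_by = Some(expr.to_string());
            select.limit = *fetch;
            unparse(input, select, guard);
        }
        Node::Scan(table) => select.from = table.to_string(),
    }
}

// Convenience wrapper: unparse a plan into a SQL string.
fn render(plan: &Node, guard: bool) -> String {
    let mut select = Select::default();
    unparse(plan, &mut select, guard);
    select.to_sql()
}
```

For a `Projection → Sort(fetch) → Scan` plan, `render(&plan, false)` attaches ORDER BY and LIMIT to the outer SELECT, while `render(&plan, true)` keeps them inside a `derived_sort` subquery in the FROM clause.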