krinart opened a new pull request, #20658: URL: https://github.com/apache/datafusion/pull/20658
## Which issue does this PR close?

- closes https://github.com/apache/datafusion/issues/20905

Restore the `already_projected()` guard in the Sort case of `select_to_sql_recursively`. Without it, queries with ORDER BY expressions generate invalid SQL when the Sort node follows an outer Projection.

## Problem

Two regressions in the DuckDB federation unparser after upgrading to DataFusion 52:

1. clickbench q25: Queries like `SELECT "SearchPhrase" ... ORDER BY to_timestamp("EventTime") LIMIT 10` produce ORDER BY outside the subquery, referencing table names ("hits") that are out of scope.
2. tpcds q36: ORDER BY with `GROUPING()` expressions loses the `derived_sort` subquery alias, placing sort references outside their valid scope.

## Root Cause

DF52's optimizer merges `Limit` into `Sort` as a fetch parameter, changing the plan shape from `Projection → Limit → Sort → ...` to `Projection → Sort(fetch) → ...`. The Sort case previously had a guard that wrapped the sort into a `derived_sort` subquery when `already_projected()` was true. This guard was removed in DF52, causing ORDER BY and LIMIT to be placed on the outer query where inner table references are invalid.

## Fix

Re-add the guard at the top of the Sort match arm in `select_to_sql_recursively`:

```rust
if select.already_projected() {
    return self.derive_with_dialect_alias("derived_sort", plan, relation, false, vec![]);
}
```
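For reviewers who want to see the effect of the guard in isolation, here is a toy model of the idea. It is a sketch, not DataFusion's unparser: `Plan`, `Select`, `build`, `to_sql`, and `example_plan` are hypothetical names invented for illustration, and only `already_projected` and the `derived_sort` alias are borrowed from the description above. The model shows only where ORDER BY and LIMIT end up for a `Projection → Sort(fetch) → Scan` plan; it does not reproduce the out-of-scope table reference itself.

```rust
// Toy model only: none of these types are DataFusion's.
enum Plan {
    Scan { table: String },
    Sort { input: Box<Plan>, order_by: String, fetch: Option<usize> },
    Projection { input: Box<Plan>, columns: Vec<String> },
}

#[derive(Default)]
struct Select {
    columns: Option<Vec<String>>, // Some(..) once a Projection has been emitted
    from: String,
    order_by: Option<String>,
    limit: Option<usize>,
}

impl Select {
    fn already_projected(&self) -> bool {
        self.columns.is_some()
    }

    fn render(&self) -> String {
        let cols = match &self.columns {
            Some(c) => c.join(", "),
            None => "*".to_string(),
        };
        let mut sql = format!("SELECT {} FROM {}", cols, self.from);
        if let Some(o) = &self.order_by {
            sql.push_str(&format!(" ORDER BY {}", o));
        }
        if let Some(n) = self.limit {
            sql.push_str(&format!(" LIMIT {}", n));
        }
        sql
    }
}

// Walks the plan top-down, filling in `select`. `guard` toggles the check.
fn build(plan: &Plan, select: &mut Select, guard: bool) {
    match plan {
        Plan::Scan { table } => select.from = table.clone(),
        Plan::Projection { input, columns } => {
            select.columns = Some(columns.clone());
            build(input, select, guard);
        }
        Plan::Sort { input, order_by, fetch } => {
            if guard && select.already_projected() {
                // Render the Sort and everything below it as its own subquery,
                // so ORDER BY and LIMIT stay next to the tables they refer to.
                let mut inner = Select::default();
                build(plan, &mut inner, guard);
                select.from = format!("({}) AS derived_sort", inner.render());
                return;
            }
            select.order_by = Some(order_by.clone());
            select.limit = *fetch;
            build(input, select, guard);
        }
    }
}

fn to_sql(plan: &Plan, guard: bool) -> String {
    let mut select = Select::default();
    build(plan, &mut select, guard);
    select.render()
}

// Projection -> Sort(fetch = 10) -> Scan, loosely modelled on clickbench q25.
fn example_plan() -> Plan {
    Plan::Projection {
        columns: vec![r#""SearchPhrase""#.to_string()],
        input: Box::new(Plan::Sort {
            order_by: r#"to_timestamp("EventTime")"#.to_string(),
            fetch: Some(10),
            input: Box::new(Plan::Scan { table: "hits".to_string() }),
        }),
    }
}
```

In this model, `to_sql(&example_plan(), false)` attaches ORDER BY and LIMIT to the outer query, while `to_sql(&example_plan(), true)` keeps them inside a `(...) AS derived_sort` subquery beneath the projection, which is the shape the restored guard is meant to produce.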
