krinart opened a new pull request, #20658:
URL: https://github.com/apache/datafusion/pull/20658

   ## Which issue does this PR close?
   
   Restore the `already_projected()` guard in the Sort case of 
`select_to_sql_recursively`. Without it, queries with ORDER BY expressions 
generate invalid SQL when the Sort node follows an outer Projection.
   
   ## Problem ##
   
   Two regressions in the DuckDB federation unparser after upgrading to 
DataFusion 52:
   
    1. clickbench q25: Queries like `SELECT "SearchPhrase" ... ORDER BY 
to_timestamp("EventTime") LIMIT 10` produce ORDER BY outside the subquery, 
referencing table names ("hits") that are out of scope.
    2. tpcds q36: ORDER BY with `GROUPING()` expressions loses the 
`derived_sort` subquery alias, placing sort references outside their valid 
scope.
   
   ## Root Cause ##
   
   DF52's optimizer merges `Limit` into `Sort` as a fetch parameter, changing 
the plan shape from `Projection → Limit → Sort → ...` to `Projection → 
Sort(fetch) → ...`. The Sort case previously had a guard that wrapped the sort 
into a `derived_sort` subquery when `already_projected()` was true. This guard 
was removed in DF52, causing ORDER BY and LIMIT to be placed on the outer query 
where inner table references are invalid.
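
The plan-shape change described above can be sketched with a toy model. This is only an illustration: the `Plan` enum and `push_limit_into_sort` function are hypothetical names invented for this example and are not DataFusion's `LogicalPlan` or its optimizer rule.

```rust
// Toy plan model. These types are illustrative assumptions and do NOT
// mirror DataFusion's real `LogicalPlan`.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Projection(Box<Plan>),
    Limit { fetch: usize, input: Box<Plan> },
    Sort { fetch: Option<usize>, input: Box<Plan> },
    Scan(&'static str),
}

// Hypothetical rewrite: fold `Limit -> Sort` into a single `Sort(fetch)`.
fn push_limit_into_sort(plan: Plan) -> Plan {
    match plan {
        Plan::Limit { fetch, input } => match push_limit_into_sort(*input) {
            // A Sort directly under the Limit absorbs the limit as `fetch`,
            // keeping the smaller value if the Sort already had one.
            Plan::Sort { fetch: inner, input } => Plan::Sort {
                fetch: Some(inner.map_or(fetch, |f| f.min(fetch))),
                input,
            },
            // Anything else keeps its Limit node.
            other => Plan::Limit { fetch, input: Box::new(other) },
        },
        Plan::Projection(input) => {
            Plan::Projection(Box::new(push_limit_into_sort(*input)))
        }
        Plan::Sort { fetch, input } => Plan::Sort {
            fetch,
            input: Box::new(push_limit_into_sort(*input)),
        },
        scan @ Plan::Scan(_) => scan,
    }
}
```

With this sketch, `Projection → Limit(10) → Sort → Scan` becomes `Projection → Sort(fetch = 10) → Scan`, so an unparser that only handled the limit in a `Limit` arm now meets it inside the `Sort` arm instead.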
   
   ## Fix ##
   
   Re-add the guard at the top of the `Sort` match arm in 
`select_to_sql_recursively`:
   
   ```rust
    if select.already_projected() {
        return self.derive_with_dialect_alias("derived_sort", plan, relation, false, vec![]);
    }
   ```
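
To show where the clauses land with and without the guard, here is a minimal, self-contained sketch. It is a toy model, not the DataFusion unparser: only the names `already_projected` and `derived_sort` come from this PR, while `Plan`, `Select`, `unparse`, and `render` are hypothetical stand-ins. It shows only clause placement and does not reproduce the out-of-scope table references from the real regressions.

```rust
// Toy plan and SELECT builder. Everything except the names
// `already_projected` and `derived_sort` is a hypothetical stand-in.
enum Plan {
    Projection { exprs: Vec<String>, input: Box<Plan> },
    Sort { order_by: String, fetch: Option<usize>, input: Box<Plan> },
    Scan(String),
}

#[derive(Default)]
struct Select {
    projection: Option<Vec<String>>,
    from: String,
    order_by: Option<String>,
    limit: Option<usize>,
}

impl Select {
    // True once an outer Projection has already filled the SELECT list.
    fn already_projected(&self) -> bool {
        self.projection.is_some()
    }

    fn to_sql(&self) -> String {
        let cols = self
            .projection
            .as_ref()
            .map_or("*".to_string(), |p| p.join(", "));
        let mut sql = format!("SELECT {} FROM {}", cols, self.from);
        if let Some(o) = &self.order_by {
            sql.push_str(&format!(" ORDER BY {}", o));
        }
        if let Some(n) = self.limit {
            sql.push_str(&format!(" LIMIT {}", n));
        }
        sql
    }
}

fn unparse(plan: &Plan, select: &mut Select, guard: bool) {
    match plan {
        Plan::Projection { exprs, input } => {
            select.projection = Some(exprs.clone());
            unparse(input, select, guard);
        }
        Plan::Sort { order_by, fetch, input } => {
            if guard && select.already_projected() {
                // Guarded path: unparse the Sort subtree into its own
                // SELECT and use it as an aliased subquery.
                let mut inner = Select::default();
                unparse(plan, &mut inner, guard);
                select.from = format!("({}) AS derived_sort", inner.to_sql());
                return;
            }
            // Unguarded path: ORDER BY and LIMIT attach to the current query.
            select.order_by = Some(order_by.clone());
            select.limit = *fetch;
            unparse(input, select, guard);
        }
        Plan::Scan(table) => select.from = table.clone(),
    }
}

// Convenience wrapper: unparse a plan and return the SQL text.
fn render(plan: &Plan, guard: bool) -> String {
    let mut select = Select::default();
    unparse(plan, &mut select, guard);
    select.to_sql()
}
```

For a `Projection → Sort(fetch) → Scan` plan, `render(&plan, true)` keeps ORDER BY and LIMIT inside the `derived_sort` subquery, while `render(&plan, false)` places them on the outer query, which is the placement the regressions above trace back to.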



