jonahgao commented on PR #10234: URL: https://github.com/apache/datafusion/pull/10234#issuecomment-2087754040
> The following query works without this PR (and shows that ORDER BY can reference columns from the FROM clause, not just what is in the SELECT list)

@alamb `select x from foo order by y` can be covered by [add_missing_columns](https://github.com/apache/datafusion/blob/f8c623fe045d70a87eac8dc8620b74ff73be56d5/datafusion/expr/src/logical_plan/builder.rs#L437), which blindly adds the missing columns to the descendant projection node.

Another issue is that we should not run `add_missing_columns` for any `SetExpr` other than SELECT.

```sh
DataFusion CLI v37.1.0
> create table t(a int, b int);
> select a from t union select 1 order by b;
Error during planning: For SELECT DISTINCT, ORDER BY expressions b must appear in select list
> select a from t union all select 1 order by b;
Schema error: No field named t.a. Valid fields are a, t.b.
```

Doing it for UNION makes the error messages hard to understand. [PostgreSQL](https://www.postgresql.org/docs/current/queries-order.html) refuses to do this: ORDER BY can only reference output columns.

> ORDER BY can be applied to the result of a UNION, INTERSECT, or EXCEPT combination, but in this case it is only permitted to sort by output column names or numbers.
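
To make the proposed rule concrete, here is a minimal sketch of the PostgreSQL-style check. Everything in it is hypothetical and for illustration only: `QueryBody` and `check_order_by` are simplified stand-ins, not DataFusion's or sqlparser's actual types or API, and the error strings are invented.

```rust
/// Hypothetical, simplified stand-in for a query body.
/// This is NOT sqlparser's `SetExpr`; it only models the distinction discussed here.
enum QueryBody {
    /// A plain SELECT: its output columns plus the columns available from FROM.
    Select {
        output: Vec<String>,
        from_columns: Vec<String>,
    },
    /// UNION / INTERSECT / EXCEPT: only the output columns are visible.
    SetOperation { output: Vec<String> },
}

/// Hypothetical check: decide whether an ORDER BY column may be resolved.
/// For SELECT, a column from FROM is acceptable (a missing column could be
/// added to the projection). For set operations, only output columns are allowed.
fn check_order_by(body: &QueryBody, order_by: &str) -> Result<(), String> {
    match body {
        QueryBody::Select {
            output,
            from_columns,
        } => {
            let known = output.iter().chain(from_columns.iter());
            if known.into_iter().any(|c| c == order_by) {
                Ok(())
            } else {
                Err(format!("No field named {order_by}"))
            }
        }
        QueryBody::SetOperation { output } => {
            if output.iter().any(|c| c == order_by) {
                Ok(())
            } else {
                // Fail early with a message that names the actual restriction.
                Err(format!(
                    "ORDER BY column {order_by} must be an output column of the set operation"
                ))
            }
        }
    }
}
```

Under this sketch, `select x from foo order by y` is accepted because `y` is available from FROM, while `select a from t union all select 1 order by b` is rejected up front because `b` is not an output column of the UNION.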