comphead commented on issue #15886: URL: https://github.com/apache/datafusion/issues/15886#issuecomment-2839773088
Agree this is confusing, DuckDb treats queries below as equal, I believe it just unnests outer query preserving whatever is in the inner query whereas DataFusion optimizes in different way. ``` D explain select x.* from (select 1 a union all select null order by a nulls last) x; ┌─────────────────────────────┐ │┌───────────────────────────┐│ ││ Physical Plan ││ │└───────────────────────────┘│ └─────────────────────────────┘ ┌───────────────────────────┐ │ ORDER_BY │ │ ──────────────────── │ │ a ASC │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ UNION ├──────────────┐ └─────────────┬─────────────┘ │ ┌─────────────┴─────────────┐┌─────────────┴─────────────┐ │ PROJECTION ││ PROJECTION │ │ ──────────────────── ││ ──────────────────── │ │ a ││ NULL │ │ ││ │ │ ~1 Rows ││ ~1 Rows │ └─────────────┬─────────────┘└─────────────┬─────────────┘ ┌─────────────┴─────────────┐┌─────────────┴─────────────┐ │ DUMMY_SCAN ││ DUMMY_SCAN │ └───────────────────────────┘└───────────────────────────┘ D explain select x.* from (select 1 a union all select null) x order by a nulls last; ┌─────────────────────────────┐ │┌───────────────────────────┐│ ││ Physical Plan ││ │└───────────────────────────┘│ └─────────────────────────────┘ ┌───────────────────────────┐ │ ORDER_BY │ │ ──────────────────── │ │ x.a ASC │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ UNION ├──────────────┐ └─────────────┬─────────────┘ │ ┌─────────────┴─────────────┐┌─────────────┴─────────────┐ │ PROJECTION ││ PROJECTION │ │ ──────────────────── ││ ──────────────────── │ │ a ││ NULL │ │ ││ │ │ ~1 Rows ││ ~1 Rows │ └─────────────┬─────────────┘└─────────────┬─────────────┘ ┌─────────────┴─────────────┐┌─────────────┴─────────────┐ │ DUMMY_SCAN ││ DUMMY_SCAN │ └───────────────────────────┘└───────────────────────────┘ ``` As discussed in Slack https://the-asf.slack.com/archives/C01QUFS30TD/p1745875862723149 it makes sense for the sort by to be applied to the subquery it belongs to. But it in the case above the optimizer should preserve sorting like duckDB does IMHO -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org