berkaysynnada commented on issue #15886: URL: https://github.com/apache/datafusion/issues/15886#issuecomment-2839131363
> I have to say it was very much unexpected. As a sanity check, I compared to Postgres which does not remove the sorting operation. The Postgres docs say that CTEs "effectively serve as temporary tables that can be referenced from the FROM list" (https://www.postgresql.org/docs/current/sql-select.html), which I would read as to imply that they are not views. There is no documentation under the `ORDER BY` clause that states its applicability to CTEs (or views).
>
> I think optimizations that change the semantics of the query, legal transformation or not by the SQL standard, should be explicitly opt-in (and would still classify this issue as a bug)

Approaching this case practically: when `ORDER BY` clauses are given in subqueries, they are converted into SortExecs somewhere in the plan. However, in enforce_sorting we don't track ordering requirements through SortExecs directly (otherwise, we wouldn't be able to eliminate unnecessary SortExecs). Instead, we track the requirement by inserting an OutputRequirementExec at the top of the plan, which corresponds to the global ordering, that is, the ordering expected when an explicit `ORDER BY` is given in the outermost query.

If we decide to introduce a new config for this setting, we first need to improve the optimizer phase. Specifically, OutputRequirementExecs could be inserted at intermediate nodes in the plan, meaning the subplan beneath must guarantee the required ordering at that point.

So, TL;DR: unless we adapt the current enforce_sorting rule accordingly, we risk losing the ability to eliminate unnecessary SortExecs in some cases.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
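To make the distinction between a global requirement and an intermediate one concrete, here is a rough toy model in Python. It is only a sketch under stated assumptions: `Node`, `enforce`, and `kinds` are hypothetical names invented for illustration, the plan is a simple single-child chain, and none of this is DataFusion's actual enforce_sorting code. It only shows why a sort inside a subquery disappears when the sole requirement sits at the top of the plan, and survives when a requirement node is placed directly above it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Ordering = Tuple[str, ...]


@dataclass
class Node:
    # Hypothetical plan node: "scan", "filter", "sort", or "output_requirement".
    kind: str
    # Sort keys for "sort"; required keys for "output_requirement".
    ordering: Ordering = ()
    child: Optional["Node"] = None


def enforce(node: Optional[Node], required: Ordering = ()) -> Optional[Node]:
    """Toy top-down pass: keep a sort only if a requirement above it needs it.

    Simplification: this sketch never inserts a missing sort; it only
    decides whether existing sorts stay or go.
    """
    if node is None:
        return None
    if node.kind == "output_requirement":
        # A requirement node pins the ordering the subplan below must deliver.
        return Node(node.kind, node.ordering, enforce(node.child, node.ordering))
    if node.kind == "sort":
        if required and node.ordering[: len(required)] == required:
            # The sort satisfies the requirement, so nothing is required below it.
            return Node(node.kind, node.ordering, enforce(node.child, ()))
        # No requirement reaches this sort: treat it as removable.
        return enforce(node.child, required)
    if node.kind == "scan":
        return node
    # Order-preserving operators (here: filter) pass the requirement through.
    return Node(node.kind, node.ordering, enforce(node.child, required))


def kinds(node: Optional[Node]) -> list:
    """Flatten the single-child chain into a list of node kinds."""
    out = []
    while node is not None:
        out.append(node.kind)
        node = node.child
    return out


# Subquery has ORDER BY a, outer query has no ORDER BY:
# only a global (empty) requirement exists at the top of the plan.
global_only = Node(
    "output_requirement", (),
    Node("filter", (), Node("sort", ("a",), Node("scan"))),
)

# Same plan, but with a requirement node inserted directly above the
# subquery's sort, as the proposed optimizer change would do.
with_intermediate = Node(
    "output_requirement", (),
    Node("filter", (),
         Node("output_requirement", ("a",), Node("sort", ("a",), Node("scan")))),
)

print(kinds(enforce(global_only)))
print(kinds(enforce(with_intermediate)))
```

In the first plan the sort is dropped because no requirement reaches it; in the second the intermediate requirement keeps it, while sorts elsewhere in the plan would still be eligible for removal.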
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org