alamb opened a new issue, #12446: URL: https://github.com/apache/datafusion/issues/12446
### Describe the bug

A regression causes queries to fail with a `SanityCheckPlan` error in one very specific circumstance: `UNION` queries over sorted data with constant predicates. These queries should complete successfully.

### To Reproduce

@wiedld found a reproducer as part of https://github.com/apache/datafusion/pull/12414

https://github.com/apache/datafusion/commit/c2e652e48c82aedb20c289cbf5e7a7e279aa436e

```sql
# Test: inputs into union with different orderings
query TT
explain select * from (select b, c, a, NULL::int as a0 from ordered_table order by a, c) t1
union all
select * from (select b, c, NULL::int as a, a0 from ordered_table order by a0, c) t2
order by d, c, a, a0, b
limit 2;
----
logical_plan
01)Projection: t1.b, t1.c, t1.a, t1.a0
02)--Sort: t1.d ASC NULLS LAST, t1.c ASC NULLS LAST, t1.a ASC NULLS LAST, t1.a0 ASC NULLS LAST, t1.b ASC NULLS LAST, fetch=2
03)----Union
04)------SubqueryAlias: t1
05)--------Projection: ordered_table.b, ordered_table.c, ordered_table.a, Int32(NULL) AS a0, ordered_table.d
06)----------TableScan: ordered_table projection=[a, b, c, d]
07)------SubqueryAlias: t2
08)--------Projection: ordered_table.b, ordered_table.c, Int32(NULL) AS a, ordered_table.a0, ordered_table.d
09)----------TableScan: ordered_table projection=[a0, b, c, d]

# Test: run the query from above
# TODO: query fails since the constant columns t1.a0 and t2.a are not in the ORDER BY subquery,
# and SanityCheckPlan does not allow this.
statement error DataFusion error: SanityCheckPlan
select * from (select b, c, a, NULL::int as a0 from ordered_table order by a, c) t1
union all
select * from (select b, c, NULL::int as a, a0 from ordered_table order by a0, c) t2
order by d, c, a, a0, b
limit 2;

statement ok
drop table ordered_table;
```

### Expected behavior

The query should run to completion instead of failing with a `SanityCheckPlan` error.

### Additional context

We believe this was introduced by https://github.com/apache/datafusion/pull/11196

That change was released in 40.0.0: https://github.com/apache/datafusion/blob/main/dev/changelog/40.0.0.md
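For readers unfamiliar with why the constant columns matter, here is a minimal sketch in plain Python (not DataFusion code, and not a description of how `SanityCheckPlan` is implemented). It illustrates the general property the TODO comment relies on: a column that holds the same value in every row cannot break an existing sort order, wherever it appears in the sort key. The helper `is_sorted_by` and the sample rows are hypothetical, and a placeholder `0` stands in for `NULL` so the values stay comparable.

```python
# Illustrative only: plain Python, not DataFusion code.

def is_sorted_by(rows, keys):
    """Return True if rows are in ascending lexicographic order on keys."""
    projected = [tuple(row[k] for k in keys) for row in rows]
    return all(x <= y for x, y in zip(projected, projected[1:]))

# Hypothetical rows shaped like the t1 branch: a0 is constant in every row.
# 0 is a stand-in for NULL so that tuples remain comparable in Python.
rows = [
    {"a": 0, "c": 1, "a0": 0},
    {"a": 0, "c": 2, "a0": 0},
    {"a": 1, "c": 0, "a0": 0},
]

# The rows are sorted on (a, c) ...
sorted_on_a_c = is_sorted_by(rows, ["a", "c"])
# ... and inserting the constant column anywhere in the key keeps them sorted.
sorted_with_constant_first = is_sorted_by(rows, ["a0", "a", "c"])
sorted_with_constant_between = is_sorted_by(rows, ["a", "a0", "c"])
```

All three checks hold for this sample, which is why a plan whose input is ordered on `a, c` with a constant `a0` can reasonably be treated as also ordered on keys that include `a0`.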
