berkaysynnada commented on code in PR #7364:
URL: https://github.com/apache/arrow-datafusion/pull/7364#discussion_r1303136856

##########
datafusion/sqllogictest/test_files/order.slt:
##########
@@ -410,3 +410,38 @@ SELECT DISTINCT time as "first_seen" FROM t ORDER BY 1;
 ## Cleanup
 statement ok
 drop table t;
+
+# Create a table having 3 columns which are ordering equivalent by the source. In the next step,
+# we will expect to observe the removed sort exec by propagating the orders across projection.
+statement ok
+CREATE EXTERNAL TABLE multiple_ordered_table (
+  a0 INTEGER,
+  a INTEGER,
+  b INTEGER,
+  c INTEGER,
+  d INTEGER
+)
+STORED AS CSV
+WITH HEADER ROW
+WITH ORDER (a ASC)
+WITH ORDER (b ASC)
+WITH ORDER (c ASC)
+LOCATION '../core/tests/data/window_2.csv';
+
+query TT
+EXPLAIN SELECT (b+a+c) AS result
+FROM multiple_ordered_table
+ORDER BY result;
+----
+logical_plan
+Sort: result ASC NULLS LAST
+--Projection: multiple_ordered_table.b + multiple_ordered_table.a + multiple_ordered_table.c AS result
+----TableScan: multiple_ordered_table projection=[a, b, c]
+physical_plan
+SortPreservingMergeExec: [result@0 ASC NULLS LAST]

Review Comment:
   In this case RepartitionExec does not break the ordering, because it repartitions a single input partition into 4 output partitions. Each of those output partitions is still ordered, so the presence of SortPreservingMergeExec is correct. For the general case, a SortPreservingRepartitionExec implementation is on the way; it will be able to preserve order for all kinds of partitioning.
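
To illustrate the reasoning in the comment above, here is a minimal sketch in plain Python. It is a toy model, not DataFusion code: it works on individual integers rather than record batches, and the function names (round_robin_repartition, interleaved_repartition, sort_preserving_merge) as well as the alternating arrival order are assumptions made up for this example. It shows that splitting one sorted input into 4 outputs leaves every output sorted, so a k-way merge restores the global order, while feeding several inputs into the same outputs can break the order of an output.

```python
import heapq


def is_sorted(rows):
    """Return True if rows are in non-decreasing order."""
    return all(a <= b for a, b in zip(rows, rows[1:]))


def round_robin_repartition(rows, num_partitions):
    """Toy 1-to-N repartition: deal rows from ONE input out in arrival order.

    Each output receives a subsequence of the input, so a sorted input
    yields sorted outputs.
    """
    outputs = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        outputs[i % num_partitions].append(row)
    return outputs


def interleaved_repartition(inputs, num_partitions):
    """Toy M-to-N repartition with an ASSUMED alternating arrival order.

    Every input keeps its own round-robin counter, and rows from different
    inputs land in the same output in the order they happen to arrive.
    """
    outputs = [[] for _ in range(num_partitions)]
    counters = [0] * len(inputs)
    longest = max(len(rows) for rows in inputs)
    for step in range(longest):
        for idx, rows in enumerate(inputs):
            if step < len(rows):
                outputs[counters[idx] % num_partitions].append(rows[step])
                counters[idx] += 1
    return outputs


def sort_preserving_merge(partitions):
    """K-way merge of partitions that are each already sorted."""
    return list(heapq.merge(*partitions))


# Case from the review comment: 1 sorted input partition -> 4 outputs.
source = [1, 2, 2, 5, 7, 8, 11, 13, 13, 20]
parts = round_robin_repartition(source, 4)
print("1 -> 4 outputs:", parts)
print("each output sorted:", all(is_sorted(p) for p in parts))
print("merge restores order:", sort_preserving_merge(parts) == source)

# General case: 2 sorted inputs -> 2 outputs, rows arriving interleaved.
mixed = interleaved_repartition([[10, 20, 30, 40], [1, 2, 3, 4]], 2)
print("2 -> 2 outputs:", mixed)
print("each output sorted:", all(is_sorted(p) for p in mixed))
```

The second case is the situation the comment defers to the upcoming SortPreservingRepartitionExec: once several ordered inputs feed the same output, arrival order alone no longer guarantees that the output stays ordered.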
