zhuqi-lucas commented on code in PR #20362:
URL: https://github.com/apache/datafusion/pull/20362#discussion_r2902008802
##########
datafusion/sqllogictest/test_files/topk.slt:
##########
@@ -371,7 +371,7 @@ explain select number, letter, age, number as column4,
letter as column5 from pa
----
physical_plan
01)SortExec: TopK(fetch=3), expr=[number@0 DESC, letter@1 ASC NULLS LAST,
age@2 DESC], preserve_partitioning=[false], sort_prefix=[number@0 DESC,
letter@1 ASC NULLS LAST]
-02)--DataSourceExec: file_groups={1 group:
[[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/topk/partial_sorted/1.parquet]]},
projection=[number, letter, age, number@0 as column4, letter@1 as column5],
output_ordering=[number@0 DESC, letter@1 ASC NULLS LAST], file_type=parquet,
predicate=DynamicFilter [ empty ]
+02)--DataSourceExec: file_groups={1 group:
[[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/topk/partial_sorted/1.parquet]]},
projection=[number, letter, age, number@0 as column4, letter@1 as column5],
output_orderings=[[number@0 DESC, letter@1 ASC NULLS LAST], [column4@3 DESC],
[number@0 DESC, column5@4 ASC NULLS LAST]], file_type=parquet,
predicate=DynamicFilter [ empty ]
Review Comment:
Displaying the full `oeq_class()` in EXPLAIN output introduces confusing
entries. For example, given a projection with `number@0 as column4` and
original ordering `[number DESC, letter ASC]`, the output becomes:
```
output_orderings=[[number@0 DESC, letter@1 ASC NULLS LAST], [column4@3
DESC], [number@0 DESC, column5@4 ASC NULLS LAST]]
```
`[column4@3 DESC]` here is problematic — it's not truly an equivalent
ordering, but a **weaker prefix** implied by the first ordering (since `column4
= number`). The dependency chain in `construct_dependency_map` breaks when
mixing alias mappings, producing incomplete orderings. And
`remove_redundant_entries()` doesn't catch it because it doesn't consider
column equivalences.
The ideal equivalent orderings should be:
```
[number DESC, letter ASC]
[column4 DESC, letter ASC]
[number DESC, column5 ASC]
[column4 DESC, column5 ASC]
```
This suggests that showing raw `oeq_class()` to users may not be appropriate
for EXPLAIN — it can be both **redundant** (multiple orderings expressing the
same thing) and **incomplete** (broken dependency chains losing suffix sorts).
The original `output_ordering` (single canonical ordering) was cleaner for
user-facing display.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]