adriangb commented on code in PR #22239:
URL: https://github.com/apache/datafusion/pull/22239#discussion_r3293852050
##########
datafusion/sqllogictest/test_files/order.slt:
##########
@@ -1705,15 +1705,16 @@ EXPLAIN SELECT named_struct('sum', a + b) AS s FROM ordered ORDER BY s['sum'];
----
physical_plan DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/data/composite_order.csv]]}, projection=[named_struct(sum, a@0 + b@1) as s], file_type=csv, has_header=true
-# Wrapping a non-ordered column into a struct — SortExec required
+# Wrapping a non-ordered column into a struct — SortExec required.
# Reuses the `ordered` table above which has WITH ORDER (a + b).
+# The simplifier resolves `get_field(named_struct(...), 'a')` so the sort key
+# is not extracted into a separate scan projection column.
query TT
EXPLAIN SELECT named_struct('a', a, 'b', b) AS s FROM ordered ORDER BY s['a'];
----
physical_plan
-01)ProjectionExec: expr=[s@0 as s]
-02)--SortExec: expr=[__datafusion_extracted_1@1 ASC NULLS LAST], preserve_partitioning=[false]
-03)----DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/data/composite_order.csv]]}, projection=[named_struct(a, a@0, b, b@1) as s, get_field(named_struct(a, a@0, b, b@1), a) as __datafusion_extracted_1], file_type=csv, has_header=true
+01)SortExec: expr=[get_field(s@0, a) ASC NULLS LAST], preserve_partitioning=[false]
+02)--DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/data/composite_order.csv]]}, projection=[named_struct(a, a@0, b, b@1) as s], file_type=csv, has_header=true
# Simple column ordering tests using a table ordered by (a)
statement ok
Review Comment:
Good observation. I filed a tracking issue for the physical sort-key
normalization opportunity: apache/datafusion#22487. As noted there, this is a
tradeoff rather than a clear regression: sort keys are materialized into an
array once before sorting, and the new plan shape drops the extra extracted
column and the recovery projection. The issue therefore calls for a benchmark
before investing in the rewrite.
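
To make the tradeoff concrete, here is a rough Python sketch of the two plan
shapes in the diff above. It is only an analogy, not DataFusion code: the
names `make_struct`, `old_shape`, and `new_shape` are hypothetical, and dicts
stand in for struct values. It assumes, as described above, that the sort
evaluates its key expression once per row into an array before sorting.

```python
# Illustrative stand-ins for the two plan shapes; not DataFusion's implementation.

rows = [{"a": 3, "b": 1}, {"a": 1, "b": 2}, {"a": 2, "b": 0}]


def make_struct(row):
    # Stand-in for named_struct('a', a, 'b', b).
    return {"a": row["a"], "b": row["b"]}


def old_shape(rows):
    # Old plan: the scan projects `s` plus an extracted key column,
    # the sort reads that column, and a recovery projection drops it.
    scanned = [(make_struct(r), make_struct(r)["a"]) for r in rows]
    sorted_rows = sorted(scanned, key=lambda pair: pair[1])
    return [s for s, _extracted in sorted_rows]


def new_shape(rows):
    # New plan: the scan projects only `s`; the sort evaluates
    # get_field(s, 'a') once per row into a key array, then sorts by it.
    scanned = [make_struct(r) for r in rows]
    keys = [s["a"] for s in scanned]
    order = sorted(range(len(scanned)), key=keys.__getitem__)
    return [scanned[i] for i in order]
```

Both functions return the same ordering; they differ only in where the key is
computed and how many columns flow between operators, which is why a benchmark
is the sensible next step.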
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]