adriangb commented on code in PR #22239:
URL: https://github.com/apache/datafusion/pull/22239#discussion_r3293852050


##########
datafusion/sqllogictest/test_files/order.slt:
##########
@@ -1705,15 +1705,16 @@ EXPLAIN SELECT named_struct('sum', a + b) AS s FROM ordered ORDER BY s['sum'];
 ----
 physical_plan DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/data/composite_order.csv]]}, projection=[named_struct(sum, a@0 + b@1) as s], file_type=csv, has_header=true
 
-# Wrapping a non-ordered column into a struct — SortExec required
+# Wrapping a non-ordered column into a struct — SortExec required.
 # Reuses the `ordered` table above which has WITH ORDER (a + b).
+# The simplifier resolves `get_field(named_struct(...), 'a')` so the sort key
+# is not extracted into a separate scan projection column.
 query TT
 EXPLAIN SELECT named_struct('a', a, 'b', b) AS s FROM ordered ORDER BY s['a'];
 ----
 physical_plan
-01)ProjectionExec: expr=[s@0 as s]
-02)--SortExec: expr=[__datafusion_extracted_1@1 ASC NULLS LAST], preserve_partitioning=[false]
-03)----DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/data/composite_order.csv]]}, projection=[named_struct(a, a@0, b, b@1) as s, get_field(named_struct(a, a@0, b, b@1), a) as __datafusion_extracted_1], file_type=csv, has_header=true
+01)SortExec: expr=[get_field(s@0, a) ASC NULLS LAST], preserve_partitioning=[false]
+02)--DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/data/composite_order.csv]]}, projection=[named_struct(a, a@0, b, b@1) as s], file_type=csv, has_header=true
 
 # Simple column ordering tests using a table ordered by (a)
 statement ok

Review Comment:
   Good observation. Filed a tracking issue for the physical sort-key 
normalization opportunity: apache/datafusion#22487. As noted there, this is a 
tradeoff rather than a clear regression (sort keys are materialized into an 
array once before sorting, and the new shape drops the extra extracted column + 
recovery projection), so the issue calls for a benchmark before investing in 
the rewrite.
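
To make the tradeoff concrete, here is a toy Python model of the two plan shapes in the hunk above. It is only a sketch and not DataFusion code: plain lists stand in for Arrow arrays, the names `old_shape`, `new_shape`, and `rows` are hypothetical, and NULLS LAST handling is not modeled. It shows that both shapes yield the same ordering; whether one is faster is exactly what the benchmark would need to answer.

```python
# Toy model only: plain Python lists stand in for Arrow arrays.
# `rows`, `old_shape`, and `new_shape` are hypothetical names for illustration.
rows = [{"a": 3, "b": 1}, {"a": 1, "b": 2}, {"a": 2, "b": 0}]


def old_shape(rows):
    # Scan projects the struct plus an extra extracted key column.
    scanned = [({"a": r["a"], "b": r["b"]}, r["a"]) for r in rows]
    # SortExec orders by the extracted column.
    ordered = sorted(scanned, key=lambda pair: pair[1])
    # Recovery ProjectionExec drops the extracted column again.
    return [s for s, _key in ordered]


def new_shape(rows):
    # Scan projects only the struct.
    scanned = [{"a": r["a"], "b": r["b"]} for r in rows]
    # SortExec evaluates get_field(s, 'a') once into a key array...
    keys = [s["a"] for s in scanned]
    # ...then sorts row indices by that key array.
    order = sorted(range(len(scanned)), key=keys.__getitem__)
    return [scanned[i] for i in order]


# Both shapes produce the same ordered output.
assert old_shape(rows) == new_shape(rows)
```

In this model the old shape carries one more column through the sort and needs a final projection to drop it, while the new shape evaluates the key expression inside the sort.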



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]