peter-toth commented on code in PR #37334: URL: https://github.com/apache/spark/pull/37334#discussion_r934697615
##########
sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q70.sf100/explain.txt:
##########
@@ -157,121 +158,125 @@
 Input [2]: [s_state#14, sum#16]
 Keys [1]: [s_state#14]
 Functions [1]: [sum(UnscaledValue(ss_net_profit#10))]
 Aggregate Attributes [1]: [sum(UnscaledValue(ss_net_profit#10))#17]
-Results [3]: [s_state#14, s_state#14, MakeDecimal(sum(UnscaledValue(ss_net_profit#10))#17,17,2) AS _w2#18]
+Results [3]: [s_state#14 AS s_state#18, s_state#14, MakeDecimal(sum(UnscaledValue(ss_net_profit#10))#17,17,2) AS _w2#19]
 
-(25) Sort [codegen id : 5]
-Input [3]: [s_state#14, s_state#14, _w2#18]
-Arguments: [s_state#14 ASC NULLS FIRST, _w2#18 DESC NULLS LAST], false, 0
+(25) Exchange

Review Comment:
Yes, we have. I think that's because a `HashAggregate` in q70 has an output attribute and also has it aliased as a new attribute (we kept that alias with this PR to avoid the same output attribute appearing multiple times).

```
-Results [3]: [s_state#13, s_state#13, MakeDecimal(sum(UnscaledValue(ss_net_profit#10))#17,17,2) AS _w2#18]
+Results [3]: [s_state#13 AS s_state#18, s_state#13, MakeDecimal(sum(UnscaledValue(ss_net_profit#10))#17,17,2) AS _w2#19]
```

IMO that shouldn't be a bad thing, and the plan looks ok. But the root cause of the issue is in `AliasAwareOutputPartitioning`: it generates `outputPartitioning` based on the aliased output attribute only and doesn't produce a `PartitioningCollection` with both the base attribute and the aliased version. Because the parent of the aggregate requires a distribution based on the base attribute, `EnsureRequirements` inserts the extra exchange...

We also have test failures; I'm not sure yet why. Will check tomorrow...
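
To make the mechanism described above concrete, here is a toy Python model of it. It is not Spark code: the classes only borrow the names `HashPartitioning` and `PartitioningCollection`, the helpers `alias_only`, `alias_and_base` and `needs_exchange` are hypothetical, and the "satisfies" rule is a deliberate simplification (exact key match) rather than what `EnsureRequirements` really checks. The attribute names are taken from the plan above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HashPartitioning:
    """Toy stand-in: data is hash-partitioned on these attribute names."""
    keys: tuple

    def satisfies(self, required_keys):
        # Simplified rule (an assumption of this sketch): the requirement
        # is met only when the partitioning keys match it exactly.
        return self.keys == tuple(required_keys)


@dataclass(frozen=True)
class PartitioningCollection:
    """Toy stand-in: the data satisfies every partitioning listed here."""
    partitionings: tuple

    def satisfies(self, required_keys):
        return any(p.satisfies(required_keys) for p in self.partitionings)


def alias_only(child, aliases):
    """Rewrite keys to their aliases and forget the base attributes."""
    return HashPartitioning(tuple(aliases.get(k, k) for k in child.keys))


def alias_and_base(child, aliases):
    """Report both the base partitioning and its aliased version."""
    renamed = alias_only(child, aliases)
    if renamed == child:
        return child
    return PartitioningCollection((child, renamed))


def needs_exchange(partitioning, required_keys):
    """A shuffle is needed when the child does not meet the requirement."""
    return not partitioning.satisfies(required_keys)


# The aggregate outputs both s_state#14 and its alias s_state#18,
# while the parent asks for a distribution on the base attribute.
child = HashPartitioning(("s_state#14",))
aliases = {"s_state#14": "s_state#18"}
required = ["s_state#14"]

extra_exchange_alias_only = needs_exchange(alias_only(child, aliases), required)
extra_exchange_with_base = needs_exchange(alias_and_base(child, aliases), required)
```

In this model `extra_exchange_alias_only` is `True` and `extra_exchange_with_base` is `False`, which mirrors the extra `Exchange` in the q70 plan: once only the aliased attribute is reported, a requirement on the base attribute no longer matches.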
