peter-toth commented on code in PR #37334:
URL: https://github.com/apache/spark/pull/37334#discussion_r935756011
##########
sql/core/src/test/scala/org/apache/spark/sql/execution/CoalesceShufflePartitionsSuite.scala:
##########
@@ -339,12 +339,12 @@ class CoalesceShufflePartitionsSuite extends SparkFunSuite {
// ShuffleQueryStage 0
// ShuffleQueryStage 2
// ReusedQueryStage 0
- val grouped = df.groupBy("key").agg(max("value").as("value"))
+    val grouped = df.groupBy((col("key") + 1).as("key")).agg(max("value").as("value"))
Review Comment:
I had to modify the test because the `AliasAwareOutputPartitioning` fix changed the explain plan of the original query from:
```
Union
:- *(5) HashAggregate(keys=[_groupingexpression#79L], functions=[max(value#38L)], output=[(key + 1)#44L, max(value)#45L])
:  +- AQEShuffleRead coalesced
:     +- ShuffleQueryStage 3
:        +- Exchange hashpartitioning(_groupingexpression#79L, 5), ENSURE_REQUIREMENTS, [plan_id=693]
:           +- *(3) HashAggregate(keys=[_groupingexpression#79L], functions=[partial_max(value#38L)], output=[_groupingexpression#79L, max#62L])
:              +- *(3) HashAggregate(keys=[key#12L], functions=[max(value#13L)], output=[value#38L, _groupingexpression#79L])
:                 +- AQEShuffleRead coalesced
:                    +- ShuffleQueryStage 0
:                       +- Exchange hashpartitioning(key#12L, 5), ENSURE_REQUIREMENTS, [plan_id=623]
:                          +- *(1) HashAggregate(keys=[key#12L], functions=[partial_max(value#13L)], output=[key#12L, max#64L])
:                             +- *(1) Project [id#10L AS key#12L, id#10L AS value#13L]
:                                +- *(1) Range (0, 6, step=1, splits=10)
+- *(6) HashAggregate(keys=[_groupingexpression#80L], functions=[max(value#38L)], output=[(key + 2)#51L, max(value)#52L])
   +- AQEShuffleRead coalesced
      +- ShuffleQueryStage 4
         +- Exchange hashpartitioning(_groupingexpression#80L, 5), ENSURE_REQUIREMENTS, [plan_id=719]
            +- *(4) HashAggregate(keys=[_groupingexpression#80L], functions=[partial_max(value#38L)], output=[_groupingexpression#80L, max#66L])
               +- *(4) HashAggregate(keys=[key#12L], functions=[max(value#13L)], output=[value#38L, _groupingexpression#80L])
                  +- AQEShuffleRead coalesced
                     +- ShuffleQueryStage 2
                        +- ReusedExchange [key#12L, max#64L], Exchange hashpartitioning(key#12L, 5), ENSURE_REQUIREMENTS, [plan_id=623]
```
to the following (one exchange fewer):
```
Union
:- *(3) HashAggregate(keys=[_groupingexpression#75L], functions=[max(value#38L)], output=[(key + 1)#44L, max(value)#45L])
:  +- AQEShuffleRead coalesced
:     +- ShuffleQueryStage 0
:        +- Exchange hashpartitioning(_groupingexpression#75L, 5), ENSURE_REQUIREMENTS, [plan_id=514]
:           +- *(1) HashAggregate(keys=[_groupingexpression#75L], functions=[partial_max(value#38L)], output=[_groupingexpression#75L, max#62L])
:              +- *(1) HashAggregate(keys=[key#12L], functions=[max(value#13L)], output=[value#38L, _groupingexpression#75L])
:                 +- *(1) HashAggregate(keys=[key#12L], functions=[partial_max(value#13L)], output=[key#12L, max#64L])
:                    +- *(1) Project [id#10L AS key#12L, id#10L AS value#13L]
:                       +- *(1) Range (0, 6, step=1, splits=10)
+- *(4) HashAggregate(keys=[_groupingexpression#76L], functions=[max(value#38L)], output=[(key + 2)#51L, max(value)#52L])
   +- AQEShuffleRead coalesced
      +- ShuffleQueryStage 1
         +- Exchange hashpartitioning(_groupingexpression#76L, 5), ENSURE_REQUIREMENTS, [plan_id=532]
            +- *(2) HashAggregate(keys=[_groupingexpression#76L], functions=[partial_max(value#38L)], output=[_groupingexpression#76L, max#66L])
               +- *(2) HashAggregate(keys=[key#12L], functions=[max(value#13L)], output=[value#38L, _groupingexpression#76L])
                  +- *(2) HashAggregate(keys=[key#12L], functions=[partial_max(value#13L)], output=[key#12L, max#64L])
                     +- *(2) Project [id#55L AS key#12L, id#55L AS value#13L]
                        +- *(2) Range (0, 6, step=1, splits=10)
```
and so the original query no longer matched the `test case 2` description.
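
For context, here is a minimal, self-contained sketch of the pattern the updated test exercises. This is not the suite's actual code; the object and variable names are illustrative, and it assumes a local `SparkSession`. The point is that grouping by an aliased expression lets the aggregate report its output partitioning under the alias, which is what removes the extra `Exchange` after the `AliasAwareOutputPartitioning` fix:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, max}

object AliasPartitioningSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("alias-partitioning-sketch")
      .getOrCreate()

    // Mirrors the test's input shape: key and value both derived from a range.
    val df = spark.range(0, 6, 1, 10)
      .select(col("id").as("key"), col("id").as("value"))

    // Grouping by an aliased expression: with alias-aware output partitioning,
    // the aggregate's hashpartitioning on the grouping expression is also
    // recognized under the alias "key", so a downstream operator that requires
    // partitioning by "key" does not need another shuffle.
    val grouped = df.groupBy((col("key") + 1).as("key"))
      .agg(max("value").as("value"))

    grouped.explain()
    spark.stop()
  }
}
```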