cloud-fan commented on pull request #29593:
URL: https://github.com/apache/spark/pull/29593#issuecomment-685265949
The steps that lead to this bug are:
1. get logical link of the query stage (Aggregate)
2. get the corresponding physical plan sub-tree (should match final agg, but
we match shuffle stage due to the bug)
3. replace Aggregate in the logical plan with a logical query stage ("partial
agg -> shuffle stage"), which gives "logical query stage -> repartition"
4. reoptimize and re-plan; the new physical plan has no final agg
So the key is step 2. We are fine as long as we don't override the logical
plan tag in the final aggregate with `Repartition`.
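
To illustrate step 2, here is a toy model in Python, not Spark code. It assumes the lookup returns the top-most physical node whose logical-link tag equals the query stage's logical link. `PhysicalNode`, `find_first` and `build_plan` are hypothetical names made up for this sketch, and the node names are only labels.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PhysicalNode:
    # Hypothetical stand-in for a physical plan node carrying a logical-link tag.
    name: str
    logical_link: Optional[str]
    children: List["PhysicalNode"] = field(default_factory=list)


def find_first(node: PhysicalNode, logical_link: str) -> Optional[PhysicalNode]:
    """Pre-order search: return the top-most node tagged with logical_link."""
    if node.logical_link == logical_link:
        return node
    for child in node.children:
        hit = find_first(child, logical_link)
        if hit is not None:
            return hit
    return None


def build_plan(final_agg_link: str) -> PhysicalNode:
    # FinalAgg -> ShuffleStage -> PartialAgg; only the final agg's tag varies.
    partial = PhysicalNode("PartialAgg", "Aggregate")
    shuffle = PhysicalNode("ShuffleStage", "Aggregate", [partial])
    return PhysicalNode("FinalAgg", final_agg_link, [shuffle])


# Tag intact: the lookup for "Aggregate" lands on the final aggregate.
ok = find_first(build_plan("Aggregate"), "Aggregate")

# Tag overridden with "Repartition": the lookup skips the final aggregate
# and lands on the shuffle stage, so the final agg is lost after re-planning.
bad = find_first(build_plan("Repartition"), "Aggregate")

print(ok.name, bad.name)
```

In this model the only difference between the two runs is the tag on the final aggregate, which is why keeping that tag intact is enough to avoid the wrong match.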