cloud-fan commented on code in PR #37612:
URL: https://github.com/apache/spark/pull/37612#discussion_r954560331
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEUtils.scala:
##########
@@ -28,16 +28,31 @@ object AQEUtils {
def getRequiredDistribution(p: SparkPlan): Option[Distribution] = p match {
// User-specified repartition is only effective when it's the root node, or under
// Project/Filter/LocalSort/CollectMetrics.
- // Note: we only care about `HashPartitioning` as `EnsureRequirements` can only optimize out
- // user-specified repartition with `HashPartitioning`.
- case ShuffleExchangeExec(h: HashPartitioning, _, shuffleOrigin)
+ // Note, here are two cases of how user-specified repartition can be optimized out:
+ // 1. `EnsureRequirements` can only optimize out user-specified repartition with
+ // `HashPartitioning`.
+ // 2. `AQEOptimizer` can optimize out user-specified repartition with all `Partitioning`,
+ // e.g. convert empty to local relation.
Review Comment:
OK, let me make my proposal clear: let's not optimize out a user-specified
repartition in any case if it's the root node or below Project/Filter. What do you think?
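
A minimal sketch of the condition this proposal describes, for illustration only. It uses toy plan types rather than Spark's real `SparkPlan` classes, and every name in it (`Plan`, `UserRepartition`, `mustKeepRepartition`, and so on) is hypothetical, not part of this PR:

```scala
// Toy stand-ins for plan nodes; these are NOT Spark's real classes.
sealed trait Plan
case class Leaf(name: String) extends Plan
case class Project(child: Plan) extends Plan
case class Filter(child: Plan) extends Plan
case class Join(left: Plan, right: Plan) extends Plan
// Hypothetical marker for a repartition requested explicitly by the user.
case class UserRepartition(child: Plan) extends Plan

object RepartitionRule {
  // Returns true when a user-specified repartition is the root node, or is
  // reachable from the root through Project/Filter nodes only. Under the
  // proposal, such a repartition would never be optimized out.
  @scala.annotation.tailrec
  def mustKeepRepartition(root: Plan): Boolean = root match {
    case UserRepartition(_) => true
    case Project(child)     => mustKeepRepartition(child)
    case Filter(child)      => mustKeepRepartition(child)
    case _                  => false
  }
}
```

In this sketch, a repartition hidden under a `Join` is not protected, since the walk from the root stops at the first node that is neither `Project` nor `Filter`.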
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]