maryannxue commented on a change in pull request #32816:
URL: https://github.com/apache/spark/pull/32816#discussion_r706244614
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/simpleCosting.scala
##########
@@ -35,15 +36,48 @@ case class SimpleCost(value: Long) extends Cost {
}
/**
- * A simple implementation of [[CostEvaluator]], which counts the number of
- * [[ShuffleExchangeLike]] nodes in the plan.
+ * A skew join aware implementation of [[Cost]], which consider shuffle number and skew join number
Review comment:
   Can we add more description of how the cost is calculated in the
presence of skew joins? What happens if adding a skew join optimization
introduces two or more extra shuffles?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]