manuzhang commented on pull request #28669: URL: https://github.com/apache/spark/pull/28669#issuecomment-641793942
@JkSelf That will work, but the value doesn't make sense to me. In addition, I don't want to tune the configuration for every specific case; I'd rather come up with a suite of default configurations that work out of the box for most cases.

> I think this PR is meaningful when the rule of OptimizeSkewedJoin is behind CoalesceShufflePartitions in the queryStageOptimizerRules of AdaptiveSparkPlanExec

I don't see much "meaning" here, since we split a skewed partition towards the average size of the coalesced non-skewed partitions or the advisory target size, whichever is larger. We'll end up with evenly distributed partitions with or without this PR. However, with this change we lose the chance to optimize skewed partitions in some cases.
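To make the "whichever is larger" point concrete, here is a minimal sketch of the split-target calculation described above. It is not Spark's code: the function names, the plain-integer sizes, and the single `skew_threshold` cutoff are assumptions for illustration, and the real rule splits on map output boundaries rather than into equal chunks.

```python
import math
from typing import List


def split_target_size(partition_sizes: List[int],
                      skew_threshold: int,
                      advisory_size: int) -> int:
    """Hypothetical helper: target size used when splitting a skewed partition.

    Takes the larger of the advisory target size and the average size of the
    partitions that are not considered skewed (size <= skew_threshold).
    """
    non_skewed = [s for s in partition_sizes if s <= skew_threshold]
    if not non_skewed:
        # Assumption: with no non-skewed partitions, fall back to the advisory size.
        return advisory_size
    average = sum(non_skewed) // len(non_skewed)
    return max(advisory_size, average)


def estimated_splits(skewed_size: int, target: int) -> int:
    """Rough number of chunks if the skewed partition were cut evenly."""
    return math.ceil(skewed_size / target)


# Sizes in arbitrary units; the last partition is the skewed one.
sizes = [10, 12, 8, 200]
target = split_target_size(sizes, skew_threshold=100, advisory_size=64)
chunks = estimated_splits(200, target)
```

With these numbers the advisory size (64) exceeds the non-skewed average (10), so the advisory size is the target. If the non-skewed partitions had already been coalesced to something larger than 64, their average would be the target instead.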
