warrenzhu25 commented on PR #41076: URL: https://github.com/apache/spark/pull/41076#issuecomment-1609805350
> > No, I'm not saying the default value. The default value should be 1.
>
> > maxRatio has default value 1, so it's same as current behavior.
> > >
> > > When the new configuration `maxRatio` itself is controversial in terms of the benefits, it's just a dark-magic to confuse the users to put into the mud holes. Why do we need to add this dark-magic configuration, @warrenzhu25 ?
> > > Users has full control of this ratio based on their judgement of storage migration size and impact.

I agree this config is not a perfect solution to the problem of migration competing with shuffle fetch, but it provides a feasible workaround to limit the impact of data migration. In our prod, a `maxRatio` of 0.3 could give similar performance to running without shuffle migration, while a `maxRatio` of 0.5 or larger causes a significant performance regression.

@dongjoon-hyun Rethinking this, the config could be considered a `scaleDownFactor`. Scaling usually has both a scale-up and a scale-down factor. In Spark's context, `executorAllocationRatio` acts as the scale-up factor, but there is no scale-down counterpart. Say we have a shuffle map stage with 1000 partitions and a reduce stage with 100 partitions. When the reduce stage starts, we only need 10 percent of the executors, which means 90 percent of them will be decommissioned, and that has a huge performance impact due to the IO competition caused by shuffle migration. WDYT?
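To make the knobs and the arithmetic above concrete, here is a rough sketch. Assumptions: the `spark.decommission.enabled` and `spark.storage.decommission.*` keys below are existing configs, but the full key used for the proposed `maxRatio` is only a placeholder name for illustration (the PR defines the real one), and the 10%/90% figures simply restate the example above.

```scala
import org.apache.spark.SparkConf

// Existing decommission-related configs, plus a placeholder key for the proposed
// throttle. "spark.storage.decommission.maxRatio" is NOT a real Spark config name;
// it just stands in for the `maxRatio` knob discussed in this PR.
val conf = new SparkConf()
  .set("spark.decommission.enabled", "true")
  .set("spark.storage.decommission.enabled", "true")
  .set("spark.storage.decommission.shuffleBlocks.enabled", "true")
  .set("spark.storage.decommission.maxRatio", "0.3") // placeholder key for the proposed knob

// Back-of-the-envelope for the example in the comment: a 1000-partition map stage
// followed by a 100-partition reduce stage leaves roughly 10% of the executors busy,
// so about 90% become decommission candidates at the stage boundary.
val mapPartitions    = 1000
val reducePartitions = 100
val neededFraction   = reducePartitions.toDouble / mapPartitions // 0.1
val idleFraction     = 1.0 - neededFraction                      // 0.9
println(f"executors still needed: ${neededFraction * 100}%.0f%%, " +
        f"decommission candidates: ${idleFraction * 100}%.0f%%")
```

The intent of the cap, as I understand it, is only to bound how much of the cluster is migrating shuffle blocks at the same time, independent of how aggressively executors are released.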
