cxzl25 opened a new pull request #30036: URL: https://github.com/apache/spark/pull/30036
### What changes were proposed in this pull request?
Avoid distributing the user jar from the driver in YARN client mode. The user's jar is added to `--jars`, so it is automatically uploaded to `spark.yarn.stagingDir` (usually on HDFS). Each executor then downloads the user jar from the staging directory when it is initialized.

### Why are the changes needed?
When an application requests a large number of executors and the user jar is large, every executor pulls the jar from the driver. This puts heavy network traffic on the driver and may cause timeouts. In addition, the driver and the executors of the YARN cluster may not be in the same data center.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing UTs.
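To make the motivation concrete, here is a rough back-of-the-envelope sketch in Python of how much data the driver has to send in each case. It is not part of the patch. The function name `driver_egress_bytes`, the executor count, and the jar size are all hypothetical, and the model assumes a single upload to the staging directory while ignoring retries, HDFS replication, and any other files the driver serves.

```python
def driver_egress_bytes(num_executors: int, jar_size_bytes: int,
                        via_staging_dir: bool) -> int:
    """Estimate bytes the driver sends to make the user jar available.

    Simplified model (an assumption, not measured behavior):
    - via_staging_dir=False: every executor fetches its own copy from the driver.
    - via_staging_dir=True: the jar is uploaded once to the staging directory
      and executors read it from there instead of from the driver.
    """
    if via_staging_dir:
        return jar_size_bytes
    return num_executors * jar_size_bytes


MIB = 1024 * 1024
executors = 2000   # hypothetical executor count
jar = 200 * MIB    # hypothetical user jar size

before = driver_egress_bytes(executors, jar, via_staging_dir=False)
after = driver_egress_bytes(executors, jar, via_staging_dir=True)

print(f"jar served by driver:      {before / MIB:.0f} MiB sent by the driver")
print(f"jar read from staging dir: {after / MIB:.0f} MiB sent by the driver")
```

Under these assumptions the driver's outbound traffic grows linearly with the number of executors in the first case and stays constant in the second, which is the effect the change is aiming for.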
