abellina opened a new pull request, #43627: URL: https://github.com/apache/spark/pull/43627
### What changes were proposed in this pull request?

As reported in https://issues.apache.org/jira/browse/SPARK-45762, `ShuffleManager` instances defined in a user jar cannot be used in all cases unless the jar is also specified in the `extraClassPath`. We would like to avoid requiring extra configuration when the implementation is already included in a jar passed via `--jars`.

Proposed changes:

Refactor the code so that the `ShuffleManager` is initialized later, after jars have been localized. This is especially necessary in the executor, where the initialization has to move until after the `replClassLoader` is updated with the jars passed in `--jars`.

Before this change, the `ShuffleManager` is instantiated at `SparkEnv` creation. Instantiating it this early doesn't work, because user jars have not been localized in all scenarios, so loading a `ShuffleManager` defined in `--jars` fails. We propose moving the `ShuffleManager` instantiation to `SparkContext` on the driver, and to `Executor`.

### Why are the changes needed?

This is not a new API but a change of startup order. The changes are needed to improve the user experience by reducing the extra configuration that is otherwise required depending on how a Spark application is launched.

### Does this PR introduce _any_ user-facing change?

Yes, but it's backwards compatible. Users no longer need to specify a `ShuffleManager` jar in `extraClassPath`, but they can still do so if they wish.

### How was this patch tested?

Added a unit test showing that a test `ShuffleManager` is available when it is passed via `--jars`, but not without (using local-cluster mode).

Tested manually with standalone mode, local-cluster mode, yarn client and cluster mode, and k8s.

### Was this patch authored or co-authored using generative AI tooling?

No
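The startup-order change described above can be illustrated with a small, self-contained Java sketch. It is not Spark's code: `DemoShuffleManager`, `UserShuffleManager`, `DemoEnv` and `DeferredInitDemo` are hypothetical names, and the "early" class loader is simulated with a loader that can only see JDK classes, standing in for the state before user jars are localized. The point it shows is that the environment object only records the class name at construction time and instantiates the manager later, once a class loader that can see the user class is available.

```java
// Hypothetical stand-in for the ShuffleManager interface (not Spark's API).
interface DemoShuffleManager {
    String name();
}

// Hypothetical user implementation, standing in for a class shipped via --jars.
class UserShuffleManager implements DemoShuffleManager {
    public UserShuffleManager() {}

    @Override
    public String name() {
        return "user-shuffle";
    }
}

// Hypothetical environment object: it records the class name at creation
// time and defers instantiation until a suitable class loader is supplied.
class DemoEnv {
    private final String managerClassName;
    private DemoShuffleManager manager; // stays null until initialized

    DemoEnv(String managerClassName) {
        this.managerClassName = managerClassName; // no class loading here
    }

    void initializeShuffleManager(ClassLoader loader) {
        if (manager != null) {
            throw new IllegalStateException("shuffle manager already initialized");
        }
        try {
            Class<?> cls = Class.forName(managerClassName, true, loader);
            manager = (DemoShuffleManager) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load " + managerClassName, e);
        }
    }

    DemoShuffleManager shuffleManager() {
        if (manager == null) {
            throw new IllegalStateException("shuffle manager not initialized yet");
        }
        return manager;
    }
}

class DeferredInitDemo {
    // Simulates the loader available before user jars are localized:
    // with a null parent it delegates only to the bootstrap loader.
    static ClassLoader earlyLoader() {
        return new ClassLoader((ClassLoader) null) {};
    }

    // Simulates the loader available after user jars have been added.
    static ClassLoader lateLoader() {
        return DeferredInitDemo.class.getClassLoader();
    }

    static boolean canLoad(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String className = UserShuffleManager.class.getName();
        System.out.println("visible early: " + canLoad(className, earlyLoader()));

        DemoEnv env = new DemoEnv(className);          // creation does not load the class
        env.initializeShuffleManager(lateLoader());    // deferred initialization
        System.out.println("manager: " + env.shuffleManager().name());
    }
}
```

In this sketch, the accessor fails loudly if it is called before initialization. That is one possible way to surface ordering mistakes; the source does not say how the actual patch handles that case.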
