hiboyang opened a new pull request #32031: URL: https://github.com/apache/spark/pull/32031
### What changes were proposed in this pull request?

This PR adds a Remote Shuffle Service to support dynamic allocation on Kubernetes. The code is mostly copied from [Uber Remote Shuffle Service](https://github.com/uber/RemoteShuffleService), with some renaming. It also adds Kubernetes-related support, which does not exist in the original Uber Remote Shuffle Service.

### Why are the changes needed?

It is still difficult to use dynamic allocation with Spark on Kubernetes. Several companies have built their own disaggregated/remote shuffle solutions. The hope is to get a remote shuffle implementation into Spark that the Spark community can enhance in the future.

### Does this PR introduce _any_ user-facing change?

Yes. Users can set the Spark config `spark.shuffle.manager=org.apache.spark.shuffle.RssShuffleManager` to run Spark applications with the remote shuffle service. This makes Spark use the new `RssShuffleManager` to write shuffle data to, and read it from, the remote shuffle service.

### How was this patch tested?

Manually tested with a Spark application on Kubernetes.
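
As a rough illustration of the user-facing change, the sketch below renders the `--conf` flags a user might pass to `spark-submit`. The `spark.shuffle.manager` value is taken from this PR description. Pairing it with `spark.dynamicAllocation.enabled` (a standard Spark property) is an assumption based on the stated motivation, and the helper names `rss_conf` and `to_submit_args` are hypothetical. Any settings for locating the remote shuffle servers are left out because the description does not name them.

```python
# Sketch only: builds spark-submit flags for the shuffle manager named in this PR.
RSS_SHUFFLE_MANAGER = "org.apache.spark.shuffle.RssShuffleManager"  # from the PR description


def rss_conf(dynamic_allocation=True):
    """Return Spark properties that select the remote shuffle manager (hypothetical helper)."""
    conf = {"spark.shuffle.manager": RSS_SHUFFLE_MANAGER}
    if dynamic_allocation:
        # Standard Spark property; enabling it alongside RSS is an assumption here.
        conf["spark.dynamicAllocation.enabled"] = "true"
    return conf


def to_submit_args(conf):
    """Flatten a property dict into spark-submit arguments, sorted by key."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(rss_conf())))
```

The resulting list can be appended to an existing `spark-submit` command line; passing `dynamic_allocation=False` yields only the shuffle manager setting.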
