lwwmanning opened a new pull request #24083: [SPARK-24432] Support dynamic allocation without external shuffle service
URL: https://github.com/apache/spark/pull/24083

## What changes were proposed in this pull request?

This PR adds a limited version of dynamic allocation that does not require the external shuffle service, and thus works on Kubernetes (but does not support preemption).

The basic approach is to track which executors are holding shuffle files, and of those, which have shuffle files depended on by active stages. If an executor contains shuffle files depended on by an active stage, then we treat it as "active" (i.e., prevent the ExecutorAllocationManager from marking it as "idle"). If an executor contains only shuffle files that are not dependencies of active stages, then we treat those shuffle files similarly to cached data (i.e., configurable idle timeout that defaults to "infinity").

We also introduce the concept of "shuffle biased task scheduling", a heuristic attempt to schedule tasks for maximal efficacy of dynamic allocation. We do this by attempting to minimize the number of executors that contain (active) shuffle files, by packing as many tasks as possible onto "active" executors first, followed by scheduling them on executors with only inactive shuffle files, and finally all remaining executors.

This is a port of a series of PRs on Palantir's Spark fork:

- [Support dynamic allocation without external shuffle service](https://github.com/palantir/spark/pull/427)
- [Fix dynamic allocation with external shuffle service](https://github.com/palantir/spark/pull/445)
- [Track active shuffle by stage](https://github.com/palantir/spark/pull/446)
- [Shuffle biased task scheduling](https://github.com/palantir/spark/pull/447)

cc: @rynorris @mcheah @robert3005

## How was this patch tested?
We added tests as part of this PR and did additional manual testing on small YARN and k8s clusters (partially documented in [Track active shuffle by stage](https://github.com/palantir/spark/pull/446)). We then successfully rolled this out for a small subset of production workloads at Palantir, running entirely on Kubernetes.
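
For readers who want a concrete picture of the executor classification and the "shuffle biased" ordering described in the proposal, here is a minimal Python sketch. It is not the code in this PR (Spark itself is written in Scala); the names `ShuffleState`, `classify`, `may_mark_idle`, and `biased_order`, as well as the timeout parameters, are hypothetical and exist only for illustration. It assumes we already know which shuffle IDs each executor holds and which shuffle IDs active stages depend on.

```python
from enum import IntEnum
import math


class ShuffleState(IntEnum):
    """Hypothetical executor categories; lower values are preferred for tasks."""
    ACTIVE = 0    # holds shuffle files that an active stage depends on
    INACTIVE = 1  # holds only shuffle files that no active stage depends on
    NONE = 2      # holds no shuffle files at all


def classify(shuffles_on_executor, active_shuffles):
    """Classify one executor from the set of shuffle IDs it holds."""
    if shuffles_on_executor & active_shuffles:
        return ShuffleState.ACTIVE
    if shuffles_on_executor:
        return ShuffleState.INACTIVE
    return ShuffleState.NONE


def may_mark_idle(state, idle_seconds, executor_idle_timeout,
                  shuffle_idle_timeout=math.inf):
    """Decide whether an executor may be treated as idle.

    ACTIVE executors are never idle. INACTIVE executors get their own
    timeout, assumed here to default to infinity like cached data.
    """
    if state is ShuffleState.ACTIVE:
        return False
    if state is ShuffleState.INACTIVE:
        return idle_seconds >= shuffle_idle_timeout
    return idle_seconds >= executor_idle_timeout


def biased_order(executor_shuffles, active_shuffles):
    """Order executor IDs: active first, then inactive-shuffle, then the rest."""
    return sorted(
        executor_shuffles,
        key=lambda e: (classify(executor_shuffles[e], active_shuffles), e),
    )


# Example: exec-3 holds shuffle 9, which an active stage still needs.
executors = {"exec-1": set(), "exec-2": {7}, "exec-3": {3, 9}}
print(biased_order(executors, active_shuffles={9}))
```

Packing tasks onto executors that already hold active shuffle files keeps the set of "pinned" executors small, so more of the remaining executors can time out and be released.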
