This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-4.0 by this push:
     new 0ca197589458 [SPARK-50996][K8S] Increase `spark.kubernetes.allocation.batch.size` to 10
0ca197589458 is described below

commit 0ca197589458cc5413ab0baca78b3c12184aa9ae
Author: Dongjoon Hyun <dongj...@apache.org>
AuthorDate: Sun Jan 26 16:47:42 2025 -0800

    [SPARK-50996][K8S] Increase `spark.kubernetes.allocation.batch.size` to 10

    ### What changes were proposed in this pull request?

    This PR aims to increase `spark.kubernetes.allocation.batch.size` from 5 to 10 in Apache Spark 4.0.0.

    ### Why are the changes needed?

    Since Apache Spark 2.3.0, Apache Spark has conservatively used `5` as the default executor allocation batch size for 8 years.
    - https://github.com/apache/spark/pull/19468

    Given the improvement of K8s hardware infrastructure over the last 8 years, it is better to use a larger value, `10`, starting with Apache Spark 4.0.0 in 2025.

    Technically, when we request 1200 executor pods (a back-of-the-envelope check is included after the patch below):
    - Batch size `5` takes 4 minutes.
    - Batch size `10` takes 2 minutes.

    ### Does this PR introduce _any_ user-facing change?

    Yes, users will see faster Spark job resource allocation. The migration guide is updated correspondingly.

    ### How was this patch tested?

    Pass the CIs.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #49681 from dongjoon-hyun/SPARK-50996.

    Authored-by: Dongjoon Hyun <dongj...@apache.org>
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
    (cherry picked from commit 9da1cd02951bb165d03cf0006023b55e5681fe9d)
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
---
 docs/core-migration-guide.md                                      | 2 ++
 docs/running-on-kubernetes.md                                     | 2 +-
 .../core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala  | 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/core-migration-guide.md b/docs/core-migration-guide.md
index 9dcf4ad8a298..7f92fa14f170 100644
--- a/docs/core-migration-guide.md
+++ b/docs/core-migration-guide.md
@@ -36,6 +36,8 @@ license: |
 
 - In Spark 4.0, support for Apache Mesos as a resource manager was removed.
 
+- Since Spark 4.0, Spark will allocate executor pods with a batch size of `10`. To restore the legacy behavior, you can set `spark.kubernetes.allocation.batch.size` to `5`.
+
 - Since Spark 4.0, Spark uses `ReadWriteOncePod` instead of `ReadWriteOnce` access mode in persistence volume claims. To restore the legacy behavior, you can set `spark.kubernetes.legacy.useReadWriteOnceAccessMode` to `true`.
 
 - Since Spark 4.0, Spark reports its executor pod status by checking all containers of that pod. To restore the legacy behavior, you can set `spark.kubernetes.executor.checkAllContainers` to `false`.

diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md
index c7f5d67a6cd8..bd5e1956a627 100644
--- a/docs/running-on-kubernetes.md
+++ b/docs/running-on-kubernetes.md
@@ -682,7 +682,7 @@ See the [configuration page](configuration.html) for information on Spark config
 </tr>
 <tr>
   <td><code>spark.kubernetes.allocation.batch.size</code></td>
-  <td><code>5</code></td>
+  <td><code>10</code></td>
   <td>
     Number of pods to launch at once in each round of executor pod allocation.
   </td>

diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
index db7fc85976c2..4467f73e7056 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
@@ -450,7 +450,7 @@ private[spark] object Config extends Logging {
       .version("2.3.0")
       .intConf
       .checkValue(value => value > 0, "Allocation batch size should be a positive integer")
-      .createWithDefault(5)
+      .createWithDefault(10)
 
   val KUBERNETES_ALLOCATION_BATCH_DELAY =
     ConfigBuilder("spark.kubernetes.allocation.batch.delay")
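The timing claim in the commit message can be sanity-checked with a short sketch. This code is not part of the commit; it assumes the default `spark.kubernetes.allocation.batch.delay` of `1s` (one allocation round per second) and ignores pod scheduling and image-pull latency.

```scala
// Back-of-the-envelope check of the batch-size timing claim, assuming one
// allocation round per second (the default spark.kubernetes.allocation.batch.delay).
object AllocationTimeCheck {
  def allocationMinutes(pods: Int, batchSize: Int, batchDelaySec: Double = 1.0): Double =
    math.ceil(pods.toDouble / batchSize) * batchDelaySec / 60.0

  def main(args: Array[String]): Unit = {
    println(f"batch size 5:  ${allocationMinutes(1200, 5)}%.1f minutes")   // 4.0 minutes
    println(f"batch size 10: ${allocationMinutes(1200, 10)}%.1f minutes")  // 2.0 minutes
  }
}
```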
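For anyone who wants the pre-4.0 behavior described in the migration-guide hunk, the value can be passed on `spark-submit` with `--conf spark.kubernetes.allocation.batch.size=5`, or set programmatically as in the sketch below; the master URL and application name there are placeholders, not values from this commit.

```scala
import org.apache.spark.SparkConf

// Sketch: restore the legacy executor allocation batch size of 5.
// The K8s master URL and app name below are hypothetical placeholders.
val legacyConf = new SparkConf()
  .setMaster("k8s://https://kubernetes.example.com:6443")
  .setAppName("legacy-allocation-batch-size")
  .set("spark.kubernetes.allocation.batch.size", "5")
```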