This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-4.0 by this push:
     new 0ca197589458 [SPARK-50996][K8S] Increase `spark.kubernetes.allocation.batch.size` to 10
0ca197589458 is described below

commit 0ca197589458cc5413ab0baca78b3c12184aa9ae
Author: Dongjoon Hyun <dongj...@apache.org>
AuthorDate: Sun Jan 26 16:47:42 2025 -0800

    [SPARK-50996][K8S] Increase `spark.kubernetes.allocation.batch.size` to 10
    
    ### What changes were proposed in this pull request?
    
    This PR aims to increase `spark.kubernetes.allocation.batch.size` from 5 to 10 in Apache Spark 4.0.0.
    
    ### Why are the changes needed?
    
    Since Apache Spark 2.3.0, Apache Spark has conservatively used `5` as the default executor allocation batch size for 8 years.
    - https://github.com/apache/spark/pull/19468
    
    Given the improvement of K8s hardware infrastructure over the last 8 years, it is reasonable to use a bigger value, `10`, starting from Apache Spark 4.0.0 in 2025.
    
    Technically, when we request 1200 executor pods (see the sketch after this list):
    - Batch size `5` takes 4 minutes.
    - Batch size `10` takes 2 minutes.
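
    These timings follow from the number of allocation rounds, assuming the default `spark.kubernetes.allocation.batch.delay` of `1s` (a back-of-envelope sketch, not part of this commit):

    ```scala
    // Rough estimate: one allocation round per batch delay (default 1s).
    val pods = 1200
    val roundsAt5  = pods / 5    // 240 rounds x 1s = 240s = 4 minutes
    val roundsAt10 = pods / 10   // 120 rounds x 1s = 120s = 2 minutes
    ```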
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes, users will see faster Spark job resource allocation. The migration guide is updated accordingly.
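
    For users who prefer the previous behavior, the batch size can be set back explicitly. A minimal sketch, assuming a standard `SparkSession` setup (not part of this commit):

    ```scala
    import org.apache.spark.sql.SparkSession

    // Restore the pre-4.0 default executor allocation batch size.
    val spark = SparkSession.builder()
      .config("spark.kubernetes.allocation.batch.size", "5")
      .getOrCreate()
    ```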
    
    ### How was this patch tested?
    
    Pass the CIs.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #49681 from dongjoon-hyun/SPARK-50996.
    
    Authored-by: Dongjoon Hyun <dongj...@apache.org>
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
    (cherry picked from commit 9da1cd02951bb165d03cf0006023b55e5681fe9d)
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
---
 docs/core-migration-guide.md                                            | 2 ++
 docs/running-on-kubernetes.md                                           | 2 +-
 .../core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala        | 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/core-migration-guide.md b/docs/core-migration-guide.md
index 9dcf4ad8a298..7f92fa14f170 100644
--- a/docs/core-migration-guide.md
+++ b/docs/core-migration-guide.md
@@ -36,6 +36,8 @@ license: |
 
 - In Spark 4.0, support for Apache Mesos as a resource manager was removed.
 
+- Since Spark 4.0, Spark will allocate executor pods with a batch size of `10`. To restore the legacy behavior, you can set `spark.kubernetes.allocation.batch.size` to `5`.
+
 - Since Spark 4.0, Spark uses `ReadWriteOncePod` instead of `ReadWriteOnce` access mode in persistence volume claims. To restore the legacy behavior, you can set `spark.kubernetes.legacy.useReadWriteOnceAccessMode` to `true`.
 
 - Since Spark 4.0, Spark reports its executor pod status by checking all containers of that pod. To restore the legacy behavior, you can set `spark.kubernetes.executor.checkAllContainers` to `false`.
diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md
index c7f5d67a6cd8..bd5e1956a627 100644
--- a/docs/running-on-kubernetes.md
+++ b/docs/running-on-kubernetes.md
@@ -682,7 +682,7 @@ See the [configuration page](configuration.html) for information on Spark config
 </tr>
 <tr>
   <td><code>spark.kubernetes.allocation.batch.size</code></td>
-  <td><code>5</code></td>
+  <td><code>10</code></td>
   <td>
     Number of pods to launch at once in each round of executor pod allocation.
   </td>
diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
index db7fc85976c2..4467f73e7056 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
@@ -450,7 +450,7 @@ private[spark] object Config extends Logging {
       .version("2.3.0")
       .intConf
      .checkValue(value => value > 0, "Allocation batch size should be a positive integer")
-      .createWithDefault(5)
+      .createWithDefault(10)
 
   val KUBERNETES_ALLOCATION_BATCH_DELAY =
     ConfigBuilder("spark.kubernetes.allocation.batch.delay")

