This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new ebb1975ae4c1 [SPARK-49447][K8S] Fix `spark.kubernetes.allocation.batch.delay` to prevent small values less than 100
ebb1975ae4c1 is described below
commit ebb1975ae4c16da16c61b4a58159d7e7da65f1d2
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Wed Aug 28 19:46:52 2024 -0700
[SPARK-49447][K8S] Fix `spark.kubernetes.allocation.batch.delay` to prevent small values less than 100
### What changes were proposed in this pull request?
This PR aims to fix `spark.kubernetes.allocation.batch.delay` to prevent
small values less than 100 from Apache Spark 4.0.0.
### Why are the changes needed?
The default value is `1s` (= 1000 ms). A small value like `1` usually happens
due to the missing unit, `s`, when users make a mistake. We had better prevent
this. In addition, a misconfigured value like `1` can accidentally cause
high-frequency traffic from Spark drivers to the K8s control plane.
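As an illustration (not part of this commit), the config is declared with `timeConf(TimeUnit.MILLISECONDS)`, so a unit-less number is parsed as milliseconds and the following hypothetical `spark-defaults.conf` lines differ by a factor of 1000:

```
# Intended: one-second delay between executor allocation rounds
spark.kubernetes.allocation.batch.delay  1s
# Missing unit: parsed as 1 millisecond; rejected starting from Spark 4.0.0
spark.kubernetes.allocation.batch.delay  1
```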
### Does this PR introduce _any_ user-facing change?
For misconfigured values, Spark will complain and fail, starting from Apache
Spark 4.0.0.
### How was this patch tested?
Pass the CIs with the newly added test case.
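The test exercises the new `checkValue` guard. A minimal self-contained sketch (plain Scala, not Spark's actual `ConfigBuilder`; the object and method names here are illustrative) of how a bare number ends up below the threshold:

```scala
// Sketch of the validation pattern: parse a time string to milliseconds,
// treating a bare number as milliseconds (mirroring TimeUnit.MILLISECONDS),
// then reject anything not strictly greater than 100 ms.
object BatchDelayCheck {
  def parseMs(s: String): Long = s.trim match {
    case v if v.endsWith("ms") => v.dropRight(2).trim.toLong
    case v if v.endsWith("s")  => v.dropRight(1).trim.toLong * 1000L
    case v                     => v.toLong // no unit: interpreted as ms
  }

  def validated(s: String): Long = {
    val ms = parseMs(s)
    // require throws IllegalArgumentException, like checkValue's failure path
    require(ms > 100, "Allocation batch delay must be greater than 0.1s.")
    ms
  }

  def main(args: Array[String]): Unit = {
    println(validated("1s")) // default: 1000 ms, accepted
    // validated("1")        // bare "1" = 1 ms, throws IllegalArgumentException
  }
}
```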
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #47913 from dongjoon-hyun/SPARK-49447.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
.../core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala | 2 +-
.../spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala | 8 ++++++++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
index 2f9ee6943fe6..393ffc567401 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
@@ -457,7 +457,7 @@ private[spark] object Config extends Logging {
.doc("Time to wait between each round of executor allocation.")
.version("2.3.0")
.timeConf(TimeUnit.MILLISECONDS)
-      .checkValue(value => value > 0, "Allocation batch delay must be a positive time value.")
+      .checkValue(value => value > 100, "Allocation batch delay must be greater than 0.1s.")
.createWithDefaultString("1s")
val KUBERNETES_ALLOCATION_DRIVER_READINESS_TIMEOUT =
diff --git a/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala b/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala
index 093f5ef3bcb7..1ad5e0af0bd7 100644
--- a/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala
+++ b/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala
@@ -150,6 +150,14 @@ class ExecutorPodsAllocatorSuite extends SparkFunSuite with BeforeAndAfter {
when(persistentVolumeClaimList.getItems).thenReturn(Seq.empty[PersistentVolumeClaim].asJava)
}
+  test("SPARK-49447: Prevent small values less than 100 for batch delay") {
+    val m = intercept[IllegalArgumentException] {
+      val conf = new SparkConf().set(KUBERNETES_ALLOCATION_BATCH_DELAY.key, "1")
+      conf.get(KUBERNETES_ALLOCATION_BATCH_DELAY)
+    }.getMessage
+    assert(m.contains("Allocation batch delay must be greater than 0.1s."))
+  }
+
test("SPARK-41210: Window based executor failure tracking mechanism") {
var _exitCode = -1
val _conf = conf.clone
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]