Repository: spark
Updated Branches:
  refs/heads/branch-2.4 76514a015 -> ba8560a96


[SPARK-23200] Reset Kubernetes-specific config on Checkpoint restore

Several configuration parameters related to Kubernetes need to be
reset, as they are changed with each invokation of spark-submit and
thus prevents recovery of Spark Streaming tasks.

## What changes were proposed in this pull request?

When using the Kubernetes cluster-manager and spawning a Streaming workload, it 
is important to reset many spark.kubernetes.* properties that are generated by 
spark-submit but which would get rewritten when restoring a Checkpoint. This is 
so, because the spark-submit codepath creates Kubernetes resources, such as a 
ConfigMap, a Secret and other variables, which have an autogenerated name and 
the previous one will not resolve anymore.

In short, this change enables checkpoint restoration for streaming workloads, 
and thus enables Spark Streaming workloads in Kubernetes, which were not 
possible to restore from a checkpoint before if the workload went down.

## How was this patch tested?

This patch needs would benefit from testing in different k8s clusters.

This is similar to the YARN related code for resetting a Spark Streaming 
workload, but for the Kubernetes scheduler. This PR removes the initcontainers 
properties that existed before because they are now removed in master.

For a previous discussion, see the non-rebased work at: apache-spark-on-k8s#516

Closes #22392 from ssaavedra/fix-checkpointing-master.

Authored-by: Santiago Saavedra <[email protected]>
Signed-off-by: Yinan Li <[email protected]>
(cherry picked from commit 497f00f62b3ddd1f40507fdfe10f30cd9effb6cf)
Signed-off-by: Yinan Li <[email protected]>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ba8560a9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ba8560a9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ba8560a9

Branch: refs/heads/branch-2.4
Commit: ba8560a96d39d6d5a3ccbb7e364e817603d8ca6c
Parents: 76514a0
Author: Santiago Saavedra <[email protected]>
Authored: Tue Sep 18 22:08:50 2018 -0700
Committer: Yinan Li <[email protected]>
Committed: Tue Sep 18 22:09:10 2018 -0700

----------------------------------------------------------------------
 .../src/main/scala/org/apache/spark/streaming/Checkpoint.scala   | 4 ++++
 1 file changed, 4 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/ba8560a9/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
----------------------------------------------------------------------
diff --git 
a/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala 
b/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
index 3703a87..a882558 100644
--- a/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
@@ -54,6 +54,8 @@ class Checkpoint(ssc: StreamingContext, val checkpointTime: 
Time)
       "spark.driver.bindAddress",
       "spark.driver.port",
       "spark.master",
+      "spark.kubernetes.driver.pod.name",
+      "spark.kubernetes.executor.podNamePrefix",
       "spark.yarn.jars",
       "spark.yarn.keytab",
       "spark.yarn.principal",
@@ -64,6 +66,8 @@ class Checkpoint(ssc: StreamingContext, val checkpointTime: 
Time)
       .remove("spark.driver.host")
       .remove("spark.driver.bindAddress")
       .remove("spark.driver.port")
+      .remove("spark.kubernetes.driver.pod.name")
+      .remove("spark.kubernetes.executor.podNamePrefix")
     val newReloadConf = new SparkConf(loadDefaults = true)
     propertiesToReload.foreach { prop =>
       newReloadConf.getOption(prop).foreach { value =>


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to