Repository: spark Updated Branches: refs/heads/master a6f37b074 -> 497f00f62
[SPARK-23200] Reset Kubernetes-specific config on Checkpoint restore Several configuration parameters related to Kubernetes need to be reset, as they are changed with each invokation of spark-submit and thus prevents recovery of Spark Streaming tasks. ## What changes were proposed in this pull request? When using the Kubernetes cluster-manager and spawning a Streaming workload, it is important to reset many spark.kubernetes.* properties that are generated by spark-submit but which would get rewritten when restoring a Checkpoint. This is so, because the spark-submit codepath creates Kubernetes resources, such as a ConfigMap, a Secret and other variables, which have an autogenerated name and the previous one will not resolve anymore. In short, this change enables checkpoint restoration for streaming workloads, and thus enables Spark Streaming workloads in Kubernetes, which were not possible to restore from a checkpoint before if the workload went down. ## How was this patch tested? This patch needs would benefit from testing in different k8s clusters. This is similar to the YARN related code for resetting a Spark Streaming workload, but for the Kubernetes scheduler. This PR removes the initcontainers properties that existed before because they are now removed in master. For a previous discussion, see the non-rebased work at: apache-spark-on-k8s#516 Closes #22392 from ssaavedra/fix-checkpointing-master. Authored-by: Santiago Saavedra <[email protected]> Signed-off-by: Yinan Li <[email protected]> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/497f00f6 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/497f00f6 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/497f00f6 Branch: refs/heads/master Commit: 497f00f62b3ddd1f40507fdfe10f30cd9effb6cf Parents: a6f37b0 Author: Santiago Saavedra <[email protected]> Authored: Tue Sep 18 22:08:50 2018 -0700 Committer: Yinan Li <[email protected]> Committed: Tue Sep 18 22:08:50 2018 -0700 ---------------------------------------------------------------------- .../src/main/scala/org/apache/spark/streaming/Checkpoint.scala | 4 ++++ 1 file changed, 4 insertions(+) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/spark/blob/497f00f6/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala ---------------------------------------------------------------------- diff --git a/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala b/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala index 3703a87..a882558 100644 --- a/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala +++ b/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala @@ -54,6 +54,8 @@ class Checkpoint(ssc: StreamingContext, val checkpointTime: Time) "spark.driver.bindAddress", "spark.driver.port", "spark.master", + "spark.kubernetes.driver.pod.name", + "spark.kubernetes.executor.podNamePrefix", "spark.yarn.jars", "spark.yarn.keytab", "spark.yarn.principal", @@ -64,6 +66,8 @@ class Checkpoint(ssc: StreamingContext, val checkpointTime: Time) .remove("spark.driver.host") .remove("spark.driver.bindAddress") .remove("spark.driver.port") + .remove("spark.kubernetes.driver.pod.name") + .remove("spark.kubernetes.executor.podNamePrefix") val newReloadConf = new SparkConf(loadDefaults = true) propertiesToReload.foreach { prop => newReloadConf.getOption(prop).foreach { value => --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
