Github user ssaavedra commented on the issue:
https://github.com/apache/spark/pull/20383
Sorry, I hadn't answered yet because it turns out my patch does not apply
cleanly on 2.3. Many names were rewritten as part of the merge, and some of the
logic for how the executor pods look up the ConfigMap has changed. I'll have to
take a closer look at it.
I have already changed all occurrences of `initcontainer` to `initContainer` and
so on, following the parameter names in Config.scala for Kubernetes, but to no
avail yet. @foxish, maybe you have a hint on where to look? It seems that the
newly created executor pods are still looking for the old ConfigMap.
That must be happening because some property stored in the Checkpoint is
restored by the driver. The driver itself gets the correct ConfigMap, since it
is created by spark-submit, but the executors don't: once the driver restores
the Checkpoint, the stale property value is used to build the ConfigMap name
for the executors (the executor pods themselves are named correctly, though).
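For context, this is roughly the mechanism in `Checkpoint.createSparkConf()`
that should be re-applying the current values; a simplified sketch assuming the
Spark 2.3 shape of the code, not a verbatim copy (the real method takes no
parameters and reads these values from fields of `Checkpoint`):
```scala
import org.apache.spark.SparkConf

object CheckpointReloadSketch {
  // Sketch of org.apache.spark.streaming.Checkpoint#createSparkConf.
  def createSparkConf(
      sparkConfPairs: Seq[(String, String)],   // conf captured in the checkpoint
      propertiesToReload: Seq[String]): SparkConf = {
    // Start from the configuration captured in the checkpoint...
    val newSparkConf = new SparkConf(loadDefaults = false)
      .setAll(sparkConfPairs)
    // ...then overwrite the listed keys with whatever the *current* JVM
    // (i.e. the restarted spark-submit) has for them, so fresh values such
    // as the new ConfigMap name win over the checkpointed ones.
    val reloadedConf = new SparkConf(loadDefaults = true)
    propertiesToReload.foreach { prop =>
      reloadedConf.getOption(prop).foreach { value =>
        newSparkConf.set(prop, value)
      }
    }
    newSparkConf
  }
}
```
So any key missing from `propertiesToReload` keeps its checkpointed value, which
is exactly the symptom above.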
Here is an example run in my test environment:
```
$ kubectl -n ssaavedraspark get pod \
    spark-pi-2-5081f5d7a88332da955417b6582f22f5-driver \
    spark-pi-2-5081f5d7a88332da955417b6582f22f5-exec-1 \
    -o json | jq '.items[] | {
      "configMap": (.spec.volumes[] | select(.configMap?).configMap.name),
      "appselector": .metadata.labels."spark-app-selector",
      "name": .metadata.name}'
```
```
{
  "configMap": "spark-pi-2-5081f5d7a88332da955417b6582f22f5-init-config",
  "appselector": "spark-8be5e27c750e4384964fbcb93d7f4b1c",
  "name": "spark-pi-2-5081f5d7a88332da955417b6582f22f5-driver"
}
{
  "configMap": "spark-pi-2-59025c48a8483e749e02894b70fd371f-init-config",
  "appselector": "spark-application-1517424700542",
  "name": "spark-pi-2-5081f5d7a88332da955417b6582f22f5-exec-1"
}
```
Note how the executor references a ConfigMap whose suffix does not match its own
pod name. Besides what's in the PR, I have already made these changes:
```
--- a/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
@@ -48,25 +48,27 @@ class Checkpoint(ssc: StreamingContext, val checkpointTime: Time)
   // Reload properties for the checkpoint application since user wants to set a reload property
   // or spark had changed its value and user wants to set it back.
   val propertiesToReload = List(
     "spark.yarn.app.id",
     "spark.yarn.app.attemptId",
     "spark.driver.host",
     "spark.driver.bindAddress",
     "spark.driver.port",
     "spark.kubernetes.driver.pod.name",
     "spark.kubernetes.executor.podNamePrefix",
-    "spark.kubernetes.initcontainer.configMapName",
-    "spark.kubernetes.initcontainer.configMapKey",
-    "spark.kubernetes.initcontainer.downloadJarsResourceIdentifier",
-    "spark.kubernetes.initcontainer.downloadJarsSecretLocation",
-    "spark.kubernetes.initcontainer.downloadFilesResourceIdentifier",
-    "spark.kubernetes.initcontainer.downloadFilesSecretLocation",
-    "spark.kubernetes.initcontainer.remoteJars",
-    "spark.kubernetes.initcontainer.remoteFiles",
-    "spark.kubernetes.mountdependencies.jarsDownloadDir",
-    "spark.kubernetes.mountdependencies.filesDownloadDir",
+    "spark.kubernetes.initContainer.configMapName",
+    "spark.kubernetes.initContainer.configMapKey",
+    // "spark.kubernetes.initContainer.remoteJars",
+    // "spark.kubernetes.initContainer.remoteFiles",
+    // "spark.kubernetes.mountDependencies.jarsDownloadDir",
+    // "spark.kubernetes.mountDependencies.filesDownloadDir",
+    // "spark.kubernetes.mountDependencies.timeout",
+    // "spark.kubernetes.mountDependencies.maxSimultaneousDownloads",
     "spark.master",
     "spark.yarn.jars",
     "spark.yarn.keytab",
```
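For reference, the new key names have to match the entries declared in the
Kubernetes Config.scala. A hedged sketch of how such entries are declared there,
assuming the 2.3 branch layout; the exact doc strings and builder chains may
differ from the real file:
```scala
package org.apache.spark.deploy.k8s

import org.apache.spark.internal.config.ConfigBuilder

// Sketch modeled on resource-managers/kubernetes/core/src/main/scala/
// org/apache/spark/deploy/k8s/Config.scala; illustrative only.
private[spark] object KubernetesInitContainerConfigSketch {
  val INIT_CONTAINER_CONFIG_MAP_NAME =
    ConfigBuilder("spark.kubernetes.initContainer.configMapName")
      .doc("Name of the ConfigMap that carries the init-container properties.")
      .internal()
      .stringConf
      .createOptional

  val INIT_CONTAINER_CONFIG_MAP_KEY =
    ConfigBuilder("spark.kubernetes.initContainer.configMapKey")
      .doc("Key within that ConfigMap holding the properties file.")
      .internal()
      .stringConf
      .createOptional
}
```
The reload list above has to use exactly these key strings, which is why the
`initcontainer` to `initContainer` rename matters.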