GitHub user skonto opened a pull request:
https://github.com/apache/spark/pull/22405
[SPARK-25295][K8S]Fix executor names collision
## What changes were proposed in this pull request?
Fixes the collision issue with spak executor names in client mode.
Also fixes the issue with spark app name having spaces in cluster mode.
If you run the Spark Pi test in client mode it passes.
The tricky part is the user may set the app name:
https://github.com/apache/spark/blob/3030b82c89d3e45a2e361c469fbc667a1e43b854/examples/src/main/scala/org/apache/spark/examples/SparkPi.scala#L30
If i do:
```
./bin/spark-submit --master k8s://http://127.0.0.1:8001 --deploy-mode
cluster --name "spark pi
...
```
it will fail as the app name is used for the prefix of driver's pod name
and it cannot have spaces (According to k8s conventions).
## How was this patch tested?
Manually by a running spark job in client mode.
To reproduce do:
```
kubectl create -f service.yaml
kubectl create -f pod.yaml
```
service.yaml :
```
kind: Service
apiVersion: v1
metadata:
name: spark-test-app-1-svc
spec:
clusterIP: None
selector:
spark-app-selector: spark-test-app-1
ports:
- protocol: TCP
name: driver-port
port: 7077
targetPort: 7077
- protocol: TCP
name: block-manager
port: 10000
targetPort: 10000
```
pod.yaml:
```
apiVersion: v1
kind: Pod
metadata:
name: spark-test-app-1
labels:
spark-app-selector: spark-test-app-1
spec:
containers:
- name: spark-test
image: skonto/spark:k8s-client-fix
imagePullPolicy: Always
command:
- 'sh'
- '-c'
- "/opt/spark/bin/spark-submit
--verbose
--master k8s://https://kubernetes.default.svc
--deploy-mode client
--class org.apache.spark.examples.SparkPi
--conf spark.app.name=spark
--conf spark.executor.instances=1
--conf
spark.kubernetes.container.image=skonto/spark:k8s-client-fix
--conf spark.kubernetes.container.image.pullPolicy=Always
--conf
spark.kubernetes.authenticate.oauthTokenFile=/var/run/secrets/kubernetes.io/serviceaccount/token
--conf
spark.kubernetes.authenticate.caCertFile=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
--conf spark.executor.memory=500m
--conf spark.executor.cores=1
--conf spark.executor.instances=1
--conf spark.driver.host=spark-test-app-1-svc.default.svc
--conf spark.driver.port=7077
--conf spark.driver.blockManager.port=10000
local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0-SNAPSHOT.jar 1000000"
```
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/skonto/spark
fix-k8s-client-mode-executor-names
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22405.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22405
----
commit d6b4a9f4af8e7dd1552df4a9d6df51fe5b41f08b
Author: Stavros Kontopoulos <stavros.kontopoulos@...>
Date: 2018-09-12T19:47:57Z
fix executor names collision
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]