[
https://issues.apache.org/jira/browse/SPARK-41060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Serhii Nesterov updated SPARK-41060:
------------------------------------
Attachment: Screenshot 2022-11-09 015432.png
> Spark Submitter generates a ConfigMap with the same name
> --------------------------------------------------------
>
> Key: SPARK-41060
> URL: https://issues.apache.org/jira/browse/SPARK-41060
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 3.1.2, 3.3.0, 3.3.1
> Reporter: Serhii Nesterov
> Priority: Major
> Attachments: Screenshot 2022-11-09 015432.png
>
>
> *Description of the issue:*
> There's a problem with submitting Spark jobs to a K8s cluster: the library
> generates and reuses the same name for config maps (for drivers and
> executors). Ideally, 2 config maps should be created for each job: one for
> the driver and one for the executors. However, the library creates only one
> driver config map for all jobs (in some cases it likewise generates only one
> executor config map for all jobs). So, if I run 5 jobs, only one driver
> config map will be generated and used for every job. During those runs we
> experience issues when deleting pods from the cluster: executor pods are
> endlessly created and immediately terminated, overloading cluster resources.
>
> *The cause of the issue:*
> This problem occurs because the *KubernetesClientUtils* class defines
> *configMapNameExecutor* and *configMapNameDriver* as constants. This seems
> incorrect and should be fixed urgently. I've prepared some changes for
> review to fix the issue (tested in our project's cluster).
>
> *Steps to reproduce the issue:*
>
> # Create a *KubernetesClientApplication* object.
> # Submit at least 2 jobs (sequentially, or in parallel using *Thread*).
>
> *Observed results after following these steps:*
> # Spark 3.1.2 - The same config map in K8s is overwritten, which means
> all the jobs point to the same config map.
> # Spark 3.3.* - A new config map is created for the first job. For the
> other jobs an exception is thrown (the Fabric8 Kubernetes client library
> does not allow creating a new config map with an existing name).
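
The collision described above can be illustrated with a small, self-contained sketch. This is not Spark's code and does not use the Fabric8 API: the class ConfigMapNameSketch, the in-memory FakeCluster, the helper freshDriverName, and the name format are all hypothetical and exist only for illustration. The sketch assumes that a name computed once per JVM behaves like the constants in the report, and that "create or replace" and "create" roughly model the two behaviors observed on 3.1.2 and 3.3.*.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical illustration only; none of these names come from Spark or Fabric8.
class ConfigMapNameSketch {

    // Evaluated once per JVM, so every submission from this process
    // sees the same value (stand-in for a name held in a constant).
    static final String CONSTANT_DRIVER_NAME = "spark-drv-" + shortId() + "-conf-map";

    static String shortId() {
        return UUID.randomUUID().toString().replace("-", "").substring(0, 16);
    }

    // Possible alternative: compute a fresh name for each submission.
    static String freshDriverName() {
        return "spark-drv-" + shortId() + "-conf-map";
    }

    // Toy in-memory stand-in for the cluster's config map store.
    static final class FakeCluster {
        final Map<String, String> configMaps = new HashMap<>();

        // Silently overwrites an existing entry with the same name.
        void createOrReplace(String name, String data) {
            configMaps.put(name, data);
        }

        // Rejects a second entry with the same name.
        void create(String name, String data) {
            if (configMaps.putIfAbsent(name, data) != null) {
                throw new IllegalStateException("ConfigMap already exists: " + name);
            }
        }
    }

    public static void main(String[] args) {
        // Constant name + overwrite: two jobs end up sharing one config map.
        FakeCluster overwriting = new FakeCluster();
        overwriting.createOrReplace(CONSTANT_DRIVER_NAME, "job-1 conf");
        overwriting.createOrReplace(CONSTANT_DRIVER_NAME, "job-2 conf");
        System.out.println("constant name, overwrite: "
                + overwriting.configMaps.size() + " config map(s)");

        // Constant name + strict create: the second job fails.
        FakeCluster strict = new FakeCluster();
        strict.create(CONSTANT_DRIVER_NAME, "job-1 conf");
        try {
            strict.create(CONSTANT_DRIVER_NAME, "job-2 conf");
        } catch (IllegalStateException e) {
            System.out.println("constant name, create: " + e.getMessage());
        }

        // Fresh name per submission: each job gets its own config map.
        FakeCluster fixed = new FakeCluster();
        fixed.create(freshDriverName(), "job-1 conf");
        fixed.create(freshDriverName(), "job-2 conf");
        System.out.println("fresh names, create: "
                + fixed.configMaps.size() + " config map(s)");
    }
}
```

With the constant name, the overwriting store ends with a single entry for both jobs and the strict store rejects the second job; with a fresh name per submission, each job keeps its own entry. Whether the actual fix generates names this way is not stated in the report.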