[ https://issues.apache.org/jira/browse/SPARK-32067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
James Yu updated SPARK-32067:
-----------------------------
    Summary: [K8s] Pod template from subsequent submission inadvertently applies to ongoing submission  (was: [K8s] Pod template from subsequently submission inadvertently applies to ongoing submission)

> [K8s] Pod template from subsequent submission inadvertently applies to ongoing submission
> -----------------------------------------------------------------------------------------
>
>                 Key: SPARK-32067
>                 URL: https://issues.apache.org/jira/browse/SPARK-32067
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 2.4.6, 3.0.0
>            Reporter: James Yu
>            Priority: Minor
>
> THE BUG:
> The bug is reproducible by using spark-submit to submit two different apps
> (app1 and app2) with different executor pod templates (e.g., different
> labels) to K8s sequentially, with app2 launching while app1 is still ramping
> up its executor pods. The unwanted result is that some of app1's executor
> pods end up with app2's executor pod template applied to them.
> The root cause appears to be that app1's podspec-configmap is overwritten by
> app2 while the two launch periods overlap, because both apps use the same
> configmap name. As a result, app1 executor pods that are ramped up after
> app2 is launched are inadvertently created with app2's pod template. The
> issue can be seen as follows:
> First, after submitting app1, you get these configmaps:
> {code:java}
> NAMESPACE   NAME                                    DATA   AGE
> default     app1-1111111111111111-driver-conf-map   1      9m46s
> default     podspec-configmap                       1      12m
> {code}
> Then submit app2 while app1 is still ramping up its executors. The
> podspec-configmap is modified by app2.
> {code:java}
> NAMESPACE   NAME                                    DATA   AGE
> default     app1-1111111111111111-driver-conf-map   1      11m43s
> default     app2-2222222222222222-driver-conf-map   1      10s
> default     podspec-configmap                       1      13m57s
> {code}
>
> PROPOSED SOLUTION:
> Properly prefix the podspec-configmap for each submitted app.
> {code:java}
> NAMESPACE   NAME                                      DATA   AGE
> default     app1-1111111111111111-driver-conf-map     1      11m43s
> default     app2-2222222222222222-driver-conf-map     1      10s
> default     app1-1111111111111111-podspec-configmap   1      13m57s
> default     app2-2222222222222222-podspec-configmap   1      13m57s
> {code}
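
To make the reported overwrite and the proposed prefixing concrete, here is a minimal sketch in Python. It is not Spark's or Kubernetes' code: the dictionary standing in for a namespace's configmaps and the helper names `submit` and `launch_executor` are hypothetical, and the only details taken from the report are the shared name `podspec-configmap` and the per-app prefixes.

```python
# Hypothetical in-memory stand-in for the configmaps in one namespace
# (name -> data). Nothing here talks to Kubernetes or Spark.

SHARED_NAME = "podspec-configmap"  # the fixed name described in the report


def submit(store, app_prefix, executor_template, per_app_name):
    """Create or overwrite the executor pod template configmap for one app.

    Returns the configmap name that this app's executors will read from.
    """
    name = f"{app_prefix}-{SHARED_NAME}" if per_app_name else SHARED_NAME
    store[name] = executor_template  # an existing entry with this name is replaced
    return name


def launch_executor(store, configmap_name):
    """Return the template an executor pod would pick up at launch time."""
    return store[configmap_name]


# Shared name: app2's submission replaces the template app1 is still using.
shared = {}
app1_cm = submit(shared, "app1-1111111111111111", {"label": "app1"}, per_app_name=False)
early = launch_executor(shared, app1_cm)   # app1 executor started before app2
app2_cm = submit(shared, "app2-2222222222222222", {"label": "app2"}, per_app_name=False)
late = launch_executor(shared, app1_cm)    # app1 executor started after app2
print(early["label"], late["label"])       # prints: app1 app2

# Per-app prefix: each submission writes its own configmap, so nothing is replaced.
prefixed = {}
app1_cm_fixed = submit(prefixed, "app1-1111111111111111", {"label": "app1"}, per_app_name=True)
app2_cm_fixed = submit(prefixed, "app2-2222222222222222", {"label": "app2"}, per_app_name=True)
late_fixed = launch_executor(prefixed, app1_cm_fixed)
print(late_fixed["label"])                 # prints: app1
```

With the shared name, the late app1 executor reads app2's template, which matches the symptom in the report. With the prefix, the store holds two separate entries, mirroring the configmap listing in the proposed solution.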