liangyouze commented on PR #45911:
URL: https://github.com/apache/spark/pull/45911#issuecomment-2071259936

   > Thank you for making a PR, but I'm not sure if this is a right layer to
   > do. For me, it sounds like you are hitting your K8s cluster issue or K8s
   > client library issue. Could you elaborate your environment and the error
   > message more, @liangyouze ?
   >
   > > When creating Kubernetes resources, we occasionally encounter situations
   > > where resources such as ConfigMap cannot be successfully created,
   > > resulting in the driver pod remaining in the 'ContainerCreating' state.
   > > Therefore, it is necessary to add a verification mechanism after creating
   > > other resources to ensure that the resources are actually created
   
   This is the same problem described in SPARK-44050, and I have hit it too.
   When creating resources such as ConfigMaps, the following occasionally
   happens: the code throws no exception, but the ConfigMap is never actually
   created, so the driver pod stays in the ContainerCreating state and cannot
   proceed to the next step. This may be a Kubernetes bug, or it may be
   intended behavior (as far as I know, Kubernetes has rate-limiting policies
   that can cause some requests to be dropped, though I'm not sure that is
   related). Either way, Spark should not get stuck because of it.
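
As a rough illustration of the create-then-verify idea discussed above, here is a minimal sketch in Python. It is not the code in this PR and does not use any real Kubernetes client API: `create_and_verify`, `ResourceNotCreatedError`, and `FlakyStore` are hypothetical names, and `FlakyStore` is only an in-memory stand-in for an API server that silently drops some create requests.

```python
import time
from typing import Callable


class ResourceNotCreatedError(RuntimeError):
    """Raised when a resource is still missing after all attempts."""


def create_and_verify(
    create: Callable[[], None],
    exists: Callable[[], bool],
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> int:
    """Call create(), then confirm the result with exists().

    If the resource is missing, wait and retry the create. Returns the
    number of create calls made, or raises once attempts are exhausted
    so the caller fails fast instead of waiting forever.
    """
    for attempt in range(1, attempts + 1):
        create()
        if exists():
            return attempt
        time.sleep(delay_seconds)
    raise ResourceNotCreatedError(
        f"resource missing after {attempts} attempts"
    )


class FlakyStore:
    """Hypothetical stand-in for a server that drops the first N creates."""

    def __init__(self, drops: int) -> None:
        self.drops = drops
        self.items = set()

    def create(self, name: str) -> None:
        if self.drops > 0:
            # The request "succeeds" (no exception) but nothing is stored.
            self.drops -= 1
            return
        self.items.add(name)

    def exists(self, name: str) -> bool:
        return name in self.items
```

For example, with `store = FlakyStore(drops=1)`, calling `create_and_verify(lambda: store.create("cm"), lambda: store.exists("cm"))` needs two create calls before the check passes. If every create is dropped, the call raises `ResourceNotCreatedError` rather than leaving the caller waiting on a resource that will never appear.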
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

