liyinan926 edited a comment on issue #26687: [SPARK-30055][k8s] Allow configuration of restart policy for Kubernetes pods
URL: https://github.com/apache/spark/pull/26687#issuecomment-566844904

My biggest concern is around changing the default `restartPolicy` of executor pods to `Always`, which doesn't sound appropriate for batch run-to-completion workloads like Spark jobs. Changing it to `Always` introduces two competing controllers of executor pods: the driver, which manages the lifecycle of executor pods, and the kubelets, which manage their restarts. This is particularly problematic if a user chooses to keep the executor pods around by setting `spark.kubernetes.executor.deleteOnTermination=false`: in that case, the executor pods will be restarted indefinitely because they are not deleted upon completion. A default `restartPolicy` of `OnFailure` makes much more sense. That said, I do think executor restarts in this context are a concern of Spark core and the driver, not of the kubelets.

Also, regarding the startup delay: the only significant saving from in-place restarts is the latency introduced by the k8s scheduler. We no longer use an init-container, so there's no latency from waiting for init-containers to complete.
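To make the interaction concrete, here is a minimal sketch of an executor pod spec under the proposed default; the pod name, image, and command are hypothetical placeholders rather than what Spark generates verbatim. With `restartPolicy: Always`, the kubelet restarts the executor container even after a clean exit, so a completed pod that is kept around via `spark.kubernetes.executor.deleteOnTermination=false` keeps cycling:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: spark-exec-1                # hypothetical name; Spark generates its own
spec:
  # Always: the kubelet restarts the container even on exit code 0, so a
  # finished executor that is never deleted is restarted indefinitely.
  # OnFailure: the kubelet restarts only on a non-zero exit, which matches
  # run-to-completion semantics.
  restartPolicy: Always
  containers:
  - name: spark-kubernetes-executor
    image: spark:placeholder                         # placeholder image
    command: ["/opt/entrypoint.sh", "executor"]      # placeholder command
```

The same spec with `restartPolicy: OnFailure` leaves a successfully completed executor pod in the `Completed` state, with restart decisions left to the driver.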
