StatefulSets are truly immune to this problem. Kubernetes creates/recreates the pods with stable pod hostnames that include the ordinal number (e.g., controller-0, controller-1, etc.). If controller-0 exits for any reason, a new controller-0 is created to replace it. This ordinal-based name is also available to the pod spec in the Helm chart/YAML as `metadata.name`.
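As a sketch, a controller StatefulSet can surface that ordinal-based name to the container via the downward API (the image name and `POD_NAME` env var here are illustrative, not the actual chart values):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: controller
spec:
  serviceName: controller         # headless service gives each pod a stable DNS name
  replicas: 2
  selector:
    matchLabels:
      app: controller
  template:
    metadata:
      labels:
        app: controller
    spec:
      containers:
      - name: controller
        image: openwhisk/controller   # illustrative image
        env:
        # The pod name is controller-<ordinal>; the downward API injects it,
        # so the process can derive its controller index from it.
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
```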
DaemonSets guarantee that, as long as the worker node itself is functional, Kubernetes will create/recreate a pod on that node and will set `spec.nodeName` to the IP address of the worker node. So when using DockerContainerFactory for the invokers, there is a slight gap: if a worker node becomes permanently disabled, any work already assigned to the invoker on that node will be stranded in its Kafka topic. In practice, I don't think this gap is really worth putting much effort into fixing. The node can be rebooted/reimaged in a matter of minutes, which is almost always fast enough to allow an invoker to be started on it again, which picks up the stranded work.

Furthermore, if the deployment uses the KubernetesContainerFactory for the invokers (as OpenShift does), there is no gap. In this configuration we use a StatefulSet, and as discussed for the controllers above, when members of the StatefulSet are lost, they are recreated with the missing ordinal number. So we can expect to always regain the expected number of invokers, with dense uniqueNames from 0 to N-1.

[ Full content available at: https://github.com/apache/incubator-openwhisk/issues/3858 ]
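For the DaemonSet variant, a minimal sketch of how the node identity can be surfaced to the invoker via the downward API's `spec.nodeName` field (the image and `INVOKER_NODE_NAME` env var are illustrative assumptions, not the actual deployment values):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: invoker
spec:
  selector:
    matchLabels:
      app: invoker
  template:
    metadata:
      labels:
        app: invoker
    spec:
      containers:
      - name: invoker
        image: openwhisk/invoker    # illustrative image
        env:
        # The node this pod was scheduled onto; with DockerContainerFactory
        # the invoker's identity (and its Kafka topic) is tied to this node,
        # which is why a permanently dead node strands its assigned work.
        - name: INVOKER_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
```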
