StatefulSets are truly immune to this problem.  Kubernetes creates/recreates 
the pods with stable pod hostnames that include the ordinal number (e.g., 
controller-0, controller-1, etc.).  If controller-0 exits for any reason, a new 
controller-0 is created to replace it.  This ordinal-based name is also 
available to the pod spec in the Helm chart/YAML as `metadata.name`. 
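
To make this concrete, here is a minimal sketch of a StatefulSet that surfaces 
the ordinal-based pod name to the container through the downward API.  This is 
not the actual OpenWhisk chart; the image name and env var name are 
illustrative assumptions.

```yaml
# Hypothetical sketch (not the real OpenWhisk chart): a StatefulSet whose pods
# get stable ordinal names (controller-0, controller-1, ...), exposed to the
# container via the downward API.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: controller
spec:
  serviceName: controller
  replicas: 2
  selector:
    matchLabels:
      app: controller
  template:
    metadata:
      labels:
        app: controller
    spec:
      containers:
      - name: controller
        image: openwhisk/controller   # image name assumed for illustration
        env:
        - name: POD_NAME              # resolves to the ordinal pod name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
```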

DaemonSets guarantee that, as long as the actual worker node is functional, 
Kubernetes will create/recreate a pod on that node, with `spec.nodeName` set 
to the name of the worker node (often its IP address).  So when using 
DockerContainerFactory for the invokers, there is a slight gap:  if a worker 
node becomes permanently disabled, then any work already assigned to the 
invoker on that node will be stranded in its Kafka topic.  In practice, I 
don't think this gap is really worth putting much effort into fixing.  The 
node can be rebooted/reimaged in a matter of minutes, which is almost always 
good enough to allow an invoker to be started on it again, which picks up the 
stranded work.  
Furthermore, if the deployment is using the KubernetesContainerFactory for the 
invokers (OpenShift does), then there is no gap.  In this configuration, we 
use a StatefulSet and, as discussed for the controllers above, when elements 
of the StatefulSet are lost, they are recreated with the missing ordinal 
number.  So we can expect to always regain the expected number of invokers, 
with dense uniqueNames from 0 to N-1. 
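
For the DaemonSet case, a sketch of how an invoker pod can learn which node it 
is pinned to, again via the downward API.  This is an illustrative fragment, 
not the project's actual YAML; the image and env var names are assumptions.

```yaml
# Hypothetical sketch (not the real deployment YAML): a DaemonSet invoker that
# reads the node it is bound to from spec.nodeName via the downward API.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: invoker
spec:
  selector:
    matchLabels:
      app: invoker
  template:
    metadata:
      labels:
        app: invoker
    spec:
      containers:
      - name: invoker
        image: openwhisk/invoker      # image name assumed for illustration
        env:
        - name: NODE_NAME             # the worker node this pod runs on
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
```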

[ Full content available at: 
https://github.com/apache/incubator-openwhisk/issues/3858 ]