On Thu, Jul 6, 2017 at 6:34 AM, Nicola Ferraro <[email protected]> wrote:
> Hi,
> I've read some discussions on fencing and pod guarantees. Most of them are
> related to stateful sets, e.g.
> https://github.com/kubernetes/community/blob/master/contributors/design-proposals/pod-safety.md
> and related threads.
> Anyway, I couldn't find an answer to the following questions...
>
> Suppose I create a DeploymentConfig (so, no statefulsets) with replicas=1.
> After a pod is scheduled on some node, that node is disconnected from the
> cluster (I block all communications with the master).
> After some time, the DC/RC tries to delete that pod and reschedule a new
> pod on another node.

The RC doesn't delete the pod, but the node controller will (after X
minutes). A new pod is created - the RC does *not* block waiting for old
pods to be deleted before creating new ones. If the pod references a PV
that supports locking innately (GCE, AWS, Azure, Ceph, Gluster), then the
second pod will *not* start up, because the volume can't be attached to
the new node. But this behavior depends on the storage service itself,
not on Kube.

> For what I've understood, if now I reconnect the failing node, the Kubelet
> will read the cluster status and effectively delete the old pod, but,
> before that moment, both pods were running in their respective nodes and
> the old pod was allowed to access external resources (e.g. if the network
> still allowed communication with them).

Yes.

> Is this scenario possible?
> Is there a mechanism by which a disconnected node can tear down its pods
> automatically after a certain timeout?

Run a daemonset that shuts down the instance if it loses contact with the
master API / health check for more than X seconds (a rough sketch is at
the end of this mail). Even this is best effort. You can also run a
daemonset that uses sanlock or another tool based on a shared RWX
(read-write-many) volume, and then self-terminate if you lose the lock.
Keep in mind these solutions aren't perfect, and it's always possible
that a bug in sanlock or another node error prevents that daemon process
from running to completion.

> Is fencing implemented/going-to-be-implemented for normal pods, even if
> they don't belong to stateful sets?

It's possible that we will add attach/detach controller support to control
attachment of volumes that are RWO but don't have innate locking. It's
also possible that someone will implement a fencer. It should be easy to
implement a fencer today.

> Thanks,
> Nicola
>
> _______________________________________________
> users mailing list
> [email protected]
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
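
To make the self-fencing daemonset idea above concrete, here is a rough
sketch of what such an agent could look like. Nothing in it is a supported
Kube/OpenShift feature: the names, the centos:7 image, the 10-second poll,
the 30-failure threshold and the sysrq reboot are all illustrative choices
you'd want to adapt. It assumes the node allows privileged containers and
that the default serviceaccount token is mounted so the health check can
authenticate against the apiserver.

---
# Illustrative self-fencing daemonset (not a supported feature).
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: self-fencer
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        app: self-fencer
    spec:
      containers:
      - name: fencer
        image: centos:7              # any image with bash and curl will do
        securityContext:
          privileged: true           # needed to trigger the host reboot below
        command:
        - /bin/bash
        - -c
        - |
          # Poll the apiserver health endpoint; after 30 consecutive
          # failures (~5 minutes at one attempt every 10s), reboot the
          # node so its pods are guaranteed to stop.
          TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
          failures=0
          while true; do
            if curl -ks --max-time 5 \
                 -H "Authorization: Bearer ${TOKEN}" \
                 "https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/healthz" >/dev/null; then
              failures=0
            else
              failures=$((failures + 1))
            fi
            if [ "$failures" -ge 30 ]; then
              echo "lost contact with the master, fencing this node"
              sync
              echo b > /proc/sysrq-trigger   # immediate reboot of the host kernel
            fi
            sleep 10
          done

The sysrq reboot is deliberately blunt: a graceful shutdown can hang on the
same condition that caused the partition, and the whole point of fencing is
to guarantee the pods stop. As said above, even this is best effort.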
_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
