Hi Dave,

On Sat, 18 Aug 2018 at 17:01, David P Grove <gro...@us.ibm.com> wrote:
>
> [ Discussion about cluster singleton or not for the ContainerManager]
>
> fwiw, I believe for Kubernetes we do not need to attempt to deal with fault
> tolerance for the ContainerManager state ourselves. We can use labels to
> replicate all the persistent metadata for a container (prewarm or not, the
> ContainerRouter it is assigned to) in the Kube objects representing the
> pods in Kube's etcd metadata server. If we need to restart a
> ContainerManager, the new instance can come up "instantly" and start
> servicing requests while recovering the state of the previous instance via
> queries against etcd to discover the pre-existing containers it owned.
>

Note that the ContainerManager also holds state about ContainerRouters (how
many there are and which one owns which containers). We could make that
queryable as well, so that as soon as a fallback happens, the fallback
component queries the state of all routers to get into a consistent state.

I agree we should replicate as little state as possible, and in the
Kubernetes case we already have state about containers through pods and
their labels.

>
> We'll need to validate that the performance of this is acceptable (should be,
> since it is just some asynchronous labeling operations when (a) the
> container is created and (b) on the initial transition from stemcell to
> warm), but it is going to be pretty simple to implement and makes good
> use of the underlying platform's capabilities.
>

Agreed, good point.

>
> --dave
>
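
To make the label idea concrete, here is a minimal sketch of how I read it, not a proposal for the actual implementation. It assumes an in-memory stand-in for the pod metadata that Kubernetes keeps in etcd (no real Kubernetes client calls are used), and the label keys, class names, and method names are all hypothetical. The manager writes labels at the two points Dave mentions (container creation and the stemcell-to-warm transition), and a restarted manager rebuilds its view purely from a label query.

```python
from dataclasses import dataclass, field

# Hypothetical label keys; the real names would be decided in the implementation.
LABEL_MANAGER = "openwhisk/container-manager"
LABEL_ROUTER = "openwhisk/container-router"
LABEL_STATE = "openwhisk/state"  # "prewarm" or "warm"


@dataclass
class Pod:
    name: str
    labels: dict = field(default_factory=dict)


class FakeClusterStore:
    """In-memory stand-in for pod objects and their labels (assumption:
    the real system would go through the Kubernetes API instead)."""

    def __init__(self):
        self._pods = {}

    def create_pod(self, name, labels):
        self._pods[name] = Pod(name, dict(labels))

    def patch_labels(self, name, labels):
        self._pods[name].labels.update(labels)

    def list_pods(self, selector):
        # Equality-based selection: every key/value in the selector must match.
        return [
            pod for pod in self._pods.values()
            if all(pod.labels.get(k) == v for k, v in selector.items())
        ]


class ContainerManager:
    def __init__(self, manager_id, store):
        self.manager_id = manager_id
        self.store = store
        self.prewarm = set()   # names of unassigned stemcell containers
        self.assigned = {}     # container name -> ContainerRouter id

    def create_prewarm(self, name):
        # (a) label on creation
        self.store.create_pod(
            name, {LABEL_MANAGER: self.manager_id, LABEL_STATE: "prewarm"}
        )
        self.prewarm.add(name)

    def assign(self, name, router_id):
        # (b) label on the initial transition from stemcell to warm
        self.store.patch_labels(
            name, {LABEL_STATE: "warm", LABEL_ROUTER: router_id}
        )
        self.prewarm.discard(name)
        self.assigned[name] = router_id

    @classmethod
    def recover(cls, manager_id, store):
        """Rebuild the in-memory state of a previous instance from labels only."""
        manager = cls(manager_id, store)
        for pod in store.list_pods({LABEL_MANAGER: manager_id}):
            if pod.labels.get(LABEL_STATE) == "prewarm":
                manager.prewarm.add(pod.name)
            else:
                manager.assigned[pod.name] = pod.labels[LABEL_ROUTER]
        return manager
```

After a restart, `ContainerManager.recover("cm-0", store)` would return an instance whose `prewarm` and `assigned` fields match what the old instance had written as labels. The ContainerRouter state I mentioned above is not covered here; that would need the separate query against the routers.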