On Mon, Jun 27, 2011 at 12:00:28PM +0200, Dominik Klein wrote:
> On 06/27/2011 11:09 AM, Dejan Muhamedagic wrote:
> > Hi Dominik,
> >
> > On Fri, Jun 24, 2011 at 03:50:40PM +0200, Dominik Klein wrote:
> >> Hi Dejan,
> >>
> >> this way, the cluster never learns that it can't start a resource
> >> on that node.
> >
> > This resource depends on shared storage. So, the cluster won't try
> > to start it unless the shared storage resource is already running.
> > This is something that needs to be specified using either a
> > negative preference location constraint or an asymmetric cluster.
> > There's no need for yet another mechanism (the extra parameter)
> > built into the resource agent. It's really overkill.
>
> As requested on IRC, I'll describe my setup and explain why I think
> this is a regression.
>
> It is a 2-node cluster with a bunch of DRBD devices.
>
> Each /dev/drbdXX is used as the block device of a VM. The VMs'
> configuration files are not on shared storage but have to be copied
> manually.
>
> So it happened that, during configuration of a VM, the admin forgot
> to copy the configuration file to node2. The machine's DRBD was
> configured, though. So the cluster decided to promote the VM's DRBD
> on node2 and then start the master-colocated and ordered VM.
>
> With the agent before the mentioned patch, during the probe of a
> newly configured resource, the cluster would have learned that the VM
> is not available on one of the nodes (ERR_INSTALLED), so it would
> never start the resource there.
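Just so that we are sure we are talking about the same change: in the
agent's probe path it boils down to roughly this (a sketch only, not
the actual agent code; OCF_RESKEY_config stands for whatever parameter
points at the VM configuration file):

    # Sketch of the behaviour being discussed, not the agent's real code.
    if [ ! -r "${OCF_RESKEY_config}" ]; then
            if ocf_is_probe; then
                    # After the patch: a probe on a node without the
                    # config simply reports "not running" ...
                    exit $OCF_NOT_RUNNING
            fi
            # ... whereas before the patch (and still outside probes)
            # the agent reported "not installed", which keeps the CRM
            # from ever starting the resource on this node.
            exit $OCF_ERR_INSTALLED
    fi
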
This is exactly the problem with shared storage setups, where an exit
code like ERR_INSTALLED can prevent a resource from ever being started
on a node which is otherwise perfectly capable of running it.

> Now it sees NOT_RUNNING on all nodes during probe and may decide to
> start the VM on a node where it cannot run.

But really, if a resource can _never_ run on a node, then there should
be a negative location constraint, or the cluster should be set up as
asymmetric. Now, I understand that in your case it is actually the
administrator's fault.

> That, with the current version of the agent, leads to a failed start,
> a failed stop during recovery and therefore an unnecessary stonith
> operation.
>
> With Dejan's patch, it would still see NOT_RUNNING during probe, but
> at least the stop would succeed. So the difference from the old
> version would be an unnecessary failed start on the node that does
> not have the VM, but it would not harm the node, and I'd be fine with
> applying that patch.
>
> There is a case, though, that might keep the VM from running for some
> time: if start-failure-is-fatal is false, then we would have
> $migration-threshold "failed start/successful stop" iterations while
> the VM's service would not be running.
>
> Of course I do realize that the initial fault is a human one, but the
> cluster used to protect against this and no longer does, and that's
> why I think this is a regression.
>
> I think the correct way to fix this is to still return ERR_INSTALLED
> during probe unless the cluster admin configures that the VM's config
> is on shared storage. Finding out about resource states on different
> nodes is what the probe was designed to do, was it not? And we work
> around that in this resource agent just to support certain setups.

This particular setup is a special case of shared storage. The images
are on shared storage, but the configurations are local. I think that
you really need to make sure that the configurations are present where
they need to be. Best would be to keep the configuration on the storage
along with the corresponding VM image. Since you're using a raw device
as the image, that's obviously not possible. Otherwise, use csync2 or
similar to keep the files in sync.
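For the record, a csync2 setup for this could look roughly like the
following (the group name, host names and the libvirt configuration
path are just examples, adjust them to your environment):

    # /etc/csync2.cfg -- keep the VM definitions identical on both nodes
    group vmconfigs {
            host node1;
            host node2;
            key /etc/csync2.key_vmconfigs;
            include /etc/libvirt/qemu;
    }

Then running "csync2 -xv" whenever a VM definition is added or changed
(from cron, or as a step in your provisioning procedure) keeps both
nodes identical.

Cheers,

Dejan

> Regards
> Dominik

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/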