I'm running 2.1.2 on two nodes. I want heartbeat to manage 22 VMware VMs across the two nodes. In terms of heartbeat resources, each VM is:

- a drbd OCF master/slave resource
- a Filesystem OCF resource (XFS)
- a VM OCF resource (my own OCF script)
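Roughly, one of these sets looks like the sketch below, following the 2.1-era CIB syntax from the HowTov2 page. All ids, the VM agent's provider, and the Filesystem parameter values here are illustrative placeholders, not my exact config:

```xml
<!-- Simplified sketch of one set; ids and parameter values
     are illustrative placeholders. -->
<master_slave id="ms-drbd1">
  <meta_attributes id="ms-drbd1-meta">
    <attributes>
      <nvpair id="ms-drbd1-clone-max" name="clone_max" value="2"/>
      <nvpair id="ms-drbd1-clone-node-max" name="clone_node_max" value="1"/>
      <nvpair id="ms-drbd1-master-max" name="master_max" value="1"/>
      <nvpair id="ms-drbd1-notify" name="notify" value="yes"/>
    </attributes>
  </meta_attributes>
  <primitive id="drbd1" class="ocf" provider="heartbeat" type="drbd">
    <instance_attributes id="drbd1-ia">
      <attributes>
        <nvpair id="drbd1-res" name="drbd_resource" value="vm1"/>
      </attributes>
    </instance_attributes>
  </primitive>
</master_slave>

<group id="group_vm1">
  <primitive id="vm1-fs" class="ocf" provider="heartbeat" type="Filesystem">
    <instance_attributes id="vm1-fs-ia">
      <attributes>
        <nvpair id="vm1-fs-dev" name="device" value="/dev/drbd1"/>
        <nvpair id="vm1-fs-dir" name="directory" value="/vms/vm1"/>
        <nvpair id="vm1-fs-type" name="fstype" value="xfs"/>
      </attributes>
    </instance_attributes>
  </primitive>
  <primitive id="vm1-vm" class="ocf" provider="custom" type="vmware"/>
</group>
```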
I'm looking for advice on how to group these resources, since they all depend on each other. I'm testing with a config very similar to the DRBD/HowTov2 example at http://www.linux-ha.org/DRBD/HowTov2. I have a drbd master/slave resource (ms-drbd1) and a group (group_vm1). The group contains a filesystem resource (vm1-fs) and my VM resource (vm1-vm). I have an rsc_order constraint saying group_vm1 should only run where ms-drbd1 has been promoted, an rsc_colocation constraint saying group_vm1 follows ms-drbd1, and finally a location constraint saying ms-drbd1 prefers node1.

When testing this with two VMs (adding ms-drbd2 and group_vm2, preferring node2), things don't always work out as planned. Sometimes, with only one node running, if I "/etc/init.d/heartbeat start" on node1, ms-drbd1 and group_vm1 will try to migrate over to node1, fail, then return to node2. It's not clear to me what's failing. I feel like I sometimes end up in a state where the drbd resource starts but the filesystem doesn't, and therefore the VM resource doesn't either. Maybe I need a delay between resource starts? Should I be grouping these differently?

I'm going to be creating 22 of these "group of three" resources, with three constraints for each. Is there an easier set of XML to configure this? I want half to prefer one node, and half to prefer the other.

Finally, if both nodes are up and group_vm1 failed to start on a node, will it retry later? Actually, that's more important to me in the single-node case, as there is no other place for the failed resource to live.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
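Rather than hand-writing 66 near-identical constraint blocks, my current thinking is to generate them. This is a hypothetical sketch, not tested against a live cluster: resource names (ms-drbdN, group_vmN) follow my naming above, and the constraint attribute names follow the 2.1-era CIB syntax from the HowTov2 example, so they should be checked against the DTD before loading with cibadmin.

```shell
#!/bin/sh
# Sketch: emit the three per-VM constraints (order, colocation,
# location) for 22 VMs, alternating the preferred node.
gen_constraints() {
    i=$1    # VM number
    node=$2 # preferred node for this VM's DRBD master
    cat <<EOF
<rsc_order id="vm$i-after-drbd" from="group_vm$i" action="start"
           to="ms-drbd$i" to_action="promote" type="after"/>
<rsc_colocation id="vm$i-on-master" from="group_vm$i"
                to="ms-drbd$i" to_role="master" score="INFINITY"/>
<rsc_location id="vm$i-prefers-$node" rsc="ms-drbd$i">
  <rule id="vm$i-prefers-$node-rule" score="100">
    <expression id="vm$i-prefers-$node-expr"
                attribute="#uname" operation="eq" value="$node"/>
  </rule>
</rsc_location>
EOF
}

# Odd-numbered VMs prefer node1, even-numbered VMs prefer node2.
for i in $(seq 1 22); do
    if [ $((i % 2)) -eq 1 ]; then node=node1; else node=node2; fi
    gen_constraints "$i" "$node"
done
```

If this pattern is sound, the same loop could generate the master_slave and group definitions too. But I'd still rather hear there's a less verbose way to express "group X runs where master-slave X is promoted" 22 times over.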
