Re: [Linux-HA] How to painlessly change depended upon resource groups?
Ferenc Wagner <wf...@niif.hu> writes:

> Arnold Krille <arn...@arnoldarts.de> writes:
>
>> If I understand you correctly, the problem only arises when adding new
>> bridges while the cluster is running. And your VMs will (rightfully)
>> get restarted when you add a non-running bridge resource to the cloned
>> dependency group.
>
> Exactly.
>
>> You might be able to circumvent this problem: define the bridge as a
>> single cloned resource and start it. When it runs on all nodes,
>
> I wonder if there's a better way to check this than doing periodic set
> compares between the output of crm_node --partition and crm_resource
> --quiet --resource=temporary-bridge-clone --locate...
>
>> remove the clone for the single resource and add the resource to your
>> dependency group in one single edit. With the commit, the cluster
>> should see that the new resource in the group is already running and
>> thus not affect the VMs.
>
> Thanks, this sounds like a plan, I'll test it!

It turns out this works indeed, but the interface is still unfortunate: after removing the temporary clone and moving the new bridge into the group, crm_simulate still predicts the restart of all VMs. This does not actually happen, because the new resource is already running, but it's a pity that this isn't shown beforehand. Anyway, this trick at least gets rid of the manual starting work.

-- 
Cheers,
Feri.

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
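The periodic "set compare" Feri mentions could be sketched as a small helper. This is only an illustration of the comparison logic: the node lists below are hypothetical sample values, whereas in real use they would come from `$(crm_node --partition)` and `$(crm_resource --quiet --resource=temporary-bridge-clone --locate)`.

```shell
#!/usr/bin/env bash
# Sketch: compare two node lists as sets to decide whether the
# temporary bridge clone is already active on every partition member.

# same_set LIST1 LIST2 -> succeeds iff both whitespace-separated lists
# contain the same node names, ignoring order and duplicates.
same_set() {
    [ "$(tr ' ' '\n' <<<"$1" | sort -u)" = "$(tr ' ' '\n' <<<"$2" | sort -u)" ]
}

partition="node1 node2 node3"   # sample; really: $(crm_node --partition)
running="node3 node1"           # sample; really: $(crm_resource ... --locate)

if same_set "$partition" "$running"; then
    echo "clone active on all partition members"
else
    echo "clone not yet active everywhere"
fi
```

Looping over this until `same_set` succeeds would automate the "when it runs on all nodes" check, at the cost of polling.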
Re: [Linux-HA] How to painlessly change depended upon resource groups?
Hi,

On Fri, 23 Aug 2013 10:41:21 +0200 Ferenc Wagner <wf...@niif.hu> wrote:
> Arnold Krille <arn...@arnoldarts.de> writes:
>> On a side-note: I made the (sad) experience that it's easier to
>> configure such stuff outside of pacemaker/corosync and use the
>> cluster only for the reliable HA things.
>
> What do you mean by reliable? What did you experience (if you can put
> it in a few sentences)?

I used pacemaker to manage several things:

1) DRBD and VMs and the dependencies between them, plus the cloned
   libvirt service.
2) DRBD for a directory of configuration for central services like
   dhcp, named and an apache2 for some services. These were in one
   group depending on the DRBD master.

The second turned out to be less optimal than I thought: every time you want to change something in the dhcp or named configuration, you first have to check on which node it's active. Then when you restart dhcp, it has to be done through pacemaker. And you shouldn't have a shell sitting in the shared-config directory, as otherwise pacemaker might decide to move the resource on the restart, fail because the directory can't be unmounted, and then fail the resource and/or fence the node. With disastrous results for the terminal-server VM running on that node, affecting all the co-workers...

I have already dropped the central apache2 and named and replaced them with instances configured by chef to be the same on all nodes. Additionally, those services aren't controlled by pacemaker anymore, so named is still available when I shut down the cluster for maintenance. And being able to run named on all three nodes is better than running it only on the two nodes sharing the configuration DRBD. The next thing to do is have dhcp configured by chef and taken out of this group. Then the group will be empty :-)

Additionally, pacemaker gets slower in its behaviour the more resources you have. And when it's 10-15 virtual machines, each with one or more DRBD resources... well, currently it's 63 resources pacemaker is watching.
tl;dr: pacemaker is pretty cool at making sure the defined resources are running. But the simpler your resources are, the better. One VM depending on one or two DRBD masters is great. Synchronizing configuration and managing complicated dependencies can be done with pacemaker, but there are better things to spend your time on. Configuring several systems into a sane state is more a job for configuration management such as chef, puppet or at least csync2 (to sync the configs).

> I'm not a big fan of configuration management systems, but they
> probably have their place. None is present in the current setup,
> though, so setting one up for bridge configuration seemed more
> complicated than extending the cluster. We'll see...

While automation has its advantages just by the fact that it is automation, what made it appealing for us is the repeatability. If the automation has proven to work once in your setup, it's easily portable to the client's setup. Even more so if the initial proof of working was in your test setup and then re-used in your own production setup. And then re-used on the client's network... Take a look at Chef or Puppet or Ansible, it's worth the time.

Have fun,

Arnold

PS: Sorry, it became a bit longer.
Re: [Linux-HA] How to painlessly change depended upon resource groups?
Arnold Krille <arn...@arnoldarts.de> writes:

> If I understand you correctly, the problem only arises when adding new
> bridges while the cluster is running. And your VMs will (rightfully)
> get restarted when you add a non-running bridge resource to the cloned
> dependency group.

Exactly.

> You might be able to circumvent this problem: define the bridge as a
> single cloned resource and start it. When it runs on all nodes, remove
> the clone for the single resource and add the resource to your
> dependency group in one single edit. With the commit, the cluster
> should see that the new resource in the group is already running and
> thus not affect the VMs.

Thanks, this sounds like a plan, I'll test it!

> On a side-note: I made the (sad) experience that it's easier to
> configure such stuff outside of pacemaker/corosync and use the cluster
> only for the reliable HA things.

What do you mean by reliable? What did you experience (if you can put it in a few sentences)?

> Configuring several systems into a sane state is more a job for
> configuration management such as chef, puppet or at least csync2 (to
> sync the configs).

I'm not a big fan of configuration management systems, but they probably have their place. None is present in the current setup, though, so setting one up for bridge configuration seemed more complicated than extending the cluster. We'll see...

-- 
Regards,
Feri.
[Linux-HA] How to painlessly change depended upon resource groups?
Hi,

I built a Pacemaker cluster to manage virtual machines (VMs). Storage is provided by cLVM volume groups, network access by software bridges. I wanted to avoid maintaining precise VG and bridge dependencies, so I created two cloned resource groups:

    group storage dlm clvmd vg-vm vg-data
    group network br150 br151

I cloned these groups, and thus every VM resource uniformly got these two dependencies only, which makes it easy to add new VM resources:

    colocation cl-elm-network inf: vm-elm network-clone
    colocation cl-elm-storage inf: vm-elm storage-clone
    order o-elm-network inf: network-clone vm-elm
    order o-elm-storage inf: storage-clone vm-elm

Of course the network and storage groups do not even model their internal dependencies correctly, as the different VGs and bridges are independent and unordered, but this is not a serious limitation in my case.

The problem is: if I want to extend, for example, the network group with a new bridge, the cluster wants to restart all running VM resources while starting the new bridge. I get this information by changing a shadow copy of the CIB and running crm_simulate --run --live-check on it. This is perfectly understandable due to the strict ordering and colocation constraints above, but undesirable in these cases.

The actual restarts are avoidable by putting the cluster in maintenance mode beforehand, starting the bridge on each node manually, changing the configuration, and moving the cluster out of maintenance mode. But this is quite a chore, and I did not find a way to make sure everything would be fine, like seeing the planned cluster actions after the probes for the new bridge resource have run (when there should not be anything left to do). Is there a way to regain my peace of mind during such operations? Or is there at least a way to order the cluster to start the new bridge clones, so that I don't have to invoke the resource agent by hand on each node, thus reducing possible human mistakes?
The bridge configuration was moved into the cluster to avoid having to maintain it in each node's OS separately. The network and storage resource groups also provide a nice, concise status output with only the VM resources expanded. These are bonuses, but not requirements; if sensible maintenance is not achievable with this setup, everything is subject to change. Actually, I'm starting to feel that simplifying the VM dependencies may not be viable in the end, but I wanted to ask for outsider ideas before overhauling the whole configuration.

-- 
Thanks in advance,
Feri.
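The shadow-CIB check described above could look roughly like the following. This is a hedged sketch: the commands are standard Pacemaker/crmsh tools, but exact flags vary between versions, and "br152" is a hypothetical name for the new bridge resource. It is not directly runnable outside a live cluster, so treat it as an outline rather than a recipe.

```shell
# 1. Work on a shadow copy of the CIB instead of the live one.
crm_shadow --create new-bridge      # spawns a shell with CIB_shadow set

# 2. Edit the configuration in the shadow CIB, e.g. append the
#    hypothetical bridge br152 to the cloned network group.
crm configure edit network

# 3. Ask the policy engine what it would do with this configuration
#    against the live cluster state (this is where the unwanted VM
#    restarts show up).
crm_simulate --run --live-check

# 4. If the planned transition looks safe, push the shadow CIB to the
#    cluster; otherwise just throw it away.
crm_shadow --commit new-bridge      # or: crm_shadow --delete new-bridge
```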
Re: [Linux-HA] How to painlessly change depended upon resource groups?
Hi,

On Thu, 22 Aug 2013 18:22:50 +0200 Ferenc Wagner <wf...@niif.hu> wrote:
> I built a Pacemaker cluster to manage virtual machines (VMs). Storage
> is provided by cLVM volume groups, network access is provided by
> software bridges. I wanted to avoid maintaining precise VG and bridge
> dependencies, so I created two cloned resource groups:
>
>     group storage dlm clvmd vg-vm vg-data
>     group network br150 br151
>
> [...]
>
> The problem is, if I want to extend for example the network group by a
> new bridge, the cluster wants to restart all running VM resources
> while starting the new bridge. [...] The actual restarts are avoidable
> by putting the cluster in maintenance mode beforehand, starting the
> bridge on each node manually, changing the configuration and moving
> the cluster out of maintenance mode, but this is quite a chore [...]
> Is there a way to regain my peace of mind during such operations?
> Or is there at least a way to order the cluster to start the new
> bridge clones so that I don't have to invoke the resource agent by
> hand on each node, thus reducing possible human mistakes?
>
> [...]
>
> Actually, I'm starting to feel that simplifying the VM dependencies
> may not be viable in the end, but wanted to ask for outsider ideas
> before overhauling the whole configuration.

If I understand you correctly, the problem only arises when adding new bridges while the cluster is running. And your VMs will (rightfully) get restarted when you add a non-running bridge resource to the cloned dependency group.

You might be able to circumvent this problem: define the bridge as a single cloned resource and start it. When it runs on all nodes, remove the clone for the single resource and add the resource to your dependency group in one single edit. With the commit, the cluster should see that the new resource in the group is already running and thus not affect the VMs.

On a side-note: I made the (sad) experience that it's easier to configure such stuff outside of pacemaker/corosync and use the cluster only for the reliable HA things. Configuring several systems into a sane state is more a job for configuration management such as chef, puppet or at least csync2 (to sync the configs).

Have fun,

Arnold
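Arnold's procedure might look roughly like this in crm shell syntax. This is a sketch under assumptions: the group and clone names follow the thread, while "br152", its resource agent, and "temporary-bridge-clone" are illustrative placeholders, not names from a real configuration.

```shell
# 1. Add the new bridge as an independent cloned resource and let the
#    cluster start it everywhere (RA name is a placeholder).
crm configure primitive br152 <your-bridge-RA>
crm configure clone temporary-bridge-clone br152

# 2. Wait until the clone reports as started on all nodes
#    (watch crm_mon, or compare crm_node --partition with
#    crm_resource --quiet --resource=temporary-bridge-clone --locate).

# 3. In a single edit + commit, delete the temporary clone and append
#    br152 to the cloned network group, so the cluster finds the new
#    group member already running and leaves the VMs alone:
crm configure edit
#   delete temporary-bridge-clone
#   group network br150 br151 br152
```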