> On 13 Feb 2015, at 8:38 pm, Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote:
>
> Hello!
>
> I have some questions on pacemaker's resource migration. We have a Xen host
> that has some problems (still to be investigated) that cause some VM disks
> not to be ready for use.
>
> When trying to migrate a VM from the bad host to a good host through
> pacemaker, migration seemed to hang. At some stage the "source VM" was no
> longer present on the bad host (Unable to find domain 'v09'), but pacemaker
> still tried a migration:
> crmd[6779]: notice: te_rsc_command: Initiating action 100: migrate_from prm_xen_v09_migrate_from_0 on h05
> Only after the timeout did CRM realize that there was a problem:
> crmd[6779]: warning: status_from_rc: Action 100 (prm_xen_v09_migrate_from_0) on h05 failed (target: 0 vs. rc: 1): Error
> After that CRM still tried a stop on the "source host" (h10) (and on the
> destination host):
> crmd[6779]: notice: te_rsc_command: Initiating action 98: stop prm_xen_v09_stop_0 on h10
> crmd[6779]: notice: te_rsc_command: Initiating action 26: stop prm_xen_v09_stop_0 on h05
>
> Q1: Is this the way it should work?

Mostly, but the agent should have detected the condition earlier and returned an error (instead of timing out).

>
> Before that we had the same situation (the bad host had been set to
> "standby") when someone, tired of waiting so long, destroyed the affected Xen
> VMs on the source host while the cluster was migrating. Eventually the VMs
> came up (restarted instead of being live migrated) on the good hosts.
>
> Then we shut down OpenAIS on the bad host, installed updates and rebooted the
> bad host (during reboot OpenAIS was started (still standby)).
> To my surprise pacemaker thought the VMs were still running on the bad host
> and initiated a migration.

That would be coming from the resource agent.

> As there were no source VMs on the bad host, but all the affected VMs were
> running on some good host, CRM shut down the VMs on the good hosts, just to
> restart them.
>
> Q2: Is this expected behavior? I can hardly believe it!

Nope, fix the agent :)

>
> Software is SLES11 SP3 with pacemaker-1.1.11-0.7.53 (and related) on all
> hosts.
>
> Regards,
> Ulrich

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems