> On 13 Feb 2015, at 8:38 pm, Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> 
> wrote:
> 
> Hello!
> 
> I have some questions on Pacemaker's resource migration. We have a Xen host 
> that has some problems (still to be investigated) that cause some VM disks 
> not to be ready for use.
> 
> When trying to migrate a VM from the bad host to a good host through 
> Pacemaker, the migration seemed to hang. At some point the "source VM" was no 
> longer present on the bad host (Unable to find domain 'v09'), but Pacemaker 
> still tried a migration:
> crmd[6779]:   notice: te_rsc_command: Initiating action 100: migrate_from 
> prm_xen_v09_migrate_from_0 on h05
> Only after the timeout did CRM realize that there was a problem:
> crmd[6779]:  warning: status_from_rc: Action 100 (prm_xen_v09_migrate_from_0) 
> on h05 failed (target: 0 vs. rc: 1): Error
> After that, CRM still tried a stop on the "source host" (h10) (and on the 
> destination host):
> crmd[6779]:   notice: te_rsc_command: Initiating action 98: stop 
> prm_xen_v09_stop_0 on h10
> crmd[6779]:   notice: te_rsc_command: Initiating action 26: stop 
> prm_xen_v09_stop_0 on h05
> 
> Q1: Is this the way it should work?

Mostly, but the agent should have detected the condition earlier and returned 
an error (instead of timing out). 
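
To illustrate what "detect the condition earlier" could look like, here is a 
minimal sketch in Python rather than the agent's own shell. It is not the 
actual Xen agent code: migrate_from, domain_on_target and domain_on_source are 
hypothetical names, and the two callables stand in for whatever checks the 
agent can really perform. Only the return codes (0 for success, 1 for a 
generic error) are the standard OCF values.

```python
import time

OCF_SUCCESS = 0       # standard OCF return code: action succeeded
OCF_ERR_GENERIC = 1   # standard OCF return code: generic failure


def migrate_from(domain_on_target, domain_on_source, timeout=60.0,
                 interval=1.0, sleep=time.sleep, clock=time.monotonic):
    """Wait for the domain to arrive locally; fail fast if the source lost it.

    domain_on_target / domain_on_source are hypothetical callables returning
    True while the domain exists on the respective host.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if domain_on_target():
            return OCF_SUCCESS
        if not domain_on_source():
            # The domain vanished from the source. Re-check the target once,
            # because the migration may have completed between the two checks.
            if domain_on_target():
                return OCF_SUCCESS
            # Neither host has the domain: report the error immediately
            # instead of waiting for the operation timeout.
            return OCF_ERR_GENERIC
        sleep(interval)
    # Deadline reached without the domain appearing on the target.
    return OCF_ERR_GENERIC
```

With a check like this, a case such as "Unable to find domain 'v09'" on the 
source would end the action right away with rc 1 rather than after the full 
timeout.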

> 
> Before that we had the same situation (the bad host had been set to 
> "standby") when someone, tired of waiting so long, destroyed the affected Xen 
> VMs on the source host while the cluster was migrating. Eventually the VMs 
> came up (restarted instead of being live migrated) on the good hosts.
> 
> Then we shut down OpenAIS on the bad host, installed updates and rebooted the 
> bad host (during reboot OpenAIS was started (still standby)).
> To my surprise, Pacemaker thought the VMs were still running on the bad host 
> and initiated a migration.

That would be coming from the resource agent.

> As there were no source VMs on the bad host, but all the affected VMs were 
> running on some good host, CRM shut down the VMs on the good hosts, just to 
> restart them.
> 
> Q2: Is this expected behavior? I can hardly believe it!

Nope, fix the agent :)
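
As a rough sketch of the behaviour to aim for (Python for brevity, not the 
agent's real code): the status check should report "not running" whenever the 
domain is absent on the node being checked. The function name probe and the 
running_domains argument are hypothetical; the return codes 0 and 7 are the 
standard OCF values for "running" and "not running".

```python
OCF_SUCCESS = 0       # standard OCF return code: resource is running
OCF_NOT_RUNNING = 7   # standard OCF return code: resource is cleanly stopped


def probe(domain_name, running_domains):
    """Return the OCF status for domain_name on the local node.

    running_domains is a hypothetical collection of domain names that the
    hypervisor currently reports on this host.
    """
    if domain_name in running_domains:
        return OCF_SUCCESS
    # No such domain here: say so, so the cluster does not assume the
    # resource is still active on this node.
    return OCF_NOT_RUNNING
```

For example, probe("v09", set()) on the freshly rebooted host gives 7, while 
probe("v09", {"v09"}) on the host actually running the VM gives 0.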

> 
> Software is SLES11 SP3 with pacemaker-1.1.11-0.7.53 (and related) on all 
> hosts.
> 
> Regards,
> Ulrich
> 
> 
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
