[ClusterLabs] Resources suddenly get target-role="stopped"

Klechomir Fri, 04 Dec 2015 08:25:35 -0800

Hi list,
My issue is the following:

I have very stable cluster, using Corosync 2.1.0.26 and Pacemaker 1.1.8
(observed the same problem with Corosync 2.3.5  & Pacemaker 1.1.13-rc3)

Bumped on this issue when started playing with VirtualDomain resources,but this seems to be unrelated to the RA.

The problem is that without apparent reason a resource getstarget-role="Stopped". This happens after (successful) migration, orafter failover., or after VM restart .

My tests showed that changing the resource name fixes this problem, butthis seems to be a temporary workaround.


The resource configuration is:
primitive VMA_VM1 ocf:heartbeat:VirtualDomain \

params config="/NFSvolumes/CDrive1/VM1/VM1.xml"hypervisor="qemu:///system" migration_transport="tcp" \

        meta allow-migrate="true" target-role="Started" \
        op start interval="0" timeout="120s" \
        op stop interval="0" timeout="120s" \
        op monitor interval="10" timeout="30" depth="0" \
        utilization cpu="1" hv_memory="925"
order VM_VM1_after_Filesystem_CDrive1 inf: Filesystem_CDrive1 VMA_VM1

Here is the log from one such stop, after successful migration with "crmmigrate resource VMA_VM1":

Dec 04 15:18:22 [3818929] CLUSTER-1 crmd: debug: cancel_op:Cancelling op 5564 for VMA_VM1 (VMA_VM1:5564)Dec 04 15:18:22 [4434] CLUSTER-1 lrmd: info:cancel_recurring_action: Cancelling operation VMA_VM1_monitor_10000Dec 04 15:18:23 [3818929] CLUSTER-1 crmd: debug: cancel_op:Op 5564 for VMA_VM1 (VMA_VM1:5564): cancelledDec 04 15:18:23 [3818929] CLUSTER-1 crmd: debug:do_lrm_rsc_op: Performingkey=351:199:0:fb6e486a-023a-4b44-83cf-4c0c208a0f56 op=VMA_VM1_migrate_to_0VirtualDomain(VMA_VM1)[1797698]: 2015/12/04_15:18:23 DEBUG:Virtual domain VM1 is currently running.VirtualDomain(VMA_VM1)[1797698]: 2015/12/04_15:18:23 INFO: VM1:Starting live migration to CLUSTER-2 (using virsh--connect=qemu:///system --quiet migrate --live VM1qemu+tcp://CLUSTER-2/system ).Dec 04 15:18:24 [3818929] CLUSTER-1 crmd: info:process_lrm_event: LRM operation VMA_VM1_monitor_10000 (call=5564,status=1, cib-update=0, confirmed=false) CancelledDec 04 15:18:24 [3818929] CLUSTER-1 crmd: debug:update_history_cache: Updating history for 'VMA_VM1' with monitor opVirtualDomain(VMA_VM1)[1797698]: 2015/12/04_15:18:26 INFO: VM1:live migration to CLUSTER-2 succeeded.Dec 04 15:18:26 [4434] CLUSTER-1 lrmd: debug:operation_finished: VMA_VM1_migrate_to_0:1797698 - exited with rc=0Dec 04 15:18:26 [4434] CLUSTER-1 lrmd: notice:operation_finished: VMA_VM1_migrate_to_0:1797698 [2015/12/04_15:18:23 INFO: VM1: Starting live migration to CLUSTER-2(using virsh --connect=qemu:///system --quiet migrate --live VM1qemu+tcp://CLUSTER-2/system ). ]Dec 04 15:18:26 [4434] CLUSTER-1 lrmd: notice:operation_finished: VMA_VM1_migrate_to_0:1797698 [2015/12/04_15:18:26 INFO: VM1: live migration to CLUSTER-2 succeeded. ]Dec 04 15:18:27 [3818929] CLUSTER-1 crmd: debug:create_operation_update: do_update_resource: Updating resouceVMA_VM1 after complete migrate_to op (interval=0)Dec 04 15:18:27 [3818929] CLUSTER-1 crmd: notice:process_lrm_event: LRM operation VMA_VM1_migrate_to_0 (call=5697,rc=0, cib-update=89, confirmed=true) okDec 04 15:18:27 [3818929] CLUSTER-1 crmd: debug:update_history_cache: Updating history for 'VMA_VM1' withmigrate_to opDec 04 15:18:31 [3818929] CLUSTER-1 crmd: debug: cancel_op:Operation VMA_VM1:5564 already cancelledDec 04 15:18:31 [3818929] CLUSTER-1 crmd: debug:do_lrm_rsc_op: Performingkey=225:200:0:fb6e486a-023a-4b44-83cf-4c0c208a0f56 op=VMA_VM1_stop_0VirtualDomain(VMA_VM1)[1798719]: 2015/12/04_15:18:31 DEBUG:Virtual domain VM1 is not running: failed to get domain 'vm1' domainnot found: no domain with matching name 'vm1'VirtualDomain(VMA_VM1)[1798719]: 2015/12/04_15:18:31 INFO: DomainVM1 already stopped.Dec 04 15:18:31 [4434] CLUSTER-1 lrmd: debug:operation_finished: VMA_VM1_stop_0:1798719 - exited with rc=0Dec 04 15:18:31 [4434] CLUSTER-1 lrmd: notice:operation_finished: VMA_VM1_stop_0:1798719 [ 2015/12/04_15:18:31INFO: Domain VM1 already stopped. ]Dec 04 15:18:32 [3818929] CLUSTER-1 crmd: debug:create_operation_update: do_update_resource: Updating resouceVMA_VM1 after complete stop op (interval=0)Dec 04 15:18:32 [3818929] CLUSTER-1 crmd: notice:process_lrm_event: LRM operation VMA_VM1_stop_0 (call=5701, rc=0,cib-update=90, confirmed=true) okDec 04 15:18:32 [3818929] CLUSTER-1 crmd: debug:update_history_cache: Updating history for 'VMA_VM1' with stop opDec 04 15:20:58 [3818929] CLUSTER-1 crmd: debug:create_operation_update: build_active_RAs: Updating resouce VMA_VM1after complete stop op (interval=0)Dec 04 15:20:58 [3818929] CLUSTER-1 crmd: debug:create_operation_update: build_active_RAs: Updating resouce VMA_VM1after complete monitor op (interval=0)Dec 04 15:23:31 [1833996] CLUSTER-1 crm_resource: debug:process_orphan_resource: Detected orphan resource VMA_VM1 on CLUSTER-2


Any suggestions are welcome.

Best regards,
Klecho

_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

[ClusterLabs] Resources suddenly get target-role="stopped"

Reply via email to