Hi folks, I'm testing iface-bridge resource support on a Pacemaker cluster running Linux KVM on System Z.
pacemaker-1.1.13-10.el7_2.ibm.1.s390x
corosync-2.3.4-7.el7_2.ibm.1.s390x

I created an iface-bridge resource, but specified a non-existent bridge_slaves value, vlan1292 (i.e. vlan1292 doesn't exist).

[root@zs95kj VD]# date;pcs resource create br0_r1 ocf:heartbeat:iface-bridge bridge_name=br0 bridge_slaves=vlan1292 op monitor timeout="20s" interval="10s" --disabled
Wed Feb 1 17:49:16 EST 2017
[root@zs95kj VD]#
[root@zs95kj VD]# pcs resource show |grep br0
br0_r1 (ocf::heartbeat:iface-bridge): FAILED zs93kjpcs1
[root@zs95kj VD]#

As you can see, the resource was created, but failed to start on the target node zs93kjpcs1.

To my surprise, the target node zs93kjpcs1 was unceremoniously fenced.

pacemaker.log shows a fence (off) action initiated against that target node, "because of resource failure(s)":

Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:2719 ) debug: determine_op_status: br0_r1_stop_0 on zs93kjpcs1 returned 'not configured' (6) instead of the expected value: 'ok' (0)
Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:2602 ) warning: unpack_rsc_op_failure: Processing failed op stop for br0_r1 on zs93kjpcs1: not configured (6)
Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:3244 ) error: unpack_rsc_op: Preventing br0_r1 from re-starting anywhere: operation stop failed 'not configured' (6)
Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:2719 ) debug: determine_op_status: br0_r1_stop_0 on zs93kjpcs1 returned 'not configured' (6) instead of the expected value: 'ok' (0)
Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:2602 ) warning: unpack_rsc_op_failure: Processing failed op stop for br0_r1 on zs93kjpcs1: not configured (6)
Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:3244 ) error: unpack_rsc_op: Preventing br0_r1 from re-starting anywhere: operation stop failed 'not configured' (6)
Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:96 ) warning: pe_fence_node: Node zs93kjpcs1 will be fenced because of resource failure(s)

Thankfully, I was able to successfully create an iface-bridge resource when I changed the bridge_slaves value to an existing VLAN interface.

My main concern is: why would a failed bridge config operation warrant a node fence (off) action? Isn't it enough to just fail the resource and try another cluster node, or at most, give up if it can't be started / configured on any node?

Is there any way to control this harsh recovery action in the cluster?

Thanks much.

Scott Greenlese ... IBM KVM on System Z Solutions Test, Poughkeepsie, N.Y.
INTERNET: [email protected]
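
P.S. For anyone reproducing this, a pre-check along the following lines would catch a missing slave interface before the resource is created. It is only a sketch under stated assumptions: the helper name check_ifaces and the SYSFS_NET override are made up for illustration, it assumes interfaces show up under /sys/class/net as on a typical Linux system, and it only checks the node it runs on, not the node where the cluster will place the resource.

```shell
#!/bin/sh
# Hypothetical helper (not part of pcs or the resource agent): return
# non-zero if any interface named on the command line is missing locally.
# SYSFS_NET is an illustrative override for testing; the default is the
# usual sysfs location for network interfaces.
check_ifaces() {
    net_dir="${SYSFS_NET:-/sys/class/net}"
    missing=0
    for iface in "$@"; do
        if [ ! -e "$net_dir/$iface" ]; then
            echo "interface $iface not found under $net_dir" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use, left commented out so the sketch has no side effects:
# check_ifaces vlan1292 && pcs resource create br0_r1 ocf:heartbeat:iface-bridge \
#     bridge_name=br0 bridge_slaves=vlan1292 op monitor timeout="20s" interval="10s" --disabled
```

With the vlan1292 value from the failing case above, check_ifaces would print a message and return non-zero, so the pcs command after && would not run.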
_______________________________________________ Users mailing list: [email protected] http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
