Hi,

On Thu, Apr 16, 2009 at 10:11:26AM -0700, Ethan Bannister wrote:
>
> Perhaps someone may be able to give me a little insight on what I may be
> doing wrong. I would like to have DRBD promote on secondary machine when
> the Ethernet connection to the initiator on my SAN goes down. When I pull
> the cable or bring eth0 down which IPaddr resides on, this is what crm_mon
> shows me soon after:
>
> ============
> Last updated: Thu Apr 16 12:38:36 2009
> Current DC: init2.mydomain.com (1d3814dc-7928-4beb-99f6-c7ade09056a5) -
> partition with quorum
> Version: 1.0.3-b133b3f19797c00f9189f4b66b513963f9d25db9
> 4 Nodes configured, unknown expected votes
> 8 Resources configured.
> ============
>
> Online: [ san2.mydomain.com init2.mydomain.com init1.mydomain.com ]
> OFFLINE: [ san1.mydomain.com ]
>
> Resource Group: G_Target
>     R_IP_Target (ocf::heartbeat:IPaddr2): Started san2.mydomain.com
>     R_tgtd (ocf::acs:tgtdra): Started san2.mydomain.com
> Master/Slave Set: ms-drbd0
>     Masters: [ san2.mydomain.com ]
>     Stopped: [ drbd0:0 ] <---------- correct
> Master/Slave Set: ms-drbd1
>     Masters: [ san2.mydomain.com ]
>     Stopped: [ drbd1:1 ] <---------- incorrect
> Master/Slave Set: ms-drbd2
>     Masters: [ san2.mydomain.com ]
>     Stopped: [ drbd2:0 ] <---------- correct
> Clone Set: pingd
>     Started: [ init1.mydomain.com init2.mydomain.com san2.mydomain.com ]
>     Stopped: [ R_pingd:2 ]
>
> Failed actions:
>     drbd1:1_promote_0 (node=san2.mydomain.com, call=43, rc=1,
> status=complete): unknown error

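The failed promote of drbd1:1 on san2 is run by the lrmd, so whatever the drbd RA printed should show up in the lrmd log lines on that node. A minimal sketch for pulling them out; the function name is made up for illustration, and /var/log/messages is only an assumption about where your syslog puts the heartbeat/lrmd output:

```shell
# Hypothetical helper: print the lrmd log lines that mention drbd.
# The default path is an assumption; pass the file your syslog
# actually uses for heartbeat/lrmd messages as the first argument.
drbd_lrmd_lines() {
    logfile="${1:-/var/log/messages}"
    # Same pattern as suggested in this mail: lrmd.*drbd
    grep -E 'lrmd.*drbd' "$logfile"
}
```

Run it on san2.mydomain.com and look at the lines around the time of the failed promote.
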
Does drbd report any error in the logs (look for lrmd.*drbd)?
This looks like a resource or a drbd RA issue.

Thanks,

Dejan

> As you can see, drbd0 and drbd2 promote with no issues. But drbd1 is not
> promoting properly. I have checked my constraints, and I have tweaked out
> the start-delay settings, but nothing happens the way I would like. I have
> two initiators for redundancy as well. But I want the initiator to stay up
> if the network goes down on either target. This has been puzzling me for
> some time now. Any help would be greatly appreciated.
> Here is what I have for a crm cli config:
>
> node $id="cee46f54-d517-4e4d-b0b8-3076fbc5751b" san2.mydomain.com \
>     attributes standby="off"
> node $id="bde24914-1235-4dc4-8686-f05fd9e6a35e" san1.mydomain.com \
>     attributes standby="off"
> node $id="1d3814dc-7928-4beb-99f6-c7ade09056a5" init2.mydomain.com \
>     attributes standby="off"
> node $id="a058cd72-b27e-4593-ac7e-d79db0709c15" init1.mydomain.com \
>     attributes standby="off"
> primitive R_IP_Target ocf:heartbeat:IPaddr2 \
>     params ip="192.168.*.*" \
>     params nic="eth0" \
>     params iflabel="1" \
>     op monitor interval="30s"
> primitive R_tgtd ocf:acs:tgtdra \
>     op monitor interval="30s" \
>     op start interval="0" timeout="30s" start-delay="2s"
> primitive R_IP_Init ocf:heartbeat:IPaddr2 \
>     params ip="192.168.*.*" \
>     params nic="eth0" \
>     params iflabel="1" \
>     op monitor interval="30s"
> primitive R_iscsi ocf:heartbeat:iscsi \
>     params target="target1.mydomain.com:san.targets" \
>     params portal="192.168.*.*" \
>     op monitor interval="30s" \
>     op start interval="0" timeout="30s" start-delay="5s" \
>     meta is-managed="true"
> primitive R_LVM ocf:heartbeat:LVM \
>     params volgrpname="VolGroup01" \
>     op monitor interval="30s" \
>     op start interval="0" timeout="30s" start-delay="5s" \
>     meta is-managed="true"
> primitive R_Filesystem ocf:heartbeat:Filesystem \
>     params device="/dev/VolGroup01/LogVol00" \
>     params directory="/san_targets/www" \
>     params fstype="ext3" \
>     op monitor interval="30s" \
>     op start interval="0" timeout="30s" start-delay="5s"
> primitive R_NFS ocf:heartbeat:nfsserver \
>     params nfs_init_script="/etc/init.d/nfs" \
>     params nfs_notify_cmd="/sbin/rpc.statd" \
>     params nfs_shared_infodir="/san_targets/www/nfsinfo" \
>     op monitor interval="30s"
> primitive drbd0 ocf:heartbeat:drbd \
>     params drbd_resource="drbd0" \
>     op monitor interval="29s" role="Master" timeout="30s" \
>     op monitor interval="30s" role="Slave" timeout="30s" \
>     op start interval="0" timeout="30s" start-delay="10s"
> primitive drbd1 ocf:heartbeat:drbd \
>     params drbd_resource="drbd1" \
>     op monitor interval="29s" role="Master" timeout="30s" \
>     op monitor interval="30s" role="Slave" timeout="30s" \
>     op start interval="0" timeout="30s" start-delay="10s"
> primitive drbd2 ocf:heartbeat:drbd \
>     params drbd_resource="drbd2" \
>     op monitor interval="29s" role="Master" timeout="30s" \
>     op monitor interval="30s" role="Slave" timeout="30s" \
>     op start interval="0" timeout="30s" start-delay="10s"
> primitive R_pingd ocf:pacemaker:pingd
> primitive R_Failover_Alert_Init ocf:heartbeat:MailTo2 \
>     params sender="[email protected]" \
>     params email="[email protected],[email protected]" \
>     params subject="ACS Init"
> primitive R_Failover_Alert_Target ocf:heartbeat:MailTo2 \
>     params sender="[email protected]" \
>     params email="[email protected],[email protected]" \
>     params subject="ACS San"
> group G_Target R_IP_Target R_tgtd \
>     meta target-role="Started"
> group G_Init R_IP_Init R_iscsi R_LVM R_Filesystem R_NFS \
>     meta target-role="Stopped"
> ms ms-drbd0 drbd0 \
>     meta clone-max="2" notify="true" globally-unique="false"
>     target-role="Started"
> ms ms-drbd1 drbd1 \
>     meta clone-max="2" notify="true" globally-unique="false"
>     target-role="Started"
> ms ms-drbd2 drbd2 \
>     meta clone-max="2" notify="true" globally-unique="false"
>     target-role="Started"
> clone pingd R_pingd \
>     meta target-role="Started"
> clone Failover_Alert_Init R_Failover_Alert_Init \
>     meta clone-max="2" target-role="Stopped"
> clone Failover_Alert_Target R_Failover_Alert_Target \
>     meta clone-max="2" target-role="Stopped"
> location pingd-node-1 pingd 500: init1.mydomain.com
> location pingd-node-2 pingd 500: init2.mydomain.com
> location pingd-node-3 pingd 500: san1.mydomain.com
> location pingd-node-4 pingd 500: san2.mydomain.com
> location ms-drbd0-pref-1 ms-drbd0 200: san1.mydomain.com
> location ms-drbd0-pref-2 ms-drbd0 100: san2.mydomain.com
> location ms-drbd1-pref-1 ms-drbd1 200: san1.mydomain.com
> location ms-drbd1-pref-2 ms-drbd1 100: san2.mydomain.com
> location ms-drbd2-pref-1 ms-drbd2 200: san1.mydomain.com
> location ms-drbd2-pref-2 ms-drbd2 100: san2.mydomain.com
> location G_Target-pref-1 G_Target 200: san1.mydomain.com
> location G_Target-pref-2 G_Target 100: san2.mydomain.com
> location G_Init-pref-1 G_Init 200: init1.mydomain.com
> location G_Init-pref-2 G_Init 100: init2.mydomain.com
> location Failover-Alert-node1 Failover_Alert_Init 200: init1.mydomain.com
> location Failover-Alert-node2 Failover_Alert_Init 100: init2.mydomain.com
> location Failover-Alert-node3 Failover_Alert_Target 200: san1.mydomain.com
> location Failover-Alert-node4 Failover_Alert_Target 100: san2.mydomain.com
> colocation G_Target-on-ms-drbd0 inf: G_Target ms-drbd0:Master
> colocation G_Target-on-ms-drbd1 inf: G_Target ms-drbd1:Master
> colocation G_Target-on-ms-drbd2 inf: G_Target ms-drbd2:Master
> order ms-drbd0-before-ms-drbd1 inf: ms-drbd0:promote ms-drbd1:promote
> order ms-drbd1-before-ms-drbd2 inf: ms-drbd1:promote ms-drbd2:promote
> order ms-drbd2-before-G_Target inf: ms-drbd2:promote G_Target:start
> order G_Target-before-G_Init inf: G_Target:start G_Init:start
> property $id="cib-bootstrap-options" \
>     dc-version="1.0.3-b133b3f19797c00f9189f4b66b513963f9d25db9" \
>     stonith-enabled="false" \
>     stonith-action="reboot" \
>     stop-orphan-resources="true" \
>     stop-orphan-actions="true" \
>     symmetric-cluster="false" \
>     last-lrm-refresh="1239899583" \
>     default-resource-stickiness="INFINITY"
>
> Any ideas?
> --
> View this message in context:
> http://www.nabble.com/DRBD-does-not-switch-resources-to-other-node-properly-tp23082432p23082432.html
> Sent from the Linux-HA mailing list archive at Nabble.com.
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
