On 2013-05-15 20:44, Andrew Widdersheim wrote:
> Sorry to bring up old issues but I am having the exact same problem as the
> original poster. A simultaneous disconnect on my two node cluster causes the
> resources to start to transition to the other node but mid flight the
> transition is aborted and resources are started again on the original node
> when the cluster realizes connectivity is same between the two nodes.
>
> I have tried various dampen settings without having any luck. Seems like the
> nodes report the outages at slightly different times which results in a
> partial transition of resources instead of waiting to know the connectivity
> of all of the nodes in the cluster before taking action which is what I would
> have thought dampen would help solve.
>
You have some logs for us?

> Ideally the cluster wouldn't start the transition if another cluster node is
> having a connectivity issue as well and connectivity status is shared between
> all cluster nodes. Find my configuration below. Let me know there is
> something I can change to fix or if this behavior is expected.
>
> primitive p_drbd ocf:linbit:drbd \
>         params drbd_resource="r1" \
>         op monitor interval="30s" role="Slave" \
>         op monitor interval="10s" role="Master"
> primitive p_fs ocf:heartbeat:Filesystem \
>         params device="/dev/drbd/by-res/r1" directory="/drbd/r1"
> fstype="ext4" options="noatime" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="180s" \
>         op monitor interval="30s" timeout="40s"
> primitive p_mysql ocf:heartbeat:mysql \
>         params binary="/usr/libexec/mysqld" config="/drbd/r1/mysql/my.cnf"
> datadir="/drbd/r1/mysql" \
>         op start interval="0" timeout="120s" \
>         op stop interval="0" timeout="120s" \
>         op monitor interval="30s" \
>         meta target-role="Started"
> primitive p_ping ocf:pacemaker:ping \
>         params host_list="192.168.5.1" dampen="30s" multiplier="1000"
> debug="true" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="60s" \
>         op monitor interval="5s" timeout="10s"
> group g_mysql_group p_fs p_mysql \
>         meta target-role="Started"
> ms ms_drbd p_drbd \
>         meta notify="true" master-max="1" clone-max="2" target-role="Started"
> clone cl_ping p_ping
> location l_connected g_mysql \
>         rule $id="l_connected-rule" pingd: defined pingd
> colocation c_mysql_on_drbd inf: g_mysql ms_drbd:Master
> order o_drbd_before_mysql inf: ms_drbd:promote g_mysql:start
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.6-1.el6-8b6c6b9b6dc2627713f870850d20163fad4cc2a2" \
>         cluster-infrastructure="Heartbeat" \

Hmm ... you compiled your own Pacemaker version that supports Heartbeat on RHEL6?

Best regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now

>         no-quorum-policy="ignore" \
>         stonith-enabled="false" \
>         cluster-recheck-interval="5m" \
>         last-lrm-refresh="1368632470"
> rsc_defaults $id="rsc-options" \
>         migration-threshold="5" \
>         resource-stickiness="200"
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
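
For anyone following the thread, here is a rough Python sketch of how the
symptom described in the quoted report could arise from the posted scores. It
is a simplified model and not Pacemaker's actual policy engine: it assumes a
node's score for the resource is just its pingd attribute (as the l_connected
rule suggests), plus resource-stickiness on the node currently running it. The
node names, the timeline, and the helper functions are hypothetical; only
multiplier="1000" and resource-stickiness="200" come from the posted
configuration.

```python
# Simplified scoring model (an assumption, not Pacemaker's real algorithm).

MULTIPLIER = 1000   # multiplier="1000" from p_ping
STICKINESS = 200    # resource-stickiness="200" from rsc_defaults


def pingd(reachable_hosts: int) -> int:
    """Attribute value assumed to be published: reachable hosts * multiplier."""
    return reachable_hosts * MULTIPLIER


def preferred_node(pingd_by_node: dict[str, int], current: str) -> str:
    """Return the node with the highest score under this simplified model."""
    scores = {
        node: value + (STICKINESS if node == current else 0)
        for node, value in pingd_by_node.items()
    }
    # On a tie, prefer the node that already runs the resource.
    return max(scores, key=lambda n: (scores[n], n == current))


# Hypothetical timeline: both nodes lose the ping target at the same moment,
# but node_a's attribute update is seen slightly before node_b's.
timeline = [
    ("both links up", {"node_a": pingd(1), "node_b": pingd(1)}),
    ("node_a reported", {"node_a": pingd(0), "node_b": pingd(1)}),
    ("node_b reported", {"node_a": pingd(0), "node_b": pingd(0)}),
]

for label, values in timeline:
    print(f"{label}: prefer {preferred_node(values, current='node_a')}")
```

Under these assumptions the preferred node flips to node_b for the short window
in which only node_a has reported, then flips back once both values are 0 and
stickiness decides. The model ignores details such as how stickiness is
accumulated for a group, so treat it as an illustration of the timing problem
only; the logs are still needed to confirm what actually happened.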