On 2013-05-15 20:44, Andrew Widdersheim wrote:
> Sorry to bring up an old issue, but I am having the exact same problem as the
> original poster. A simultaneous disconnect on my two-node cluster causes the
> resources to start to transition to the other node, but mid-flight the
> transition is aborted and the resources are started again on the original
> node once the cluster realizes connectivity is the same on both nodes.
> 
> I have tried various dampen settings without any luck. It seems the nodes
> report the outage at slightly different times, which results in a partial
> transition of resources. I would have expected dampen to make the cluster
> wait until it knows the connectivity of all nodes before taking action.
> 

You have some logs for us?
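
To make the reported behaviour concrete, here is a rough sketch of the
placement decision, assuming a toy model in which a node's score is its pingd
attribute (as in the `pingd: defined pingd` rule in the configuration quoted
below) plus resource-stickiness on the node currently running the resource.
The node names and both helper functions are hypothetical. Real Pacemaker
scoring also involves the DRBD colocation and per-member group stickiness,
which this ignores.

```python
from typing import Dict, Optional


def node_score(pingd: Optional[int], is_current: bool, stickiness: int) -> int:
    """Toy score: pingd value if the attribute is defined, plus stickiness."""
    # The rule only contributes when the attribute is defined on the node.
    score = pingd if pingd is not None else 0
    if is_current:
        score += stickiness
    return score


def preferred_node(pingd: Dict[str, Optional[int]], current: str,
                   stickiness: int = 200) -> str:
    """Return the node with the highest toy score; ties favour the current node."""
    scores = {
        name: node_score(value, name == current, stickiness)
        for name, value in pingd.items()
    }
    return max(scores, key=lambda name: (scores[name], name == current))


# Connectivity as seen in the CIB at three moments (multiplier=1000, one host).
timeline = [
    ("both links up", {"node1": 1000, "node2": 1000}),
    ("node1 reported loss, node2 not yet", {"node1": 0, "node2": 1000}),
    ("both reported loss", {"node1": 0, "node2": 0}),
]

for label, attrs in timeline:
    print(f"{label}: prefer {preferred_node(attrs, current='node1')}")
```

In this model the middle state is the only one that prefers the other node,
which would match a move that starts and is then aborted once the second
node's attribute is updated. The logs would show whether that is what actually
happens here.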

> Ideally the cluster wouldn't start the transition if another cluster node is
> having a connectivity issue as well and connectivity status is shared between
> all cluster nodes. My configuration is below. Let me know if there is
> something I can change to fix this, or if this behavior is expected.
> 
> primitive p_drbd ocf:linbit:drbd \
>         params drbd_resource="r1" \
>         op monitor interval="30s" role="Slave" \
>         op monitor interval="10s" role="Master"
> primitive p_fs ocf:heartbeat:Filesystem \
>         params device="/dev/drbd/by-res/r1" directory="/drbd/r1" fstype="ext4" options="noatime" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="180s" \
>         op monitor interval="30s" timeout="40s"
> primitive p_mysql ocf:heartbeat:mysql \
>         params binary="/usr/libexec/mysqld" config="/drbd/r1/mysql/my.cnf" datadir="/drbd/r1/mysql" \
>         op start interval="0" timeout="120s" \
>         op stop interval="0" timeout="120s" \
>         op monitor interval="30s" \
>         meta target-role="Started"
> primitive p_ping ocf:pacemaker:ping \
>         params host_list="192.168.5.1" dampen="30s" multiplier="1000" debug="true" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="60s" \
>         op monitor interval="5s" timeout="10s"
> group g_mysql_group p_fs p_mysql \
>         meta target-role="Started"
> ms ms_drbd p_drbd \
>         meta notify="true" master-max="1" clone-max="2" target-role="Started"
> clone cl_ping p_ping
> location l_connected g_mysql \
>         rule $id="l_connected-rule" pingd: defined pingd
> colocation c_mysql_on_drbd inf: g_mysql ms_drbd:Master
> order o_drbd_before_mysql inf: ms_drbd:promote g_mysql:start
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.6-1.el6-8b6c6b9b6dc2627713f870850d20163fad4cc2a2" \
>         cluster-infrastructure="Heartbeat" \

Hmm ... you compiled your own Pacemaker version that supports Heartbeat
on RHEL6?

Best regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now

>         no-quorum-policy="ignore" \
>         stonith-enabled="false" \
>         cluster-recheck-interval="5m" \
>         last-lrm-refresh="1368632470"
> rsc_defaults $id="rsc-options" \
>         migration-threshold="5" \
>         resource-stickiness="200"
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 


