[Linux-HA] Who takes care of the failover?

RaSca Wed, 25 Nov 2009 00:00:52 -0800

Hi everybody,
I'm running some tests with heartbeat and pacemaker (on ubuntu-server 
9.10, with heartbeat 2.99.2+sles11r9-5ubuntu1 and pacemaker-heartbeat 
1.0.5+hg20090813-0ubuntu4). This is my configuration:


node $id="2ee6e25d-8bd6-42ba-a2c5-6bc98b6f4715" nas-1 \
        attributes standby="off"
node $id="4ea6f84c-841a-4272-903c-e14ad4baefe4" nas-2 \
        attributes standby="off"
primitive drbd0 ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="15s" \
        meta target-role="Started"
primitive drbd1 ocf:linbit:drbd \
        params drbd_resource="r1" \
        op monitor interval="15s" \
        meta target-role="Started"
primitive fs_hafs ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/hafs" fstype="ext3" \
        meta target-role="Started"
primitive fs_mysql ocf:heartbeat:Filesystem \
        params device="/dev/drbd1" directory="/mysql" fstype="ext3" \
        meta target-role="Started"
primitive ip_hafs ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.80" nic="eth0"
primitive ip_mysql ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.81" nic="eth0"
primitive mysql-server lsb:mysql
group hafs fs_hafs ip_hafs
group mysql fs_mysql ip_mysql mysql-server
ms ms_drbd0 drbd0 \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
ms ms_drbd1 drbd1 \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
location cli-prefer-hafs hafs \
        rule $id="cli-prefer-rule-hafs" inf: #uname eq nas-1
location cli-prefer-mysql mysql \
        rule $id="cli-prefer-rule-mysql" inf: #uname eq nas-2
colocation hafs_on_drbd inf: hafs ms_drbd0:Master
colocation mysql_on_drbd inf: mysql ms_drbd1:Master
order hafs_after_drbd inf: ms_drbd0:promote hafs:start
order mysql_after_drbd inf: ms_drbd1:promote mysql:start
property $id="cib-bootstrap-options" \
        dc-version="1.0.5-3840e6b5a305ccb803d29b468556739e75532d56" \
        cluster-infrastructure="Heartbeat" \
        stonith-enabled="false" \
        last-lrm-refresh="1259075046"

The two machines are connected to the LAN via eth0, and directly to each 
other with two crossover cables bonded into an interface named bond1 that 
works in balance-rr mode.
This is the heartbeat configuration:

crm on

use_logd no
debugfile /var/log/ha.debug
logfile /var/log/ha.log
logfacility     local0

keepalive 2
deadtime 10
warntime 5
initdead 15

ucast eth0 192.168.1.79
ucast eth0 192.168.1.77
ucast bond1 10.0.0.1
ucast bond1 10.0.0.2

auto_failback on

node    nas-1
node    nas-2

ping_group lan gateway pdc

Everything works fine (I can move and migrate resources and put a node in 
standby mode) until I force a node to be faulty, for example by removing 
the ethernet cable. What I can't understand is why, even though the log 
shows the cable failure, the CRM does not move any resource.
For example, if I'm in this situation:

Master/Slave Set: ms_drbd0
         Masters: [ nas-1 ]
         Slaves: [ nas-2 ]
Resource Group: hafs
     fs_hafs     (ocf::heartbeat:Filesystem):    Started nas-1
     ip_hafs     (ocf::heartbeat:IPaddr2):       Started nas-1
Master/Slave Set: ms_drbd1
         Masters: [ nas-2 ]
         Slaves: [ nas-1 ]
Resource Group: mysql
     fs_mysql    (ocf::heartbeat:Filesystem):    Started nas-2
     ip_mysql    (ocf::heartbeat:IPaddr2):       Started nas-2
     mysql-server        (lsb:mysql):    Started nas-2

and then I remove the ethernet cable from the nas-2 node, I see this 
message in the log:

Nov 24 17:41:45 nas-1 heartbeat: [1489]: info: Link nas-2:eth0 dead.

but no other action is taken by the CRM or heartbeat itself.
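
For context, a "Link ... dead" message only reports the state of one 
heartbeat communication path. As long as the nodes still see each other 
over bond1, neither node is considered dead, and nothing in the 
configuration above ties resource placement to LAN connectivity. One 
common way to express that dependency in Pacemaker 1.0 is a cloned pingd 
resource plus a location rule on its node attribute. What follows is only 
a minimal, untested sketch: the resource and constraint names (lan_ping, 
lan_ping_clone, hafs_needs_lan, mysql_needs_lan) and the ping target 
192.168.1.1 are hypothetical, and how it interacts with the existing 
cli-prefer-* constraints and the DRBD master colocations would still need 
to be checked.

# Hypothetical names and ping target; adjust to the real gateway.
primitive lan_ping ocf:pacemaker:pingd \
        params host_list="192.168.1.1" multiplier="100" name="pingd" \
        op monitor interval="15s"
clone lan_ping_clone lan_ping
# Keep each group off any node where the pingd attribute is missing or zero.
location hafs_needs_lan hafs \
        rule -inf: not_defined pingd or pingd lte 0
location mysql_needs_lan mysql \
        rule -inf: not_defined pingd or pingd lte 0
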

What's wrong with my reasoning? What am I missing?

Thanks for your help.

-- 
RaSca
Mia Mamma Usa Linux: Niente è impossibile da capire, se lo spieghi bene!
[email protected]
http://www.miamammausalinux.org

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems