Hi all,

I have a problem after removing a node with the force option from my crm configuration. Originally I had a 2-node HA cluster (corosync 1.4.1-7.el6, pacemaker 1.1.7-6.el6).
I then wanted to add a third node to act as a quorum node, but was not able to
get it to work (probably because I don't understand how to set it up).
So I removed the third node, but had to use the force option because crm complained
when I tried to remove it.
Now when I start Pacemaker, the resources don't look like they come up
correctly:
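For reference, the removal went roughly like this (crmsh syntax; `<node3>` is a placeholder for the name I had given the third node, and the `-F` force flag is what I mean by "the force command"):

```shell
# Roughly what I ran to remove the third node:
crm node delete <node3>       # crm complained and refused
crm -F node delete <node3>    # so I forced the removal
```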
Online: [ testclu01 testclu02 ]

 Master/Slave Set: ms_drbd_nfs [p_drbd_nfs]
     Masters: [ testclu01 ]
     Slaves: [ testclu02 ]
 Clone Set: cl_lsb_nfsserver [p_lsb_nfsserver]
     Started: [ tdtestclu01 tdtestclu02 ]
 Resource Group: g_nfs
     p_lvm_nfs     (ocf::heartbeat:LVM):         Started testclu01
     p_fs_shared   (ocf::heartbeat:Filesystem):  Started testclu01
     p_fs_shared2  (ocf::heartbeat:Filesystem):  Started testclu01
     p_ip_nfs      (ocf::heartbeat:IPaddr2):     Started testclu01
 Clone Set: cl_exportfs_root [p_exportfs_root]
     Started: [ testclu01 testclu02 ]

Failed actions:
    p_exportfs_root:0_monitor_30000 (node=testclu01, call=12, rc=7, status=complete): not running
    p_exportfs_root:1_monitor_30000 (node=testclu02, call=12, rc=7, status=complete): not running
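For what it's worth, I can clear those failed monitor results with the standard crmsh/pacemaker tools, along these lines:

```shell
# Clear the failed exportfs monitor results on both nodes
crm resource cleanup cl_exportfs_root

# Verify what each node is actually exporting
exportfs -v
```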
The filesystems mount correctly on the master at this stage and can be written
to. But when I stop the services on the master node so that it fails over, the
failover doesn't work: cluster-IP connectivity is lost.
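By "loses cluster-IP connectivity" I mean checks along these lines (standard tools; 10.240.64.20 is the cluster IP from my configuration below):

```shell
# On the surviving node, after stopping pacemaker on the master:
crm_mon -1                         # one-shot snapshot of resource status
ip addr show | grep 10.240.64.20   # the floating IP is not bound anywhere
ping -c 3 10.240.64.20             # and it no longer answers
```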
Corosync.log from the master after I stopped Pacemaker on the master node: see
attached file.
Additional files (attached): crm configure show output, corosync.conf,
global_common.conf.
I'm not sure how to proceed to get the cluster back into a sane state,
so if anyone could help me it would be much appreciated.

Kind regards,
/Fredrik Hudner
[Attachment: corosync.log]
[Attachment: corosync.conf]

crm configure show:
node tdtestclu01
node tdtestclu02
primitive p_drbd_nfs ocf:linbit:drbd \
params drbd_resource="nfs" \
op monitor interval="15" role="Master" \
op monitor interval="30" role="Slave"
primitive p_exportfs_root ocf:heartbeat:exportfs \
	params fsid="0" directory="/export" options="rw,crossmnt" clientspec="10.240.0.0/255.255.0.0" \
	op monitor interval="30s"
primitive p_fs_shared ocf:heartbeat:Filesystem \
	params device="/dev/vg_nfs/lv_shared" directory="/export/shared" fstype="ext4" \
	op monitor interval="10s"
primitive p_fs_shared2 ocf:heartbeat:Filesystem \
	params device="/dev/vg_nfs/lv_shared2" directory="/export/shared2" fstype="ext4" \
	op monitor interval="10s"
primitive p_ip_nfs ocf:heartbeat:IPaddr2 \
params ip="10.240.64.20" cidr_netmask="24" \
op monitor interval="30s"
primitive p_lsb_nfsserver lsb:nfs \
op monitor interval="30s"
primitive p_lvm_nfs ocf:heartbeat:LVM \
params volgrpname="vg_nfs" \
op monitor interval="30s"
group g_nfs p_lvm_nfs p_fs_shared p_fs_shared2 p_ip_nfs
ms ms_drbd_nfs p_drbd_nfs \
	meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
clone cl_exportfs_root p_exportfs_root
clone cl_lsb_nfsserver p_lsb_nfsserver
location drbd-fence-by-handler-nfs-ms_drbd_nfs ms_drbd_nfs \
	rule $id="drbd-fence-by-handler-nfs-rule-ms_drbd_nfs" $role="Master" -inf: #uname ne tdtestclu01
colocation c_nfs_on_drbd inf: g_nfs ms_drbd_nfs:Master
colocation c_nfs_on_root inf: g_nfs cl_exportfs_root
order o_drbd_before_nfs inf: ms_drbd_nfs:promote g_nfs:start
order o_root_before_nfs inf: cl_exportfs_root g_nfs:start
property $id="cib-bootstrap-options" \
dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
last-lrm-refresh="1363170760" \
stonith-enabled="false" \
no-quorum-policy="freeze" \
maintenance-mode="false"
rsc_defaults $id="rsc-options" \
resource-stickiness="200"
[Attachment: global_common.conf]
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
