For two-node clusters there's a convenient workaround: a crossover cable.
You'll need a spare Ethernet port but that's easier than getting certain switches to do multicast correctly. (At least in my experience.)

From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
Sent: Thursday, April 15, 2010 1:44 PM
To: linux clustering
Cc: [email protected]; [email protected]
Subject: Re: [Linux-cluster] Two node cluster,start CMAN fence the other node

Most likely the multicast packet communication between the 2 nodes is not getting through your network.

[email protected] wrote on 04/15/2010 01:05:01 PM:

> Good afternoon,
> I'm trying to form my first cluster of two nodes, using iLO fence
> devices. I need some help because I can't find what I've missed.
> My main problem is that the "service cman start" reboots the other
> node and I can't form the two nodes cluster.
> I'm using (at both nodea and nodeb, they are on the same VLAN and
> pings each other ok):
>
> [r...@nodea ~]# uname -a
> Linux nodea 2.6.18-164.15.1.el5 #1 SMP Wed Mar 17 11:30:06 EDT 2010
> x86_64 x86_64 x86_64 GNU/Linux
> [r...@nodea ~]# rpm -qa |grep cman
> cman-2.0.115-1.el5_4.9
>
> [r...@nodea ~]# cat /etc/cluster/cluster.conf (nodeb has the same file)
> <?xml version="1.0" ?>
> <cluster alias="VCluster" config_version="5" name="VCluster">
>   <fence_daemon post_fail_delay="0" post_join_delay="25"/>
>   <clusternodes>
>     <clusternode name="nodea" nodeid="1" votes="1">
>       <fence>
>         <method name="1">
>           <device name="nodeaILO"/>
>         </method>
>       </fence>
>     </clusternode>
>     <clusternode name="nodeb" nodeid="2" votes="1">
>       <fence>
>         <method name="1">
>           <device name="nodebILO"/>
>         </method>
>       </fence>
>     </clusternode>
>   </clusternodes>
>   <cman expected_votes="1" two_node="1"/>
>   <fencedevices>
>     <fencedevice agent="fence_ilo" hostname="nodeacn"
>       login="user" name="nodeaILO" passwd="hp"/>
>     <fencedevice agent="fence_ilo" hostname="nodebcn"
>       login="user" name="nodebILO" passwd="hp"/>
>   </fencedevices>
>   <rm>
>     <failoverdomains/>
>     <resources/>
>   </rm>
> </cluster>
>
> When I start the cman service, it hangs up for some time at the
> "Starting fencing..." step and after those configured 25secs it
> fences nodeb and reboots it.
> [r...@nodea ~]# service cman start
> Starting cluster:
>    Loading modules... done
>    Mounting configfs... done
>    Starting ccsd... done
>    Starting cman... done
>    Starting daemons... done
>    Starting fencing... done
> [ OK ]
>
> "nodeb" gets rebooted:
> [r...@nodeb ~]#
> Broadcast message from root (Thu Apr 15 18:42:24 2010):
>
> The system is going down for system halt NOW!
>
> At the syslog I just can find:
> Apr 15 18:40:59 nodea ccsd[16930]: Initial status:: Quorate
> Apr 15 18:40:59 nodea openais[16936]: [CLM ] Members Left:
> Apr 15 18:40:59 nodea openais[16936]: [CLM ] Members Joined:
> Apr 15 18:40:59 nodea openais[16936]: [CLM ] CLM CONFIGURATION CHANGE
> Apr 15 18:41:00 nodea openais[16936]: [CLM ] New Configuration:
> Apr 15 18:41:00 nodea openais[16936]: [CLM ] r(0) ip(10.192.16.42)
> Apr 15 18:41:00 nodea openais[16936]: [CLM ] Members Left:
> Apr 15 18:41:00 nodea openais[16936]: [CLM ] Members Joined:
> Apr 15 18:41:00 nodea openais[16936]: [CLM ] r(0) ip(10.192.16.42)
> Apr 15 18:41:00 nodea openais[16936]: [SYNC ] This node is within
> the primary component and will provide service.
> Apr 15 18:41:00 nodea openais[16936]: [TOTEM] entering OPERATIONAL state.
> Apr 15 18:41:00 nodea openais[16936]: [CMAN ] quorum regained,
> resuming activity
> Apr 15 18:41:00 nodea openais[16936]: [CLM ] got nodejoin message
> 10.192.16.42
> Apr 15 18:42:11 nodea fenced[16955]: nodeb not a cluster member
> after 25 sec post_join_delay
> Apr 15 18:42:11 nodea fenced[16955]: fencing node "nodeb"
> Apr 15 18:42:23 nodea fenced[16955]: fence "nodeb" success
>
> [r...@nodea ~]# clustat
> Cluster Status for VCluster @ Thu Apr 15 18:55:23 2010
> Member Status: Quorate
>
>  Member Name        ID   Status
>  ------ ----        ---- ------
>  nodea              1    Online, Local
>  nodeb              2    Offline
>
> Then when nodeb starts again, I try to start cman there to join the
> cluster... but it again fences "nodea":
> [r...@nodeb ~]# clustat
> Could not connect to CMAN: No such file or directory
> [r...@nodeb ~]# service cman start
> Starting cluster:
>    Loading modules... done
>    Mounting configfs... done
>    Starting ccsd... done
>    Starting cman... done
>    Starting qdiskd... done
>    Starting daemons... done
>    Starting fencing... (wait for 25secs again) done
> [ OK ]
> "nodea" gets rebooted:
> [r...@nodea ~]#
> Broadcast message from root (Thu Apr 15 18:58:40 2010):
>
> The system is going down for system halt NOW!
>
> Apr 15 18:57:31 nodeb openais[11789]: [CLM ] Members Joined:
> Apr 15 18:57:31 nodeb openais[11789]: [CLM ] r(0) ip(10.192.16.44)
> Apr 15 18:57:31 nodeb openais[11789]: [SYNC ] This node is within
> the primary component and will provide service.
> Apr 15 18:57:31 nodeb openais[11789]: [TOTEM] entering OPERATIONAL state.
> Apr 15 18:57:31 nodeb openais[11789]: [CMAN ] quorum regained,
> resuming activity
> Apr 15 18:57:31 nodeb openais[11789]: [CLM ] got nodejoin message
> 10.192.16.44
> Apr 15 18:57:34 nodeb qdiskd[10323]: <info> Quorum Daemon Initializing
> Apr 15 18:57:34 nodeb qdiskd[10323]: <crit> Initialization failed
> Apr 15 18:58:42 nodeb fenced[11816]: nodea not a cluster member
> after 25 sec post_join_delay
> Apr 15 18:58:42 nodeb fenced[11816]: fencing node "nodea"
> Apr 15 18:58:54 nodeb fenced[11816]: fence "nodea" success
>
> And I can't get the two nodes, joining the cluster...
> I guess I'm missing something at the cluster.conf file??? I can't
> find what I'm making wrong.
>
> Thanks for any help!
>
> Alex Re
> --
> Linux-cluster mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/linux-cluster
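
One way to check the multicast diagnosis given at the top of this thread, before rewiring anything, is a small probe run on both nodes. The script below is only a rough sketch and not part of the cluster suite. The group address and port are made-up test values (GROUP and PORT are my assumptions, not taken from the thread), and it assumes Python 3, which may need to be installed separately on an EL5 box. As far as I know, `cman_tool status` reports the multicast address the cluster actually uses, so substituting that group would make the test closer to the real traffic.

```python
import ipaddress
import socket
import struct
import sys

# Hypothetical test values: replace with the group your cluster really uses.
GROUP = "239.192.0.1"
PORT = 50000


def check_group(group):
    """Return the group unchanged if it is an IPv4 multicast address."""
    addr = ipaddress.ip_address(group)
    if addr.version != 4 or not addr.is_multicast:
        raise ValueError("%s is not an IPv4 multicast address" % group)
    return group


def membership_request(group, local_ip="0.0.0.0"):
    """Pack the ip_mreq structure expected by IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(local_ip))


def listen(group=GROUP, port=PORT, timeout=30.0):
    """Join the group and wait for one packet; return (sender_ip, data) or None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(check_group(group)))
        sock.settimeout(timeout)
        data, sender = sock.recvfrom(1024)
        return sender[0], data
    except socket.timeout:
        return None
    finally:
        sock.close()


def send(group=GROUP, port=PORT, payload=b"multicast probe"):
    """Send one packet to the group with TTL 1 (stays on the local segment)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.sendto(payload, (check_group(group), port))
    finally:
        sock.close()


if __name__ == "__main__":
    mode = sys.argv[1] if len(sys.argv) == 2 else ""
    if mode == "listen":
        result = listen()
        print("no packet received" if result is None else "received from %s" % result[0])
    elif mode == "send":
        send()
        print("probe sent to %s:%d" % (GROUP, PORT))
    else:
        print("usage: probe.py listen|send")
```

Run it with `listen` on one node and `send` on the other, then swap roles. If unicast ping works but the listener times out in either direction, that would be consistent with multicast being dropped somewhere between the nodes, although a timeout alone doesn't say whether the switch or a local firewall is responsible.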
