Digimer,

I have applied the changes, but it looks like the cluster now goes into a fence loop. When node 1 is running cman and node 2 is rebooted, it fences node 1, and the two nodes end up fencing each other in a loop.

1) On both nodes acpid is off:

krplporcl001 ~]# service acpid status
acpid is stopped

krplporcl002 ~]# service acpid status
acpid is stopped

2) Changes in cluster.conf:

<clusternode name="krplporcl001" nodeid="1">
  <fence>
    <method name="1">
      <device lanplus = "" name="inspuripmi" delay="15" action ="reboot"/>
    </method>
  </fence>
</clusternode>
<clusternode name="krplporcl002" nodeid="2">
  <fence>

3) Bonding uses mode=1 only.

On krplporcl001:

DEVICE=bond0
IPADDR=192.168.10.10
NETMASK=255.255.255.0
NETWORK=192.168.10.0
BROADCAST=192.168.10.255
BOOTPROTO=none
Type=Ethernet
ONBOOT=yes
BONDING_OPTS='miimon=100 mode=1'

On krplporcl002:

DEVICE=bond0
IPADDR=192.168.10.11
NETMASK=255.255.255.0
NETWORK=192.168.10.0
BROADCAST=192.168.10.255
BOOTPROTO=none
Type=Ethernet
ONBOOT=yes
BONDING_OPTS='miimon=100 mode=1'

4) I have put one switch between the nodes, as Sivaji suggested.

As soon as

The logs on krplporcl001 are as follows:

Sep 10 11:47:53 krplporcl001 fenced[5977]: fencing node krplporcl002

The logs on krplporcl002 are as follows:

Sep 10 11:46:48 krplporcl002 fenced[2950]: fencing node krplporcl001

I am not sure why the network is breaking and why the two nodes cannot communicate with each other. Are there any other places I should look for logs?

On Wed, Sep 10, 2014 at 11:28 AM, Amjad Syed <amjad...@gmail.com> wrote:
>
> On Tue, Sep 9, 2014 at 11:53 AM, Digimer <li...@alteeve.ca> wrote:
>
>> On 09/09/14 03:14 AM, Amjad Syed wrote:
>>
>>> <device lanplus = "" name="inspuripmi" action ="reboot"/>
>>
>> Something is breaking the network during the shutdown, a fence is being
>> called and both nodes are killing the other, causing a dual fence. So you
>> have a set of problems, I think.
>>
>> First, disable acpid on both nodes.
>>
>> Second, change the quoted line (only) to:
>>
>> <device lanplus = "" name="inspuripmi" delay="15" action ="reboot"/>
>>
>> If I am right, this will mean that 192.168.10.10 will stay up (fence) .11
>>
>> Third, what bonding mode are you using? I would only use mode=1.
>>
>> Fourth, please set the node names to match 'uname -n' on both nodes. Be
>> sure the names translate to the IPs you want (via /etc/hosts, ideally).
>>
>> Fifth, as Sivaji suggested, please put switch(es) between the nodes.
>>
>> If it still tries to fence when a node shuts down (watch
>> /var/log/messages and look for 'fencing node ...'), please paste your logs
>> from both nodes.
>>
>> --
>> Digimer
>> Papers and Projects: https://alteeve.ca/w/
>> What if the cure for cancer is trapped in the mind of a person without
>> access to education?
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster@redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
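
For the two checks Digimer mentions above (node names matching 'uname -n', and 'fencing node' entries in /var/log/messages), a minimal shell sketch might look like the following. The function names fence_events and node_name_matches are hypothetical helpers made up for illustration, not part of cman or fenced, and the grep pattern assumes the log lines look like the two fenced entries pasted earlier in this mail.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not cman/fenced tools.

# Print every "fencing node" line that fenced wrote to a syslog file.
# Assumes the line format seen in this thread; defaults to /var/log/messages.
fence_events() {
    grep -E 'fenced\[[0-9]+\]: fencing node ' "${1:-/var/log/messages}"
}

# Succeed only if the given cluster.conf node name equals `uname -n`.
node_name_matches() {
    [ "$1" = "$(uname -n)" ]
}
```

Running fence_events on both nodes and comparing the timestamps should show which node called the fence first. On node 1, node_name_matches krplporcl001 checks the name used in cluster.conf, and 'getent hosts krplporcl001' shows which IP that name resolves to.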