Please find my replies in-line.
On Fri, 2012-12-14 at 12:13 +0100, Fabian Herschel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> As you are using Multicast (MCAST)
>
yes
> - could it be the case that the
> switch/LAN dropped all Multicast packages for some time?
>
How can I verify or check whether the switch drops multicast packets for
some period of time?
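One way to check this from the hosts themselves (a sketch, not a definitive test): join the cluster's multicast group and exchange a probe datagram. GROUP and PORT below are the values from the corosync.conf quoted later in this mail; everything else (running this as a standalone Python script, the probe payload) is my own assumption.

```python
# Multicast probe sketch: run make_receiver() on one node and
# send_probe() on the other. If the receiver times out while
# unicast ping works, something on the path (e.g. the switch)
# is dropping multicast.
import socket
import struct

GROUP = "224.0.0.116"   # mcastaddr from corosync.conf
PORT = 51234            # mcastport from corosync.conf

def make_receiver(timeout=5.0):
    """Bind to the cluster multicast group/port and wait for a probe."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    # Join the multicast group on the default interface.
    mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    sock.settimeout(timeout)
    return sock

def send_probe(message=b"mcast-probe"):
    """Send one datagram to the cluster multicast group/port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    sock.sendto(message, (GROUP, PORT))
    sock.close()

if __name__ == "__main__":
    # Single-host loopback smoke test; across two hosts, split the
    # receiver and the sender between the nodes.
    rx = make_receiver()
    send_probe()
    data, addr = rx.recvfrom(1024)
    print("received %r from %s" % (data, addr[0]))
```

For a more thorough continuous check there is also omping (from the corosync project), which is built for exactly this, or watching the group with tcpdump on the cluster interface.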
> As many managed switches drop MCAST by default (at least that is the
> feedback I got from customers), it could be that your switch was either
> reconfigured for a period of time or that there was a firmware update?
>
I asked the network guys, and they assured me that they have not
performed any activity (updates or configuration changes) within the
last 10 days.
> Just my thoughts about what happened on the customer side.
> Fabian Herschel
>
thanks, appreciated.
> On 12/14/2012 06:31 AM, Muhammad Sharfuddin wrote:
> > node1(ailprd1) IP:192.168.7.11 node2(ailprd2) IP:192.168.7.12
> >
> > It is a two-node active/passive cluster that had been running
> > perfectly for the last two months, but yesterday both nodes were
> > fenced (rebooted). Network connectivity between both nodes is fine,
> > and the cluster is running well again.
> >
> > Please help me understand the reason behind the following situation,
> > and how I can avoid it happening next time:
> >
> > on node1(active node):
> > Dec 13 12:31:06 ailprd1 corosync[7274]: [TOTEM ] A processor failed, forming new configuration.
> > Dec 13 12:31:12 ailprd1 corosync[7274]: [CLM ] CLM CONFIGURATION CHANGE
> > Dec 13 12:31:12 ailprd1 corosync[7274]: [CLM ] New Configuration:
> > Dec 13 12:31:13 ailprd1 corosync[7274]: [CLM ] r(0) ip(192.168.7.11)
> > Dec 13 12:31:13 ailprd1 corosync[7274]: [CLM ] Members Left:
> > Dec 13 12:31:13 ailprd1 corosync[7274]: [CLM ] r(0) ip(192.168.7.12)
> >
> > on node2(passive node):
> > Dec 13 12:31:05 ailprd2 corosync[7021]: [TOTEM ] A processor failed, forming new configuration.
> > Dec 13 12:31:11 ailprd2 corosync[7021]: [CLM ] CLM CONFIGURATION CHANGE
> > Dec 13 12:31:11 ailprd2 corosync[7021]: [CLM ] New Configuration:
> > Dec 13 12:31:11 ailprd2 corosync[7021]: [CLM ] r(0) ip(192.168.7.12)
> > Dec 13 12:31:11 ailprd2 corosync[7021]: [CLM ] Members Left:
> > Dec 13 12:31:11 ailprd2 corosync[7021]: [CLM ] r(0) ip(192.168.7.11)
> >
> > For node1(ailprd1), node2 left; likewise, node2(ailprd2) thinks that
> > node1 left. Then node2 tried to start the resources which were
> > already running on node1, and both nodes were fenced.
> >
> > corosync.conf:
> > totem {
> >         rrp_mode: none
> >         join: 60
> >         max_messages: 20
> >         vsftype: none
> >         consensus: 6000
> >         secauth: off
> >         token_retransmits_before_loss_const: 10
> >         token: 5000
> >         version: 2
> >
> >         interface {
> >                 bindnetaddr: 192.168.7.0
> >                 mcastaddr: 224.0.0.116
> >                 mcastport: 51234
> >                 ringnumber: 0
> >         }
> >         clear_node_high_bit: yes
> > }
> > logging {
> >         to_logfile: no
> >         to_syslog: yes
> >         debug: off
> >         timestamp: off
> >         to_stderr: no
> >         fileline: off
> >         syslog_facility: daemon
> > }
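A rough timing check against the totem settings quoted above (a sketch; the exact internals of the totem protocol are more involved): corosync declares "A processor failed" when the token has not been seen for `token` ms, and then waits up to `consensus` ms before committing the new membership.

```python
# Back-of-the-envelope totem timing from the quoted corosync.conf.
token_ms = 5000        # token: 5000
consensus_ms = 6000    # consensus: 6000

# Node is declared failed after the token timeout expires.
failure_declared_after_s = token_ms / 1000.0
# New membership is committed roughly one consensus interval later.
membership_change_after_s = (token_ms + consensus_ms) / 1000.0

print("declared failed after ~%.0f s" % failure_declared_after_s)
print("membership reformed ~%.0f s after traffic stopped" % membership_change_after_s)
```

This roughly matches the logs above: "A processor failed" at 12:31:05-06 and the CLM CONFIGURATION CHANGE about 6 s later at 12:31:11-12, i.e. a multicast outage of little more than 5 seconds is already enough to split the membership on both nodes.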
> >
> > Regards, Muhammad Sharfuddin
> >
> >
> >
--
Regards,
Muhammad Sharfuddin
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems