On Fri, 2012-12-14 at 16:47 +0500, Muhammad Sharfuddin wrote:
> Please find my replies inline.
>
> On Fri, 2012-12-14 at 12:13 +0100, Fabian Herschel wrote:
> > As you are using multicast (MCAST)
> >
> yes
>
So would using unicast instead of MCAST be a solution?
> > - could it be the case that the
> > switch/LAN dropped all multicast packets for some time?
> >
> How can I verify whether the switch dropped multicast packets for
> some time?
>
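One rough way to check this from the nodes themselves is to send numbered probe packets to the multicast group from one node and look for gaps in what the other node receives. The following is only a minimal sketch, not a definitive test: the group address is the mcastaddr from the corosync.conf quoted further down, while the port (51300) and all function names are assumptions chosen for illustration so that the probes stay clear of corosync's own port.

```python
import socket
import struct
import time

GROUP = "224.0.0.116"  # mcastaddr from the quoted corosync.conf
PORT = 51300           # assumption: a spare UDP port, not the cluster's mcastport


def missing_sequences(received):
    """Return the sequence numbers absent between the lowest and highest seen."""
    if not received:
        return []
    seen = set(received)
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


def send_probes(count, interval=1.0, iface_ip="0.0.0.0"):
    """Send `count` numbered probes to the group, one every `interval` seconds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # TTL 1 keeps the probes on the local segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                    socket.inet_aton(iface_ip))
    try:
        for seq in range(count):
            sock.sendto(struct.pack("!I", seq), (GROUP, PORT))
            time.sleep(interval)
    finally:
        sock.close()


def receive_probes(duration, iface_ip="0.0.0.0"):
    """Join the group, collect probe sequence numbers for `duration` seconds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton(iface_ip))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    sock.settimeout(1.0)
    seqs = []
    deadline = time.time() + duration
    try:
        while time.time() < deadline:
            try:
                data, _addr = sock.recvfrom(64)
            except socket.timeout:
                continue
            if len(data) == 4:
                seqs.append(struct.unpack("!I", data)[0])
    finally:
        sock.close()
    return seqs
```

For example, run `receive_probes(3600, "192.168.7.12")` on node2 and `send_probes(3600, 1.0, "192.168.7.11")` on node1, then pass the received list to `missing_sequences`. A long run of consecutive missing numbers would suggest the multicast path was interrupted for that period. It only shows what happens while the test runs, so it cannot prove what the switch did at 12:31 on Dec 13.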
> > As a lot of managed switches drop MCAST by default (at least
> > that is the feedback I got from customers), it could be that your switch was
> > either reconfigured for a time period or there was a firmware update?
> >
> I asked the network team, and they assured me that they have not
> performed any activity (updates, configuration changes) within the last 10 days.
>
> > Just my thoughts, based on things that have happened at customer sites.
> > Fabian Herschel
> >
> thanks, appreciated.
>
> > On 12/14/2012 06:31 AM, Muhammad Sharfuddin wrote:
> > > node1(ailprd1) IP:192.168.7.11 node2(ailprd2) IP:192.168.7.12
> > >
> > > It's a two-node active/passive cluster that has been running perfectly for the last
> > > two months, but yesterday both nodes were fenced (rebooted).
> > > Network connectivity between both nodes is fine, and the cluster is
> > > running fine again.
> > >
> > > Please help me understand the reason behind the following situation, and how
> > > I can avoid it happening again:
> > >
> > > on node1 (active node):
> > > Dec 13 12:31:06 ailprd1 corosync[7274]: [TOTEM ] A processor failed, forming new configuration.
> > > Dec 13 12:31:12 ailprd1 corosync[7274]: [CLM ] CLM CONFIGURATION CHANGE
> > > Dec 13 12:31:12 ailprd1 corosync[7274]: [CLM ] New Configuration:
> > > Dec 13 12:31:13 ailprd1 corosync[7274]: [CLM ] r(0) ip(192.168.7.11)
> > > Dec 13 12:31:13 ailprd1 corosync[7274]: [CLM ] Members Left:
> > > Dec 13 12:31:13 ailprd1 corosync[7274]: [CLM ] r(0) ip(192.168.7.12)
> > >
> > > on node2 (passive node):
> > > Dec 13 12:31:05 ailprd2 corosync[7021]: [TOTEM ] A processor failed, forming new configuration.
> > > Dec 13 12:31:11 ailprd2 corosync[7021]: [CLM ] CLM CONFIGURATION CHANGE
> > > Dec 13 12:31:11 ailprd2 corosync[7021]: [CLM ] New Configuration:
> > > Dec 13 12:31:11 ailprd2 corosync[7021]: [CLM ] r(0) ip(192.168.7.12)
> > > Dec 13 12:31:11 ailprd2 corosync[7021]: [CLM ] Members Left:
> > > Dec 13 12:31:11 ailprd2 corosync[7021]: [CLM ] r(0) ip(192.168.7.11)
> > >
> > > From the point of view of node1 (ailprd1), node2 left; likewise node2 (ailprd2) thinks that
> > > node1 left. Then node2 tried to start the resources which were
> > > already running on node1, and both nodes were fenced.
> > >
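To compare the two nodes' view of the event, the interval between the TOTEM failure message and the first CLM configuration change can be pulled out of the syslog lines quoted above. This is a minimal sketch under stated assumptions: the regular expression is fitted only to the line format shown in this thread, syslog carries no year so 2012 is supplied by hand, and the function names are made up for illustration.

```python
import re
from datetime import datetime

# Fitted to lines such as:
# Dec 13 12:31:06 ailprd1 corosync[7274]: [TOTEM ] A processor failed, ...
LINE_RE = re.compile(
    r"(?P<ts>[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d) (?P<host>\S+) "
    r"corosync\[\d+\]:\s+\[(?P<sub>\w+)\s*\]\s+(?P<msg>.*)"
)


def parse_line(line, year=2012):
    """Return (timestamp, host, subsystem, message), or None if not a corosync line."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return ts, m["host"], m["sub"], m["msg"]


def failure_to_reconfig_seconds(lines):
    """Seconds from 'A processor failed' to the next CLM CONFIGURATION CHANGE."""
    failed_at = None
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        ts, _host, sub, msg = rec
        if sub == "TOTEM" and "A processor failed" in msg:
            failed_at = ts
        elif (sub == "CLM" and msg.startswith("CLM CONFIGURATION CHANGE")
              and failed_at is not None):
            return (ts - failed_at).total_seconds()
    return None


# The first two node1 lines from the original post.
NODE1_LINES = [
    "Dec 13 12:31:06 ailprd1 corosync[7274]: [TOTEM ] A processor failed, "
    "forming new configuration.",
    "Dec 13 12:31:12 ailprd1 corosync[7274]: [CLM ] CLM CONFIGURATION CHANGE",
]

print(failure_to_reconfig_seconds(NODE1_LINES))  # 6.0
```

The same calculation on the node2 lines (12:31:05 to 12:31:11) also gives 6 seconds. That matches the `consensus: 6000` value in the configuration below, though whether that is the cause of the interval is an assumption rather than something these logs establish.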
> > > corosync.conf:
> > >
> > > totem {
> > >     rrp_mode: none
> > >     join: 60
> > >     max_messages: 20
> > >     vsftype: none
> > >     consensus: 6000
> > >     secauth: off
> > >     token_retransmits_before_loss_const: 10
> > >     token: 5000
> > >     version: 2
> > >
> > >     interface {
> > >         bindnetaddr: 192.168.7.0
> > >         mcastaddr: 224.0.0.116
> > >         mcastport: 51234
> > >         ringnumber: 0
> > >     }
> > >     clear_node_high_bit: yes
> > > }
> > >
> > > logging {
> > >     to_logfile: no
> > >     to_syslog: yes
> > >     debug: off
> > >     timestamp: off
> > >     to_stderr: no
> > >     fileline: off
> > >     syslog_facility: daemon
> > > }
> > >
> > > Regards, Muhammad Sharfuddin
> > >
> > >
> > >
>
> --
> Regards,
>
> Muhammad Sharfuddin
>
>
--
Regards,
Muhammad Sharfuddin
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems