Hi Alan,

Thank you for the input, but that is not it: I have double-checked the config. This log is from the slave, which had its network connection disconnected. It says that nw is dead, which seems correct to me, since the NIC is disconnected? (nw is the router and 2 DNS servers on the local network.)
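For reference, the overlap Alan suspected (a name listed both as a cluster node and as a ping node) can be checked mechanically rather than by eye. Below is a minimal sketch in Python, not the actual config from this setup: it assumes the usual ha.cf directives `node`, `ping` and `ping_group`, and the host names and addresses in the sample are made up for illustration. It also assumes that the first argument of `ping_group` is the group name that shows up in the logs (like nw above), so that name is included in the comparison.

```python
def find_ping_node_overlap(ha_cf_text):
    """Return names that appear both as cluster nodes and as ping targets."""
    nodes, pings = set(), set()
    for raw in ha_cf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        directive, args = parts[0], parts[1:]
        if directive == "node":
            nodes.update(args)
        elif directive in ("ping", "ping_group"):
            # For ping_group the first argument is assumed to be the group
            # name; it is checked too, along with the member addresses.
            pings.update(args)
    return sorted(nodes & pings)


# Hypothetical sample: two cluster nodes and one ping group named nw.
SAMPLE_OK = """
node server1 server2
ping_group nw 192.168.1.1 192.168.1.2 192.168.1.3
"""

# Hypothetical broken sample: nw is listed as a node AND as a ping node.
SAMPLE_BAD = """
node server1 nw
ping nw
"""

if __name__ == "__main__":
    print(find_ping_node_overlap(SAMPLE_OK))
    print(find_ping_node_overlap(SAMPLE_BAD))
```

An empty result only says that no name is shared between the two lists; it does not prove the rest of the configuration is right.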
It seems the log got mangled in the mail; maybe you overlooked the [RECONNECT] line. I'll try to include it again.

Morten

[DISCONNECT]
Nov 7 10:11:44 localhost heartbeat[4421]: WARN: node nw: is dead
Nov 7 10:11:44 localhost heartbeat[4421]: info: Link nw:nw dead.
Nov 7 10:11:44 localhost ipfail[4431]: info: Status update: Node nw now has status dead
Nov 7 10:11:44 localhost heartbeat[4631]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Nov 7 10:11:44 localhost ipfail[4431]: info: NS: We are dead. :<
Nov 7 10:11:44 localhost ipfail[4431]: info: Link Status update: Link nw/nw now has status dead
Nov 7 10:11:44 localhost ipfail[4431]: info: We are dead. :<
Nov 7 10:11:44 localhost ipfail[4431]: info: Asking other side for ping node count.
Nov 7 10:11:44 localhost ipfail[4431]: debug: Message [num_ping] sent.
Nov 7 10:11:44 localhost heartbeat: info: Running /etc/ha.d/rc.d/status status
[RECONNECT]
Nov 7 10:12:07 localhost heartbeat[4421]: info: Link nw:nw up.
Nov 7 10:12:07 localhost heartbeat[4421]: WARN: Late heartbeat: Node nw: interval 35020 ms
Nov 7 10:12:07 localhost heartbeat[4421]: info: Status update for node nw: status ping
Nov 7 10:12:07 localhost ipfail[4431]: info: Link Status update: Link nw/nw now has status up
Nov 7 10:12:07 localhost ipfail[4431]: info: Status update: Node nw now has status ping
Nov 7 10:12:07 localhost ipfail[4431]: info: A ping node just came up.
Nov 7 10:12:07 localhost ipfail[4431]: debug: Found ping node nw!
Nov 7 10:12:07 localhost ipfail[4431]: info: Asking other side for ping node count.
Nov 7 10:12:07 localhost ipfail[4431]: debug: Message [num_ping] sent.

> -----Original Message-----
> From: Alan Robertson [mailto:[EMAIL PROTECTED]
> Sent: 8. november 2007 16:14
> To: Morten Laursen
> Cc: [email protected]
> Subject: Re: Missing gratious ARP
>
> Morten Laursen wrote:
> >> On 1.2.3, that "returning after partition" should cause both sides to
> >> shut down all resources and restart them.
> >> Restarting them should issue more gratuitous ARPs.
> >>
> >> Do both servers get the "returning after partition" message?
> >
> > No, the slave does not get the message, and it does not restart. Here is the entire log from the slave:
> >
> > [DISCONNECT]
> > Nov 7 10:11:44 localhost heartbeat[4421]: WARN: node nw: is dead
> > Nov 7 10:11:44 localhost heartbeat[4421]: info: Link nw:nw dead.
> > Nov 7 10:11:44 localhost ipfail[4431]: info: Status update: Node nw now has status dead
> > Nov 7 10:11:44 localhost heartbeat[4631]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
> > Nov 7 10:11:44 localhost ipfail[4431]: info: NS: We are dead. :<
> > Nov 7 10:11:44 localhost ipfail[4431]: info: Link Status update: Link nw/nw now has status dead
> > Nov 7 10:11:44 localhost ipfail[4431]: info: We are dead. :<
> > Nov 7 10:11:44 localhost ipfail[4431]: info: Asking other side for ping node count.
> > Nov 7 10:11:44 localhost ipfail[4431]: debug: Message [num_ping] sent.
> > Nov 7 10:11:44 localhost heartbeat: info: Running /etc/ha.d/rc.d/status status
> > [RECONNECT]
> > Nov 7 10:12:07 localhost heartbeat[4421]: info: Link nw:nw up.
> > Nov 7 10:12:07 localhost heartbeat[4421]: WARN: Late heartbeat: Node nw: interval 35020 ms
> > Nov 7 10:12:07 localhost heartbeat[4421]: info: Status update for node nw: status ping
> > Nov 7 10:12:07 localhost ipfail[4431]: info: Link Status update: Link nw/nw now has status up
> > Nov 7 10:12:07 localhost ipfail[4431]: info: Status update: Node nw now has status ping
> > Nov 7 10:12:07 localhost ipfail[4431]: info: A ping node just came up.
> > Nov 7 10:12:07 localhost ipfail[4431]: debug: Found ping node nw!
> > Nov 7 10:12:07 localhost ipfail[4431]: info: Asking other side for ping node count.
> > Nov 7 10:12:07 localhost ipfail[4431]: debug: Message [num_ping] sent.
> >
>
> These messages indicate to me that you have a screwed up configuration.
>
> I would guess that you have 'nw' as both a node in your cluster AND as a ping node.
>
> Never ping anything inside your cluster.
>
>
> --
> Alan Robertson <[EMAIL PROTECTED]>
>
> "Openness is the foundation and preservative of friendship... Let me claim from you at all times your undisguised opinions." - William Wilberforce
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
