Hello

On Wed, 2007-09-26 at 23:07 +0200, Andrew Beekhof wrote:
> On 9/26/07, Dave Augustus <[EMAIL PROTECTED]> wrote:
> > Hello All,
> >
> > Thanks for your help up to this point. We now have a 6 node cluster
> > running in test mode. The DC is r6 and the load balancer resource group
> > is NOT running but my 5 clones are. So I started updating my cib and I
> > found that I couldn't.
> >
> > I got this message instead:
> >
> > "No messages received in 30 seconds.. aborting"
> >
> > Looking at the logs I found that it is just filling up with entries like
> > this:
> >
> > How can I get control of my cluster from this error ?
> >
> > Dave
>
> usually there is a firewall or some other comms-related failure
> involved - try starting there
>

There was no firewall - these machines are on their own LAN segment -
nothing in the middle but a switch.

I could ssh into each server and ended up stopping heartbeat on the
troublesome host. I restarted heartbeat and the problem never reappeared.

It was scary because the machine that could not be reached was the DC
and so management of the cluster was hosed until I intervened.

Not a pretty sight!!!

Thanks,
Dave

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
