On Sat, Nov 6, 2010 at 10:40 AM, Pavlos Parissis
<[email protected]> wrote:

> On 5 November 2010 20:32, mike <[email protected]> wrote:
> > Hi all,
> >
> > I'm running a simple MySQL cluster on a very heavily loaded LPAR and
> > experiencing some outages due to late heartbeat packets, Gmain timeouts
> > and so on.
>
> Before we look at the settings, do you know whether keepalives are lost
> due to load on the network (NIC and/or switch) or due to load on the
> system?
>
> > I'd like to adjust these settings:
> >
> > # Thresholds (in seconds)
> >  keepalive                      1
> >  warntime                       6
> >  deadtime                       10
> >  initdead                       15
> >
> > I'm thinking I'd like to make it this:
> >
> > # Thresholds (in seconds)
> >  keepalive                      60
> >  warntime                       60
> >  deadtime                       120
> >  initdead                       240
> >
> > Anyone see a problem with these settings?
>

I previously had settings that high at one site. The problem came when a host
restarted (for any reason, such as someone pressing the reset button or the
kernel simply rebooting): after startup, all resources stayed stopped and had
to be started manually. According to the logs, the other node detected the
failure, but because deadtime was 120 it did not start the resources, and the
failed node was back online in about 40 seconds.
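
To illustrate the timing above, here is a minimal sketch in Python. It is not
Heartbeat code; the function name failover_triggers is hypothetical, and it
assumes the simplified model that the surviving node takes over resources only
if the peer stays silent for at least deadtime seconds.

```python
def failover_triggers(outage_seconds: float, deadtime_seconds: float) -> bool:
    """Return True if the peer would declare the node dead before it returns.

    Simplified model (an assumption, not Heartbeat's actual logic): the
    surviving node takes over only once the silence lasts at least deadtime.
    """
    return outage_seconds >= deadtime_seconds


if __name__ == "__main__":
    reboot_seconds = 40  # the failed node was back in about 40 seconds

    # Proposed deadtime of 120: the reboot finishes first, so no takeover.
    print(failover_triggers(reboot_seconds, 120))

    # Original deadtime of 10: the peer declares the node dead and takes over.
    print(failover_triggers(reboot_seconds, 10))
```

Under this model, any deadtime longer than the time a node needs to reboot
means that a quick reboot never leads to a takeover by the other node.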


-- 
Michael
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
