On Sat, Nov 6, 2010 at 10:40 AM, Pavlos Parissis <[email protected]> wrote:
> On 5 November 2010 20:32, mike <[email protected]> wrote:
> > Hi all,
> >
> > I'm running a simple MySQL cluster on a very heavily loaded LPAR and
> > experiencing some outages due to late heartbeat packets, Gmain timeouts
> > and so on.
>
> Before we look at the settings, do you know if keepalives are lost due
> to load on the network (NIC and/or switch) or due to load on the
> system?
>
> > I'd like to adjust these settings:
> >
> > # Thresholds (in seconds)
> > keepalive 1
> > warntime 6
> > deadtime 10
> > initdead 15
> >
> > I'm thinking I'd like to make it this:
> >
> > # Thresholds (in seconds)
> > keepalive 60
> > warntime 60
> > deadtime 120
> > initdead 240
> >
> > Anyone see a problem with these settings?
>

I previously had settings that high at one site. The problem came when a
host restarted for any reason (someone pressing the reset button, or the
kernel simply rebooting): after startup, all resources stayed in the
stopped state and had to be started manually. According to the logs, the
other node detected the failure, but because deadtime was 120 it did not
start the resources, and the failed node was back online in about 40
seconds.

--
Michael
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
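
The timing problem described in the reply can be sketched as simple arithmetic. This is only a rough model, not Heartbeat's actual logic: it assumes the peer declares a node dead once the node has been silent for at least deadtime, and it ignores warntime, initdead, and packet jitter. The function name peer_declares_dead is hypothetical; the 40-second outage and the two deadtime values come from the thread.

```python
# Rough model: does the surviving peer declare a rebooting node dead
# before that node comes back? (Hypothetical helper, not a Heartbeat API.)

def peer_declares_dead(outage_seconds: float, deadtime_seconds: float) -> bool:
    """Return True if the node stays silent for at least deadtime."""
    return outage_seconds >= deadtime_seconds

outage = 40  # approximate reboot time reported in the thread, in seconds

for deadtime in (10, 120):  # current and proposed deadtime values
    if peer_declares_dead(outage, deadtime):
        outcome = "peer declares the node dead and can take over"
    else:
        outcome = "node returns before deadtime expires; no takeover"
    print(f"deadtime={deadtime}s, outage={outage}s: {outcome}")
```

Under this model, any reboot shorter than deadtime goes unhandled by the peer, which matches the behaviour reported above with deadtime set to 120.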
