On 2008-04-08T19:32:58, Bernd Schubert <[EMAIL PROTECTED]> wrote:

> Hello,
> 
> I need to set a rather huge dead time of 1200s, but the initial dead time
> should be 120s or less. However, heartbeat tries to be
> schoolmasterly and doesn't want to accept my settings:
> 
> deadtime 1200 # time to declare a node dead
> initdead 120  # time to declare a node dead on heartbeat startup
> keepalive 120 # how often to send keepalive packets

Algorithmic reasons require that initdead be larger than deadtime.
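
As a rough illustration of that constraint, here is a minimal Python sketch
that checks a set of timing values before they go into ha.cf. The function
name check_timings is hypothetical and not part of heartbeat. The keepalive
check is my own assumption (a node should be able to send at least one packet
within deadtime). Only the initdead rule comes from the statement above.

```python
def check_timings(keepalive, deadtime, initdead):
    """Return a list of problems with heartbeat timing values (in seconds).

    Hypothetical helper for illustration only; heartbeat does its own
    validation when it parses ha.cf.
    """
    problems = []
    # Assumption: at least one keepalive must fit inside the deadtime window.
    if keepalive >= deadtime:
        problems.append("keepalive must be shorter than deadtime")
    # Rule stated in this thread: initdead has to be larger than deadtime.
    if initdead <= deadtime:
        problems.append("initdead must be larger than deadtime")
    return problems


if __name__ == "__main__":
    # The values from the quoted configuration: one problem is reported.
    for problem in check_timings(keepalive=120, deadtime=1200, initdead=120):
        print(problem)
```

With the quoted values (keepalive 120, deadtime 1200, initdead 120) the sketch
reports only the initdead problem, which matches what heartbeat complains about.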

A keepalive of two minutes combined with a deadtime of 20 minutes is exceptional.

Not even Lustre should create a load so high that a realtime priority
thread which is entirely locked into memory is not reliably scheduled
for 20 minutes at a stretch!

(I'm not quite sure I'd consider that "HA" ... ;-)

This needs to be fixed within Lustre.

> Well, heartbeat is not started automatically here, and the nodes are not even
> powered on automatically after a hard reset. So when I start heartbeat,
> I'm actively monitoring everything, and there is absolutely no need to make me
> wait at least 20min on startup. I'm not even convinced a deadtime of 20min
> is sufficient, since this is for a Lustre cluster, and Lustre sometimes
> manages to create such a high load that nothing other than the Lustre and
> related kernel threads gets to run on the system...

A deadtime of 20m is not sufficient, but you worry about 20m on startup?

You're quite aware that deadtime is the time you should expect to be w/o
service in case one node crashes, right?


Regards,
    Lars

-- 
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
