On Tue, 2009-12-15 at 17:37 -0700, hj lee wrote:
> Hi,
>
> Thanks for the response. The default token that comes with
> openais-0.80.5 is 5000. 5000 seems too big! Here are the default parameters
> that come with openais-0.80.5 downloaded from
> http://download.opensuse.org/repositories/server:/ha-clustering/CentOS_5/i386/.
> I want to detect a node power failure within 500 msec. A small percentage of
> detection errors is OK. Do you think it's possible to detect it within 500 msec?
>

500 msec should work, but it depends on your environment. More details
below.

The timers are close to those specified (500 msec + the tick time of
your kernel). Most kernels have a tick time of 10 msec or less (the HZ
kernel variable). I am not sure about tickless kernels.

I regularly run corosync with a 200 msec token timeout. It works very
well as long as corosync is regularly scheduled. Realtime processes and
kernel misbehavior (the kernel spinning in a loop with a spinlock held)
can prevent corosync from being scheduled in a timely fashion, so it
all depends on your deployment environment. The kernel option
CONFIG_PREEMPT helps since it will cause corosync to preempt other
long-running processes in the system.

Regards
-steve

> Thanks
> hj
>
> ----------- default parameters that come with openais-0.80.5 --------------
>         # How long before declaring a token lost (ms)
>         token: 5000
>
>         # How many token retransmits before forming a new configuration
>         token_retransmits_before_loss_const: 10
>
>         # How long to wait for join messages in the membership protocol (ms)
>         join: 1000
>
>         # How long to wait for consensus to be achieved before starting
>         # a new round of membership configuration (ms)
>         consensus: 2500
>
>         # Turn off the virtual synchrony filter
>         vsftype: none
>
>         # Number of messages that may be sent by one processor on
>         # receipt of the token
>         max_messages: 20
>
>         # Stagger sending the node join messages by 1..send_join ms
>         send_join: 45
>
>
> On Tue, Dec 15, 2009 at 3:37 PM, Steven Dake <[email protected]> wrote:
>
>         On Tue, 2009-12-15 at 14:49 -0700, hj lee wrote:
>         > Hi,
>         >
>         > I have a simple two-node cluster with pacemaker-1.0.5 and
>         > openais-0.80.5. If I pull the power cable on one of the two
>         > nodes, how long does the other node take to detect that the
>         > node went away? With the default parameters that come with
>         > openais-0.80.5, it takes 5 sec from power off to promote (I
>         > have a simple multi-state clone running). I want to shorten
>         > this time as much as I can. What is the best way to shorten
>         > it? Which parameters can I adjust to reduce this time?
>         >
>
>         The "token" parameter controls how long it takes to detect a
>         failure within the network. It should default to 1000 msec
>         (1 second) but can be run at lower values depending on your
>         system.
>
>         I am not certain why you see 5 seconds for full recovery with
>         pacemaker. Andrew may be able to address.
>
>         Regards
>         -steve
>
>         > Thanks
>         > hj
>         > _______________________________________________
>         > Openais mailing list
>         > [email protected]
>         > https://lists.linux-foundation.org/mailman/listinfo/openais
>
>
> --
> Dream with longterm vision!
> kerdosa
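
The timing remark in the reply (timers fire at roughly the configured token value plus one kernel tick) can be worked through with a little arithmetic. The following is only a rough sketch in Python: the function name is made up for illustration, the tick is assumed to be 1000/HZ milliseconds, and the estimate covers only the token timer expiry. It does not include scheduling delays, the membership protocol (join/consensus), or pacemaker's own promote time.

```python
def timer_expiry_estimate_ms(token_ms, hz):
    """Rough estimate of when the token timer fires: token + one kernel tick.

    token_ms: the configured "token" value in milliseconds.
    hz:       the kernel HZ value (ticks per second); assumed, not measured.
    """
    if token_ms <= 0 or hz <= 0:
        raise ValueError("token_ms and hz must be positive")
    tick_ms = 1000.0 / hz  # length of one kernel tick in milliseconds
    return token_ms + tick_ms


if __name__ == "__main__":
    # Compare the 5000 ms value from the posted config with the
    # 500 ms and 200 ms values discussed in the thread.
    for token_ms in (5000, 500, 200):
        for hz in (100, 250, 1000):
            estimate = timer_expiry_estimate_ms(token_ms, hz)
            print(f"token={token_ms} ms, HZ={hz}: about {estimate:.0f} ms")
```

With HZ=100 (a 10 msec tick) a 500 msec token gives an estimate of about 510 msec, so the tick is a small part of the budget; scheduling latency is the larger unknown, as noted in the reply.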
