On Fri, 2010-03-12 at 09:44 +0100, Herwin Kleinjan wrote:
> Hello,
>
> Currently we are looking into possibilities to speed up the fail-over
> process on our dual node cluster. This is a RHEL 5.4 cluster running on HP
> Proliant servers with iLO based power fencing. For shared storage we use a
> fiber based storage array.
>
> There are some parts where fail-over time might be improved, one of them
> relating to openais or its configuration. During testing whenever one node
> is failing or its power is disconnected, the other node detects this and the
> fail-over process is started:
>
> Mar 12 08:46:02 donald01 openais[5301]: [TOTEM] The token was lost in the
> OPERATIONAL state.
> Mar 12 08:46:02 donald01 openais[5301]: [TOTEM] Receive multicast socket
> recv buffer size (288000 bytes).
> Mar 12 08:46:02 donald01 openais[5301]: [TOTEM] Transmit multicast socket
> send buffer size (288000 bytes).
> Mar 12 08:46:02 donald01 openais[5301]: [TOTEM] entering GATHER state from
> 2.
> Mar 12 08:46:07 donald01 openais[5301]: [TOTEM] entering GATHER state from
> 0.
> Mar 12 08:46:07 donald01 openais[5301]: [TOTEM] Creating commit token
> because I am the rep.
> Mar 12 08:46:07 donald01 openais[5301]: [TOTEM] Saving state aru 44 high seq
> received 44
> Mar 12 08:46:07 donald01 openais[5301]: [TOTEM] Storing new sequence id for
> ring 74
> Mar 12 08:46:07 donald01 openais[5301]: [TOTEM] entering COMMIT state.
> Mar 12 08:46:07 donald01 openais[5301]: [TOTEM] entering RECOVERY state.
> Mar 12 08:46:07 donald01 openais[5301]: [TOTEM] position [0] member
> 10.227.180.101:
> Mar 12 08:46:07 donald01 openais[5301]: [TOTEM] previous ring seq 112 rep
> 10.227.180.101
> Mar 12 08:46:07 donald01 openais[5301]: [TOTEM] aru 44 high delivered 44
> received flag 1
> Mar 12 08:46:07 donald01 openais[5301]: [TOTEM] Did not need to originate
> any messages in recovery.
> Mar 12 08:46:07 donald01 openais[5301]: [TOTEM] Sending initial ORF token
> Mar 12 08:46:07 donald01 openais[5301]: [CLM ] CLM CONFIGURATION CHANGE
> Mar 12 08:46:07 donald01 openais[5301]: [CLM ] New Configuration:
> Mar 12 08:46:07 donald01 openais[5301]: [CLM ] r(0)
> ip(10.227.180.101)
> Mar 12 08:46:07 donald01 openais[5301]: [CLM ] Members Left:
> Mar 12 08:46:07 donald01 kernel: dlm: closing connection to node 2
> Mar 12 08:46:07 donald01 openais[5301]: [CLM ] r(0)
> ip(10.227.180.102)
> Mar 12 08:46:07 donald01 clurgmgrd[7525]: <info> State change: donald02 DOWN
>
> Mar 12 08:46:07 donald01 openais[5301]: [CLM ] Members Joined:
> Mar 12 08:46:07 donald01 openais[5301]: [CLM ] CLM CONFIGURATION CHANGE
> Mar 12 08:46:07 donald01 openais[5301]: [CLM ] New Configuration:
> Mar 12 08:46:07 donald01 openais[5301]: [CLM ] r(0)
> ip(10.227.180.101)
> Mar 12 08:46:07 donald01 openais[5301]: [CLM ] Members Left:
> Mar 12 08:46:07 donald01 openais[5301]: [CLM ] Members Joined:
> Mar 12 08:46:07 donald01 openais[5301]: [SYNC ] This node is within the
> primary component and will provide service.
> Mar 12 08:46:07 donald01 openais[5301]: [TOTEM] entering OPERATIONAL state.
> Mar 12 08:46:07 donald01 openais[5301]: [CLM ] got nodejoin message
> 10.227.180.101
> Mar 12 08:46:07 donald01 openais[5301]: [CPG ] got joinlist message from
> node 1
>
>
> As you can see from the above /var/log/messages excerpt there is a 5 second
> time frame at the beginning in which apparently nothing is happening
> (08:46:02-08:46:07). I was wondering how I could reduce or remove this delay
> so that the fail-over process will be done more quickly.
>
> My current /etc/ais/openais.conf is still the installed default:
> totem {
> version: 2
> secauth: off
> threads: 0
> interface {
> ringnumber: 0
> bindnetaddr: 192.168.2.0
> mcastaddr: 226.94.1.1
> mcastport: 5405
> }
> }
>
> logging {
> debug: off
> timestamp: on
> }
>
> amf {
> mode: disabled
> }
>
>
> From /etc/cluster/cluster.conf some lines relating to openais:
> <cman expected_votes="1" two_node="1" hello_timer="1" deadnode_timeout="3"/>
> <totem token="3000"/>
>
> I am suspecting more fine tuning of additional openais configuration
> parameters will do the trick but I am not sure. If more information is
> needed please let me know, any useful advice would be greatly appreciated!
>
> Best regards,
> Herwin
>
When a cluster is started with cman, /etc/ais/openais.conf is not used.
While changing timing parameters is not supported by Red Hat, they can
be modified via overrides on the <totem> line in cluster.conf. The
5 second window you see is a result of the "consensus" timeout
parameter.

consensus should be at minimum 2 * token.

You might try

<totem token="500" consensus="1500" retransmits_before_loss_const="8"/>
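As a quick sanity check before pushing a changed <totem> line to the nodes, the rule of thumb above can be tested with a few lines of Python. This is only a sketch: the helper name check_consensus is made up for illustration, the factor of 2 is the rule stated above, and it assumes both attributes are present and given in milliseconds.

```python
import xml.etree.ElementTree as ET


def check_consensus(totem_xml: str, factor: float = 2.0) -> bool:
    """Return True if consensus >= factor * token.

    Hypothetical helper: assumes the <totem> element carries both
    "token" and "consensus" attributes, in milliseconds.
    """
    elem = ET.fromstring(totem_xml)
    token = int(elem.attrib["token"])
    consensus = int(elem.attrib["consensus"])
    return consensus >= factor * token


# The line suggested above: 1500 >= 2 * 500
suggested = '<totem token="500" consensus="1500" retransmits_before_loss_const="8"/>'
print(check_consensus(suggested))
```

It only checks the ratio between the two values; it says nothing about whether the timeouts are sensible for a given network.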
With qdisk, there may be other implications on timer settings.

You may have to override retransmits_before_loss_const as well. The
token timeout is divided by retransmits_before_loss_const to calculate
the token_retransmit parameter. The smallest value for any timer is
30 milliseconds because of limitations of the Linux timer
implementation.

I believe retransmits_before_loss_const is something like 20 for cman,
so in this case 500/20 = 25 msec (less than 30 msec), which will cause
aisexec to fail to start.

A safe value for retransmits_before_loss_const might be 8-10.
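
The arithmetic above can be sketched in Python to try other token values. This follows the simple division described in this mail, not the actual openais source, so treat the result as an estimate. The function names are made up for illustration, and treating exactly 30 msec as acceptable is an assumption.

```python
MIN_TIMER_MS = 30  # smallest timer value mentioned above


def token_retransmit_ms(token_ms: float, retransmits_before_loss_const: int) -> float:
    """Estimate token_retransmit as token / retransmits_before_loss_const."""
    return token_ms / retransmits_before_loss_const


def timer_ok(token_ms: float, retransmits_before_loss_const: int) -> bool:
    """True if the estimated retransmit timer is not below the 30 msec floor.

    Assumption: a value of exactly 30 msec is accepted.
    """
    return token_retransmit_ms(token_ms, retransmits_before_loss_const) >= MIN_TIMER_MS


# token=500 with the presumed cman default (20) and the safer values 10 and 8
for r in (20, 10, 8):
    print(r, token_retransmit_ms(500, r), timer_ok(500, r))
```

With token=500 this gives 25 msec for 20 retransmits (too small), 50 msec for 10 and 62.5 msec for 8.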
Again, not supported by Red Hat support, and YMMV.
Let us know how it goes.
Timer parameters need to be the same on all nodes.
Regards
-steve
>
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais