The previous default (50) was too low for most modern switch hardware. It can trigger an abort when the aru does not increase for 50 token rotations, combined with a defect in how fail-to-receive conditions are handled. With this tunable increased, the condition should no longer trigger the errant code.

Signed-off-by: Jan Friesse <[email protected]>
---
 exec/totemconfig.c  |    2 +-
 man/corosync.conf.5 |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/exec/totemconfig.c b/exec/totemconfig.c
index 5135672..80ca182 100644
--- a/exec/totemconfig.c
+++ b/exec/totemconfig.c
@@ -73,7 +73,7 @@
 #define JOIN_TIMEOUT 50
 #define MERGE_TIMEOUT 200
 #define DOWNCHECK_TIMEOUT 1000
-#define FAIL_TO_RECV_CONST 50
+#define FAIL_TO_RECV_CONST 2500
 #define SEQNO_UNCHANGED_CONST 30
 #define MINIMUM_TIMEOUT (int)(1000/HZ)*3
 #define MAX_NETWORK_DELAY 50
diff --git a/man/corosync.conf.5 b/man/corosync.conf.5
index d092064..3f8e90e 100644
--- a/man/corosync.conf.5
+++ b/man/corosync.conf.5
@@ -380,7 +380,7 @@ This constant specifies how many rotations of the token without receiving any
 of the messages when messages should be received may occur before a new
 configuration is formed.
 
-The default is 50 failures to receive a message.
+The default is 2500 failures to receive a message.
 
 .TP
 seqno_unchanged_const
-- 
1.7.1
