On 08/03/2011 10:32 PM, Tim Beale wrote:
> Hi,
>
> It looks to me that the way the transition from Recovery to Operational works,
> we can't guarantee that all nodes in the ring have entered Operational before
> a node processes another Memb-Join message from a new node. E.g. we can't
> guarantee the token has rotated right the way around the ring.
>
> When this happens, the nodes still in Recovery will still use the older ring
> ID. So they won't get added to the transitional membership, and CLM will
> report leave events for these nodes. (Plus there might be other side-effects,
> like the FAILED TO RECEIVE problem - I haven't quite worked out why that's
> happening).
>

Thanks for the pointer here - patch on ml.

> We are currently using CLM to check the health of a node, i.e. so we can
> detect if it locks up. My questions are:
> i) Are there config settings we could change to improve this, like increasing
> the 'join' timeout?
> ii) Should I try to make a code change to fix the problem? E.g. delay
> processing the Memb-Join message if the node's only just entered operational.
> iii) Should we not be using CLM like this? I.e. should we just learn to live
> with CLM/CPG sometimes reporting nodes as leaving when they're perfectly
> healthy.
>
> Thanks for your help.
> Tim
>

Tim please try the patch I have recently posted:

[PATCH] Set my_new_memb_list in recovery enter

First and foremost, let me know if it resolves your 10 node startup case
which fails 10% of the time. Then let me know if it treats other symptoms.

Regards
-steve

> On Wed, Aug 3, 2011 at 3:28 PM, Tim Beale <[email protected]> wrote:
>> Hi,
>>
>> We're booting up a 10-node cluster (with all nodes starting corosync at
>> roughly the same time) and approx 1 in 10 times we see some problems:
>> a) CLM is reporting nodes as leaving and then immediately rejoining (not sure
>> if this is valid behaviour?)
>> b) Probably an unrelated oddity, but we're getting flow control enabled on a
>> client daemon using CLM that's only sending one request
>> (saClmClusterTrack()).
>> c) A node is hitting the FAILED TO RECEIVE case
>> d) After c) there seems to be a lot of churn as the cluster tries to reform
>> e) During the processing of node leave events, the CPG client can sometimes
>> get broken so it no longer processes *any* CPG events
>>
>> Corosync debug is attached (I commented out some of the noisier debug around
>> message delivery). We don't really know enough about corosync to tell what
>> exactly is incorrect behaviour and what should be fixed. But here's what
>> we've noticed:
>> 1). Node-4 joins soon after node-1.
>> When this happens all nodes except node-12 have entered operational state
>> (see node-12.txt line 235). It looks like maybe node-12 hasn't received
>> enough rotations of the token to enter operational yet. Node-12's resulting
>> transitional config consists of just itself. All nodes then report node-1
>> and node-12 as leaving and immediately rejoining.
>> 2) After this config change, node-3 eventually hits the FAILED TO RECEIVE
>> case (node-3.txt line 380). At this point node-1 and node-12 have an ARU
>> matching the high_seq_received, all other nodes have an ARU of zero.
>> 3) Node-3 entering gather seems to result in a lot of config change churn
>> across the cluster.
>> 4) While processing the config changes on node-3, the CPG downlist it uses
>> contains itself. When node-3 sends leave events for the nodes in the downlist
>> (including itself), it sets its own cpd state to CPD_STATE_UNJOINED and
>> clears the cpd->group_name. This means it no longer sends any CPG events to
>> the CPG client.
>>
>> We tried cherry-picking this commit to fix the problem (#4) with the CPG
>> client.
>> http://www.corosync.org/git/?p=corosync.git;a=commit;h=956a1dcb4236acbba37c07e2ac0b6c9ffcb32577
>> It helped a bit, but didn't fix it completely. We've made an interim change
>> (attached) to avoid this problem.
>>
>> We're using corosync v1.3.1 on an embedded linux system (with a low-spec
>> CPU). Corosync is running over a basic ethernet interface (no
>> hubs/routers/etc).
>>
>> Any help would be appreciated. Let me know if there's any other debug I can
>> provide.
>>
>> Thanks,
>> Tim
>>
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
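
To illustrate the ring-ID point raised at the top of this thread, here is a
minimal Python sketch of how a transitional membership could be derived. It is
a simplified model, not corosync code: the Member type, the function names and
the ring ID values are all hypothetical. It assumes the transitional membership
is the set of new-ring members that report the same previous ring ID as the
local node, and that a leave event is reported for every old member missing
from that set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    node: str         # node name (illustrative only)
    old_ring_id: int  # ring ID the node was on before this membership change


def transitional_membership(my_old_ring_id, new_members):
    """Nodes in the new ring that came from the same old ring as the local node."""
    return {m.node for m in new_members if m.old_ring_id == my_old_ring_id}


def reported_leaves(old_members, transitional):
    """Old members that did not make it into the transitional membership."""
    return set(old_members) - set(transitional)


# Hypothetical ring IDs: 8 is the ring that just formed, 4 is the one before it.
old_members = {"node-1", "node-2", "node-3", "node-12"}
new_members = [
    Member("node-1", 8),
    Member("node-2", 8),
    Member("node-3", 8),
    Member("node-12", 4),  # still in Recovery, so still on the older ring ID
    Member("node-4", 0),   # the newly joining node (no previous ring)
]

# View from a node that already reached Operational on ring 8.
trans = transitional_membership(8, new_members)
left = reported_leaves(old_members, trans)

# View from node-12, which is still on ring 4.
trans_node12 = transitional_membership(4, new_members)
```

Under these assumptions the Operational nodes drop node-12 from their
transitional membership even though it is healthy, and node-12's own
transitional membership contains only itself, which matches the "transitional
config consists of just itself" observation in the original report.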
