On 02/08/2011 05:54 PM, [email protected] wrote:
> Hi Steven,
>
>> Have a try of the patch I have sent to this ML. If the issue persists,
>> we can look at more options.
>
> Thank you for the comment.
>
> Is your patch the second of the following two?
>
> * [Openais] [PATCH] When a failed to recv state happens,stop forwarding the
>   token
>
> * [Openais] [PATCH] When a failed to recv state happens,stop forwarding the
>   token(take 2)
>
> Best Regards,
> Hideo Yamauchi.
Yes, the take 2 version.

Regards
-steve

> --- Steven Dake <[email protected]> wrote:
>
>> On 02/07/2011 11:11 PM, [email protected] wrote:
>>> Hi Steven,
>>>
>>> I misunderstood your opinion.
>>>
>>> We do not have a simple test case.
>>>
>>> The phenomenon observed in our environment is as follows.
>>>
>>> Step 1) corosync forms a cluster of 12 nodes.
>>>   * Token communication begins.
>>>
>>> Step 2) One node raises [FAILED TO RECEIVE].
>>>
>>> Step 3) The 12 nodes begin reconfiguring the cluster.
>>>
>>> Step 4) The node that raised [FAILED TO RECEIVE] fails to reach consensus
>>> during the JOIN communication.
>>>   * Because the node failed to reach consensus, the contents of its
>>>     failed list and proc list become the same.
>>>   * When this node compares the failed list with the proc list, the
>>>     assert fails.
>>>
>>> When such a node forms a single-node cluster on its own, I think that
>>> assert() is unnecessary, because the following processing exists.
>>
>> Have a try of the patch I have sent to this ML. If the issue persists,
>> we can look at more options.
>>
>> Thanks!
>> -steve
>>
>>> static void memb_join_process (
>>>     struct totemsrp_instance *instance,
>>>     const struct memb_join *memb_join)
>>> {
>>>     struct srp_addr *proc_list;
>>>     struct srp_addr *failed_list;
>>> (snip)
>>>     instance->failed_to_recv = 0;
>>>     srp_addr_copy (&instance->my_proc_list[0],
>>>         &instance->my_id);
>>>     instance->my_proc_list_entries = 1;
>>>     instance->my_failed_list_entries = 0;
>>>
>>>     memb_state_commit_token_create (instance);
>>>
>>>     memb_state_commit_enter (instance);
>>>     return;
>>> (snip)
>>>
>>> Best Regards,
>>> Hideo Yamauchi.
>>>
>>> --- [email protected] wrote:
>>>
>>>> Hi Steven,
>>>>
>>>>> Hideo,
>>>>>
>>>>> If you have a test case, I can make a patch for you to try.
>>>>
>>>> All right.
>>>>
>>>> We use corosync 1.3.0.
>>>>
>>>> Please send me the patch.
>>>>
>>>> Best Regards,
>>>> Hideo Yamauchi.
>>>>
>>>> --- Steven Dake <[email protected]> wrote:
>>>>
>>>>> On 02/06/2011 09:16 PM, [email protected] wrote:
>>>>>> Hi Steven,
>>>>>> Hi Dejan,
>>>>>>
>>>>>>>>>> This code never got a chance to run because on failed_to_recv
>>>>>>>>>> the two sets (my_proc_list and my_failed_list) are equal, which
>>>>>>>>>> makes the assert fail in memb_consensus_agreed():
>>>>>>
>>>>>> The same problem occurs for us, and we are troubled by it, too.
>>>>>>
>>>>>> How did this discussion turn out?
>>>>>>
>>>>>> Best Regards,
>>>>>> Hideo Yamauchi.
>>>>>
>>>>> Hideo,
>>>>>
>>>>> If you have a test case, I can make a patch for you to try.
>>>>>
>>>>> Regards
>>>>> -steve
>>>>>
>>>>>> --- Dejan Muhamedagic <[email protected]> wrote:
>>>>>>
>>>>>>> nudge, nudge
>>>>>>>
>>>>>>> On Wed, Jan 05, 2011 at 02:05:55PM +0100, Dejan Muhamedagic wrote:
>>>>>>>> On Tue, Jan 04, 2011 at 01:53:00PM -0700, Steven Dake wrote:
>>>>>>>>> On 12/23/2010 06:14 AM, Dejan Muhamedagic wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> On Wed, Dec 01, 2010 at 05:30:44PM +0200, Vladislav Bogdanov wrote:
>>>>>>>>>>> 01.12.2010 16:32, Dejan Muhamedagic wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Nov 23, 2010 at 12:53:42PM +0200, Vladislav Bogdanov wrote:
>>>>>>>>>>>>> Hi Steven, hi all.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I often see this assert on one of the nodes after I stop
>>>>>>>>>>>>> corosync on another node in a newly set-up 4-node cluster.
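The assert that both Hideo (Step 4 above) and Vladislav are hitting can be reproduced in miniature: once the failed list has grown to equal the proc list, the set subtraction in memb_consensus_agreed() leaves an empty token membership, so assert (token_memb_entries >= 1) must abort. A self-contained sketch of that arithmetic follows, using simplified integer ids and a hypothetical set_subtract helper standing in for memb_set_subtract; it is not the actual totemsrp code.

    #include <assert.h>
    #include <stdio.h>

    /* Simplified stand-in for memb_set_subtract: copy to 'out' every
     * entry of 'full' that does not also appear in 'fail'. */
    static void set_subtract (int *out, int *out_entries,
                              const int *full, int full_entries,
                              const int *fail, int fail_entries)
    {
            int i, j, found;

            *out_entries = 0;
            for (i = 0; i < full_entries; i++) {
                    found = 0;
                    for (j = 0; j < fail_entries; j++) {
                            if (full[i] == fail[j]) {
                                    found = 1;
                                    break;
                            }
                    }
                    if (!found) {
                            out[(*out_entries)++] = full[i];
                    }
            }
    }

    int main (void)
    {
            int proc_list[]   = {1, 2, 3};
            int failed_list[] = {1, 2, 3}; /* equal sets after FAILED TO RECEIVE */
            int token_memb[3];
            int token_memb_entries;

            set_subtract (token_memb, &token_memb_entries,
                    proc_list, 3, failed_list, 3);

            printf ("token_memb_entries = %d\n", token_memb_entries); /* 0 */
            assert (token_memb_entries >= 1); /* aborts, as at totemsrp.c:1195 */
            return 0;
    }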
>>>>>>>>>>>> Does the assert happen on a node lost event? Or once a new
>>>>>>>>>>>> partition is formed?
>>>>>>>>>>>
>>>>>>>>>>> I first noticed it when I rebooted another node, just after the
>>>>>>>>>>> console said that OpenAIS was stopped.
>>>>>>>>>>>
>>>>>>>>>>> I can't say right now exactly what event it followed; I'm
>>>>>>>>>>> actually fighting with several problems with corosync, pacemaker,
>>>>>>>>>>> NFS4 and phantom uncorrectable ECC errors simultaneously, and
>>>>>>>>>>> I'm a bit lost with all of them.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> #0  0x00007f51953e49a5 in raise () from /lib64/libc.so.6
>>>>>>>>>>>>> #1  0x00007f51953e6185 in abort () from /lib64/libc.so.6
>>>>>>>>>>>>> #2  0x00007f51953dd935 in __assert_fail () from /lib64/libc.so.6
>>>>>>>>>>>>> #3  0x00007f5196176406 in memb_consensus_agreed
>>>>>>>>>>>>> (instance=0x7f5196554010) at totemsrp.c:1194
>>>>>>>>>>>>> #4  0x00007f519617b2f3 in memb_join_process (instance=0x7f5196554010,
>>>>>>>>>>>>> memb_join=0x262f628) at totemsrp.c:3918
>>>>>>>>>>>>> #5  0x00007f519617b619 in message_handler_memb_join
>>>>>>>>>>>>> (instance=0x7f5196554010, msg=<value optimized out>, msg_len=<value
>>>>>>>>>>>>> optimized out>, endian_conversion_needed=<value optimized out>)
>>>>>>>>>>>>> at totemsrp.c:4161
>>>>>>>>>>>>> #6  0x00007f5196173ba7 in passive_mcast_recv (rrp_instance=0x2603030,
>>>>>>>>>>>>> iface_no=0, context=<value optimized out>, msg=<value optimized out>,
>>>>>>>>>>>>> msg_len=<value optimized out>) at totemrrp.c:720
>>>>>>>>>>>>> #7  0x00007f5196172b44 in rrp_deliver_fn (context=<value optimized out>,
>>>>>>>>>>>>> msg=0x262f628, msg_len=420) at totemrrp.c:1404
>>>>>>>>>>>>> #8  0x00007f5196171a76 in net_deliver_fn (handle=<value optimized out>,
>>>>>>>>>>>>> fd=<value optimized out>, revents=<value optimized out>, data=0x262ef80)
>>>>>>>>>>>>> at totemudp.c:1244
>>>>>>>>>>>>> #9  0x00007f519616d7f2 in poll_run (handle=4858364909567606784) at
>>>>>>>>>>>>> coropoll.c:510
>>>>>>>>>>>>> #10 0x0000000000406add in main (argc=<value optimized out>, argv=<value
>>>>>>>>>>>>> optimized out>, envp=<value optimized out>) at main.c:1680
>>>>>>>>>>>>>
>>>>>>>>>>>>> The last fplay lines are:
>>>>>>>>>>>>>
>>>>>>>>>>>>> rec=[36124] Log Message=Delivering MCAST message with seq 1366 to
>>>>>>>>>>>>> pending delivery queue
>>>>>>>>>>>>> rec=[36125] Log Message=Delivering MCAST message with seq 1367 to
>>>>>>>>>>>>> pending delivery queue
>>>>>>>>>>>>> rec=[36126] Log Message=Received ringid(10.5.4.52:12660) seq 1366
>>>>>>>>>>>>> rec=[36127] Log Message=Received ringid(10.5.4.52:12660) seq 1367
>>>>>>>>>>>>> rec=[36128] Log Message=Received ringid(10.5.4.52:12660) seq 1366
>>>>>>>>>>>>> rec=[36129] Log Message=Received ringid(10.5.4.52:12660) seq 1367
>>>>>>>>>>>>> rec=[36130] Log Message=releasing messages up to and including 1367
>>>>>>>>>>>>> rec=[36131] Log Message=FAILED TO RECEIVE
>>>>>>>>>>>>> rec=[36132] Log Message=entering GATHER state from 6.
>>>>>>>>>>>>> rec=[36133] Log Message=entering GATHER state from 0.
>>>>>>>>>>>>> Finishing replay: records found [33993]
>>>>>>>>>>>>>
>>>>>>>>>>>>> What could be the reason for this? Bug, switches, memory errors?
>>>>>>>>>>>>
>>>>>>>>>>>> The assertion fails because corosync finds out that
>>>>>>>>>>>> instance->my_proc_list and instance->my_failed_list are equal.
>>>>>>>>>>>> That happens immediately after the "FAILED TO RECEIVE" message,
>>>>>>>>>>>> which is issued when fail_recv_const token rotations have passed
>>>>>>>>>>>> without any multicast packet being received (defaults to 50).
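As a rough sketch of the mechanism Dejan describes here - a count of token rotations without multicast traffic that latches failed_to_recv when it reaches fail_recv_const - the idea in simplified form (hypothetical struct and function names, not the actual totemsrp token path):

    #include <stdio.h>

    /* Simplified sketch: each full token rotation with no multicast
     * received bumps a counter; when it reaches fail_recv_const
     * (default 50) the node latches failed_to_recv, logs "FAILED TO
     * RECEIVE" and re-enters the GATHER state. */
    #define FAIL_RECV_CONST 50

    struct node_state {
            int rotations_without_mcast;
            int failed_to_recv;
    };

    static void on_token_rotation (struct node_state *n, int mcast_received)
    {
            if (mcast_received) {
                    n->rotations_without_mcast = 0;
                    return;
            }
            if (++n->rotations_without_mcast >= FAIL_RECV_CONST &&
                !n->failed_to_recv) {
                    n->failed_to_recv = 1;
                    printf ("FAILED TO RECEIVE\n"); /* then enter GATHER */
            }
    }

    int main (void)
    {
            struct node_state n = { 0, 0 };
            int i;

            /* 50 silent token rotations trip the condition. */
            for (i = 0; i < FAIL_RECV_CONST; i++) {
                    on_token_rotation (&n, 0);
            }
            return 0;
    }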
>>>>>>>>>>
>>>>>>>>>> I took a look at the code and the protocol specification again,
>>>>>>>>>> and it seems that the assert is no longer valid since Steve
>>>>>>>>>> patched the part dealing with the "FAILED TO RECEIVE" condition.
>>>>>>>>>> The patch is from 2010-06-03, posted to the list here:
>>>>>>>>>> http://marc.info/?l=openais&m=127559807608484&w=2
>>>>>>>>>>
>>>>>>>>>> The last hunk of the patch contains this code (exec/totemsrp.c):
>>>>>>>>>>
>>>>>>>>>> 3933         if (memb_consensus_agreed (instance) &&
>>>>>>>>>>                  instance->failed_to_recv == 1) {
>>>>>>>>>> 3934                 instance->failed_to_recv = 0;
>>>>>>>>>> 3935                 srp_addr_copy (&instance->my_proc_list[0],
>>>>>>>>>> 3936                         &instance->my_id);
>>>>>>>>>> 3937                 instance->my_proc_list_entries = 1;
>>>>>>>>>> 3938                 instance->my_failed_list_entries = 0;
>>>>>>>>>> 3939
>>>>>>>>>> 3940                 memb_state_commit_token_create (instance);
>>>>>>>>>> 3941
>>>>>>>>>> 3942                 memb_state_commit_enter (instance);
>>>>>>>>>> 3943                 return;
>>>>>>>>>> 3944         }
>>>>>>>>>>
>>>>>>>>>> This code never got a chance to run, because on failed_to_recv
>>>>>>>>>> the two sets (my_proc_list and my_failed_list) are equal, which
>>>>>>>>>> makes the assert fail in memb_consensus_agreed():
>>>>>>>>>>
>>>>>>>>>> 1185         memb_set_subtract (token_memb, &token_memb_entries,
>>>>>>>>>> 1186                 instance->my_proc_list, instance->my_proc_list_entries,
>>>>>>>>>> 1187                 instance->my_failed_list, instance->my_failed_list_entries);
>>>>>>>>>> ...
>>>>>>>>>> 1195         assert (token_memb_entries >= 1);
>>>>>>>>>>
>>>>>>>>>> In other words, it's something like this:
>>>>>>>>>>
>>>>>>>>>> if A:
>>>>>>>>>>     if memb_consensus_agreed() and failed_to_recv:
>>>>>>>>>>         form a single-node ring and try to recover
>>>>>>>>>>
>>>>>>>>>> memb_consensus_agreed():
>>>>>>>>>>     assert(!A)
>>>>>>>>>>
>>>>>>>>>> Steve, can you take a look and confirm that this holds?
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Dejan,
>>>>>>>>>
>>>>>>>>> sorry for the delay in response - big backlog which is mostly
>>>>>>>>> cleared out :)
>>>>>>>>
>>>>>>>> No problem.
>>>>>>>>
>>>>>>>>> The assert definitely isn't correct, but removing it without
>>>>>>>>> addressing the contents of the proc and fail lists is also not
>>>>>>>>> right. That would cause the logic in the if statement at line 3933
>>>>>>>>> not to be executed (because the first part of the if would
>>>>>>>>> evaluate to false).
>>>>>>>>
>>>>>>>> Actually it wouldn't. The agreed variable is set to 1 and it
>>>>>>>> is going to be returned unchanged.
>>>>>>>>
>>>>>>>>> I believe what we should do is check the "failed_to_recv" value in
>>>>>>>>> memb_consensus_agreed instead of at line 3933.
>>>>>>>>>
>>>>>>>>> The issue with this is memb_state_consensus_timeout_expired, which
>>>>>>>>> also executes some 'then' logic where we may not want to execute
>>>>>>>>> the failed_to_recv logic.
>>>>>>>>
>>>>>>>> Perhaps we should just use
>>>>>>>>
>>>>>>>> 3933         if (instance->failed_to_recv == 1) {
>>>>>>>>
>>>>>>>> ? In the failed_to_recv case both proc and fail lists are equal, so
>>>>>>>> checking memb_consensus_agreed won't make sense, right?
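Dejan's suggestion amounts to replacing the guard on the recovery hunk so that memb_consensus_agreed() (and its assert) is never consulted in the failed_to_recv case. A sketch of what that hunk from memb_join_process() would look like with his change applied, assembled from the code quoted above (an untested sketch, not a reviewed patch):

    /* Untested sketch of the line-3933 hunk in exec/totemsrp.c with
     * Dejan's proposed guard: when failed_to_recv is set, the proc and
     * fail lists are equal by construction, so the consensus check is
     * dropped and the node forms a single-node ring directly. */
    if (instance->failed_to_recv == 1) {
            instance->failed_to_recv = 0;
            srp_addr_copy (&instance->my_proc_list[0],
                    &instance->my_id);
            instance->my_proc_list_entries = 1;
            instance->my_failed_list_entries = 0;

            memb_state_commit_token_create (instance);

            memb_state_commit_enter (instance);
            return;
    }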
>>>>>>>>> If anyone has a reliable reproducer and can forward it to me,
>>>>>>>>> I'll test out a change to address this problem. I'm really
>>>>>>>>> hesitant to change anything in totemsrp without a test case for
>>>>>>>>> this problem - it's almost perfect ;-)
>>>>>>>>
>>>>>>>> Since the tester upgraded the switch firmware they couldn't
>>>>>>>> reproduce it anymore.
>>>>>>>>
>>>>>>>> Would compiling with these help?
>>>>>>>>
>>>>>>>> /*
>>>>>>>>  * These can be used to test the error recovery algorithms
>>>>>>>>  * #define TEST_DROP_ORF_TOKEN_PERCENTAGE 30
>>>>>>>>  * #define TEST_DROP_COMMIT_TOKEN_PERCENTAGE 30
>>>>>>>>  * #define TEST_DROP_MCAST_PERCENTAGE 50
>>>>>>>>  * #define TEST_RECOVERY_MSG_COUNT 300
>>>>>>>>  */
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Dejan
>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> -steve
>>>>>>>>>
>>>>>>>>>> Dejan

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
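For what it's worth, drop-percentage knobs of this kind are typically wired into a receive or transmit path as a randomized early return. A hypothetical sketch of the idea (illustration only, not the actual corosync test code):

    #include <stdio.h>
    #include <stdlib.h>

    #define TEST_DROP_MCAST_PERCENTAGE 50

    /* Hypothetical helper: returns nonzero for roughly
     * TEST_DROP_MCAST_PERCENTAGE percent of calls. A receive handler
     * would discard the message when this returns nonzero, as if it
     * were lost on the wire, to exercise the error recovery code
     * (e.g. to drive a node toward FAILED TO RECEIVE). */
    static int test_should_drop_mcast (void)
    {
            return ((rand () % 100) < TEST_DROP_MCAST_PERCENTAGE);
    }

    int main (void)
    {
            int i, dropped = 0;

            srand (1);
            for (i = 0; i < 1000; i++) {
                    dropped += test_should_drop_mcast ();
            }
            printf ("dropped %d of 1000 simulated messages\n", dropped);
            return 0;
    }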
