Hi Steven,

> Have a try of the patch i have sent to this ml. If the issue persists,
> we can look at more options.

Thank you for the comment. Is your patch the second of these two?

 * [Openais] [PATCH] When a failed to recv state happens, stop forwarding the token
 * [Openais] [PATCH] When a failed to recv state happens, stop forwarding the token (take 2)

Best Regards,
Hideo Yamauchi.

--- Steven Dake <sd...@redhat.com> wrote:

> On 02/07/2011 11:11 PM, renayama19661...@ybb.ne.jp wrote:
> > Hi Steven,
> >
> > I misunderstood your opinion.
> >
> > We do not have a simple test case.
> >
> > The phenomenon observed in our environment is the following:
> >
> > Step 1) corosync forms a cluster of 12 nodes.
> >         * TOKEN communication begins.
> >
> > Step 2) One node raises [FAILED TO RECEIVE].
> >
> > Step 3) The 12 nodes begin reconfiguration of the cluster.
> >
> > Step 4) The node that raised [FAILED TO RECEIVE] fails in the consensus
> >         of the JOIN communication.
> >         * Because the node failed in the consensus, it makes the contents
> >           of its failed list and its proc list the same.
> >         * This node then compares the failed list with the proc list, and
> >           the assert fails.
> >
> > When the node that formed a cluster stands alone, I think that the
> > assert() is unnecessary, because the following processing exists:
>
> Have a try of the patch i have sent to this ml. If the issue persists,
> we can look at more options.
>
> Thanks!
> -steve
>
> >
> > static void memb_join_process (
> >         struct totemsrp_instance *instance,
> >         const struct memb_join *memb_join)
> > {
> >         struct srp_addr *proc_list;
> >         struct srp_addr *failed_list;
> > (snip)
> >                 instance->failed_to_recv = 0;
> >                 srp_addr_copy (&instance->my_proc_list[0],
> >                         &instance->my_id);
> >                 instance->my_proc_list_entries = 1;
> >                 instance->my_failed_list_entries = 0;
> >
> >                 memb_state_commit_token_create (instance);
> >
> >                 memb_state_commit_enter (instance);
> >                 return;
> >
> > (snip)
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> > --- renayama19661...@ybb.ne.jp wrote:
> >
> >> Hi Steven,
> >>
> >>> Hideo,
> >>>
> >>> If you have a test case, I can make a patch for you to try.
> >>>
> >>
> >> All right.
> >>
> >> We use corosync 1.3.0.
> >>
> >> Please send me the patch.
> >>
> >> Best Regards,
> >> Hideo Yamauchi.
> >>
> >> --- Steven Dake <sd...@redhat.com> wrote:
> >>
> >>> On 02/06/2011 09:16 PM, renayama19661...@ybb.ne.jp wrote:
> >>>> Hi Steven,
> >>>> Hi Dejan,
> >>>>
> >>>>>>>> This code never got a chance to run because on failed_to_recv
> >>>>>>>> the two sets (my_proc_list and my_failed_list) are equal, which
> >>>>>>>> makes the assert fail in memb_consensus_agreed():
> >>>>
> >>>> The same problem occurs, and we are troubled by it, too.
> >>>>
> >>>> How did this discussion turn out?
> >>>>
> >>>> Best Regards,
> >>>> Hideo Yamauchi.
> >>>>
> >>>
> >>> Hideo,
> >>>
> >>> If you have a test case, I can make a patch for you to try.
> >>>
> >>> Regards
> >>> -steve
> >>>
> >>>>
> >>>> --- Dejan Muhamedagic <de...@suse.de> wrote:
> >>>>
> >>>>> nudge, nudge
> >>>>>
> >>>>> On Wed, Jan 05, 2011 at 02:05:55PM +0100, Dejan Muhamedagic wrote:
> >>>>>> On Tue, Jan 04, 2011 at 01:53:00PM -0700, Steven Dake wrote:
> >>>>>>> On 12/23/2010 06:14 AM, Dejan Muhamedagic wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> On Wed, Dec 01, 2010 at 05:30:44PM +0200, Vladislav Bogdanov wrote:
> >>>>>>>>> 01.12.2010 16:32, Dejan Muhamedagic wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Nov 23, 2010 at 12:53:42PM +0200, Vladislav Bogdanov wrote:
> >>>>>>>>>>> Hi Steven, hi all.
> >>>>>>>>>>>
> >>>>>>>>>>> I often see this assert on one of the nodes after I stop corosync
> >>>>>>>>>>> on some other node in a newly set up 4-node cluster.
> >>>>>>>>>>
> >>>>>>>>>> Does the assert happen on a node lost event? Or once a new
> >>>>>>>>>> partition is formed?
> >>>>>>>>>
> >>>>>>>>> I first noticed it when I rebooted another node, just after the
> >>>>>>>>> console said that OpenAIS was stopped.
> >>>>>>>>>
> >>>>>>>>> Can't say right now what event exactly it followed; I'm actually
> >>>>>>>>> fighting with several problems with corosync, pacemaker, NFS4 and
> >>>>>>>>> phantom uncorrectable ECC errors simultaneously, and I'm a bit lost
> >>>>>>>>> with all of them.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> #0  0x00007f51953e49a5 in raise () from /lib64/libc.so.6
> >>>>>>>>>>> #1  0x00007f51953e6185 in abort () from /lib64/libc.so.6
> >>>>>>>>>>> #2  0x00007f51953dd935 in __assert_fail () from /lib64/libc.so.6
> >>>>>>>>>>> #3  0x00007f5196176406 in memb_consensus_agreed
> >>>>>>>>>>>     (instance=0x7f5196554010) at totemsrp.c:1194
> >>>>>>>>>>> #4  0x00007f519617b2f3 in memb_join_process (instance=0x7f5196554010,
> >>>>>>>>>>>     memb_join=0x262f628) at totemsrp.c:3918
> >>>>>>>>>>> #5  0x00007f519617b619 in message_handler_memb_join
> >>>>>>>>>>>     (instance=0x7f5196554010, msg=<value optimized out>,
> >>>>>>>>>>>     msg_len=<value optimized out>,
> >>>>>>>>>>>     endian_conversion_needed=<value optimized out>) at totemsrp.c:4161
> >>>>>>>>>>> #6  0x00007f5196173ba7 in passive_mcast_recv (rrp_instance=0x2603030,
> >>>>>>>>>>>     iface_no=0, context=<value optimized out>, msg=<value optimized out>,
> >>>>>>>>>>>     msg_len=<value optimized out>) at totemrrp.c:720
> >>>>>>>>>>> #7  0x00007f5196172b44 in rrp_deliver_fn (context=<value optimized out>,
> >>>>>>>>>>>     msg=0x262f628, msg_len=420) at totemrrp.c:1404
> >>>>>>>>>>> #8  0x00007f5196171a76 in net_deliver_fn (handle=<value optimized out>,
> >>>>>>>>>>>     fd=<value optimized out>, revents=<value optimized out>,
> >>>>>>>>>>>     data=0x262ef80) at totemudp.c:1244
> >>>>>>>>>>> #9  0x00007f519616d7f2 in poll_run (handle=4858364909567606784)
> >>>>>>>>>>>     at coropoll.c:510
> >>>>>>>>>>> #10 0x0000000000406add in main (argc=<value optimized out>,
> >>>>>>>>>>>     argv=<value optimized out>, envp=<value optimized out>)
> >>>>>>>>>>>     at main.c:1680
> >>>>>>>>>>>
> >>>>>>>>>>> Last fplay lines are:
> >>>>>>>>>>>
> >>>>>>>>>>> rec=[36124] Log Message=Delivering MCAST message with seq 1366 to
> >>>>>>>>>>> pending delivery queue
> >>>>>>>>>>> rec=[36125] Log Message=Delivering MCAST message with seq 1367 to
> >>>>>>>>>>> pending delivery queue
> >>>>>>>>>>> rec=[36126] Log Message=Received ringid(10.5.4.52:12660) seq 1366
> >>>>>>>>>>> rec=[36127] Log Message=Received ringid(10.5.4.52:12660) seq 1367
> >>>>>>>>>>> rec=[36128] Log Message=Received ringid(10.5.4.52:12660) seq 1366
> >>>>>>>>>>> rec=[36129] Log Message=Received ringid(10.5.4.52:12660) seq 1367
> >>>>>>>>>>> rec=[36130] Log Message=releasing messages up to and including 1367
> >>>>>>>>>>> rec=[36131] Log Message=FAILED TO RECEIVE
> >>>>>>>>>>> rec=[36132] Log Message=entering GATHER state from 6.
> >>>>>>>>>>> rec=[36133] Log Message=entering GATHER state from 0.
> >>>>>>>>>>> Finishing replay: records found [33993]
> >>>>>>>>>>>
> >>>>>>>>>>> What could be the reason for this? A bug, switches, memory errors?
> >>>>>>>>>>
> >>>>>>>>>> The assertion fails because corosync finds out that
> >>>>>>>>>> instance->my_proc_list and instance->my_failed_list are
> >>>>>>>>>> equal. That happens immediately after the "FAILED TO RECEIVE"
> >>>>>>>>>> message, which is issued when fail_recv_const token rotations
> >>>>>>>>>> happened without any multicast packet received (defaults to 50).
> >>>>>>>>
> >>>>>>>> I took a look at the code and the protocol specification again,
> >>>>>>>> and it seems like that assert is not valid since Steve patched
> >>>>>>>> the part dealing with the "FAILED TO RECEIVE" condition. The
> >>>>>>>> patch is from 2010-06-03, posted to the list here:
> >>>>>>>> http://marc.info/?l=openais&m=127559807608484&w=2
> >>>>>>>>
> >>>>>>>> The last hunk of the patch contains this code (exec/totemsrp.c):
> >>>>>>>>
> >>>>>>>> 3933         if (memb_consensus_agreed (instance) &&
> >>>>>>>>                  instance->failed_to_recv == 1) {
> >>>>>>>> 3934                 instance->failed_to_recv = 0;
> >>>>>>>> 3935                 srp_addr_copy (&instance->my_proc_list[0],
> >>>>>>>> 3936                         &instance->my_id);
> >>>>>>>> 3937                 instance->my_proc_list_entries = 1;
> >>>>>>>> 3938                 instance->my_failed_list_entries = 0;
> >>>>>>>> 3939
> >>>>>>>> 3940                 memb_state_commit_token_create (instance);
> >>>>>>>> 3941
> >>>>>>>> 3942                 memb_state_commit_enter (instance);
> >>>>>>>> 3943                 return;
> >>>>>>>> 3944         }
> >>>>>>>>
> >>>>>>>> This code never got a chance to run, because on failed_to_recv
> >>>>>>>> the two sets (my_proc_list and my_failed_list) are equal, which
> >>>>>>>> makes the assert fail in memb_consensus_agreed():
> >>>>>>>>
> >>>>>>>> 1185         memb_set_subtract (token_memb, &token_memb_entries,
> >>>>>>>> 1186                 instance->my_proc_list, instance->my_proc_list_entries,
> >>>>>>>> 1187                 instance->my_failed_list, instance->my_failed_list_entries);
> >>>>>>>> ...
> >>>>>>>> 1195         assert (token_memb_entries >= 1);
> >>>>>>>>
> >>>>>>>> In other words, it's something like this:
> >>>>>>>>
> >>>>>>>>     if A:
> >>>>>>>>         if memb_consensus_agreed() and failed_to_recv:
> >>>>>>>>             form a single node ring and try to recover
> >>>>>>>>
> >>>>>>>>     memb_consensus_agreed():
> >>>>>>>>         assert(!A)
> >>>>>>>>
> >>>>>>>> Steve, can you take a look and confirm that this holds?
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>>
> >>>>>>>
> >>>>>>> Dejan,
> >>>>>>>
> >>>>>>> sorry for the delay in response - big backlog which is mostly cleared out :)
> >>>>>>
> >>>>>> No problem.
> >>>>>>
> >>>>>>> The assert definitely isn't correct, but removing it without addressing
> >>>>>>> the contents of the proc and fail lists is also not right. That would
> >>>>>>> cause the logic in the if statement at line 3933 not to be executed
> >>>>>>> (because the first part of the if would evaluate to false).
> >>>>>>
> >>>>>> Actually it wouldn't. The agreed variable is set to 1 and it
> >>>>>> is going to be returned unchanged.
> >>>>>>
> >>>>>>> I believe what we should do is check the "failed_to_recv" value in
> >>>>>>> memb_consensus_agreed instead of at line 3933.
> >>>>>>>
> >>>>>>> The issue with this is memb_state_consensus_timeout_expired, which also
> >>>>>>> executes some 'then' logic where we may not want to execute the
> >>>>>>> failed_to_recv logic.
> >>>>>>
> >>>>>> Perhaps we should just use
> >>>>>>
> >>>>>> 3933         if (instance->failed_to_recv == 1) {
> >>>>>>
> >>>>>> ? In the failed_to_recv case both the proc and fail lists are equal,
> >>>>>> so checking memb_consensus_agreed wouldn't make sense, right?
> >>>>>>
> >>>>>>> If anyone has a reliable reproducer and can forward it to me, I'll
> >>>>>>> test out a change to address this problem. I'm really hesitant to
> >>>>>>> change anything in totemsrp without a test case for this problem -
> >>>>>>> it's almost perfect ;-)
> >>>>>>
> >>>>>> Since the tester upgraded the switch firmware they couldn't
> >>>>>> reproduce it anymore.
> >>>>>>
> >>>>>> Would compiling with these help?
> >>>>>>
> >>>>>> /*
> >>>>>>  * These can be used to test the error recovery algorithms
> >>>>>>  * #define TEST_DROP_ORF_TOKEN_PERCENTAGE 30
> >>>>>>  * #define TEST_DROP_COMMIT_TOKEN_PERCENTAGE 30
> >>>>>>  * #define TEST_DROP_MCAST_PERCENTAGE 50
> >>>>>>  * #define TEST_RECOVERY_MSG_COUNT 300
> >>>>>>  */
> >>>>>>
> >>>>>> Cheers,
> >>>>>>
> >>>>>> Dejan
> >>>>>>
> >>>>>>> Regards
> >>>>>>> -steve
> >>>>>>>
> >>>>>>>> Dejan
> >>>>>>>> _______________________________________________
> >>>>>>>> Openais mailing list
> >>>>>>>> Openais@lists.linux-foundation.org
> >>>>>>>> https://lists.linux-foundation.org/mailman/listinfo/openais

_______________________________________________
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/openais