Hi Steven,

> Try the patch I have sent to this ml.  If the issue persists,
> we can look at more options.

Thank you for your comment.

Is your patch the second of the following two?

 * [Openais] [PATCH] When a failed to recv state happens,stop forwarding the 
token

 * [Openais] [PATCH] When a failed to recv state happens,stop forwarding the 
token(take 2)

Best Regards,
Hideo Yamauchi.



--- Steven Dake <sd...@redhat.com> wrote:

> On 02/07/2011 11:11 PM, renayama19661...@ybb.ne.jp wrote:
> > Hi Steven,
> > 
> > I misunderstood your opinion.
> > 
> > We do not have a simple test case.
> > 
> > The phenomenon observed in our environment is as follows.
> > 
> > Step 1) corosync forms a cluster of 12 nodes.
> >  * Token communication begins.
> > 
> > Step 2) One node raises [FAILED TO RECEIVE].
> > 
> > Step 3) All 12 nodes begin reconfiguring the cluster.
> > 
> > Step 4) The node that raised [FAILED TO RECEIVE] fails to reach consensus
> > during the JOIN communication.
> >  * Because the node failed to reach consensus, the contents of its
> > failed_list and proc_list become the same.
> >  * When this node then compares failed_list with proc_list, the assert
> > fails (see the toy sketch below).
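> > 
> > To illustrate (a standalone toy sketch, not corosync code): when the
> > two lists are equal, the set subtraction leaves zero entries, which is
> > exactly what trips assert (token_memb_entries >= 1) in
> > memb_consensus_agreed():
> > 
> > #include <assert.h>
> > 
> > /* toy model: count the members of set a that are not in set b */
> > static int set_subtract_count (const int *a, int a_len,
> >                                const int *b, int b_len)
> > {
> >     int i, j, count = 0;
> > 
> >     for (i = 0; i < a_len; i++) {
> >         int found = 0;
> >         for (j = 0; j < b_len; j++) {
> >             if (a[i] == b[j]) {
> >                 found = 1;
> >                 break;
> >             }
> >         }
> >         if (found == 0) {
> >             count++;
> >         }
> >     }
> >     return (count);
> > }
> > 
> > int main (void)
> > {
> >     int proc_list[]   = { 1, 2, 3 };
> >     int failed_list[] = { 1, 2, 3 }; /* same contents after FAILED TO RECEIVE */
> >     int entries = set_subtract_count (proc_list, 3, failed_list, 3);
> > 
> >     assert (entries >= 1); /* fires: entries == 0 */
> >     return (0);
> > }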
> > 
> > 
> > When the node ends up standing alone and forms a single-node cluster, I
> > think that assert() is unnecessary.
> > 
> > The reason is that the following processing already handles this case.
> > 
> > 
> 
> Try the patch I have sent to this ml.  If the issue persists,
> we can look at more options.
> 
> Thanks!
> -steve
> 
> 
> > 
> > static void memb_join_process (
> >     struct totemsrp_instance *instance,
> >     const struct memb_join *memb_join)
> > {
> >     struct srp_addr *proc_list;
> >     struct srp_addr *failed_list;
> > (snip)
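> >                             /*
> >                              * consensus reached after FAILED TO RECEIVE:
> >                              * clear the flag, shrink proc_list to this
> >                              * node only, and form a single-node ring
> >                              */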
> >                             instance->failed_to_recv = 0;
> >                             srp_addr_copy (&instance->my_proc_list[0],
> >                                     &instance->my_id);
> >                             instance->my_proc_list_entries = 1;
> >                             instance->my_failed_list_entries = 0;
> > 
> >                             memb_state_commit_token_create (instance);
> > 
> >                             memb_state_commit_enter (instance);
> >                             return;
> > 
> > (snip)
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > 
> > 
> > --- renayama19661...@ybb.ne.jp wrote:
> > 
> >> Hi Steven,
> >>
> >>> Hideo,
> >>>
> >>> If you have a test case, I can make a patch for you to try.
> >>>
> >>
> >> All right.
> >>
> >> We use corosync 1.3.0.
> >>
> >> Please send me the patch.
> >>
> >> Best Regards,
> >> Hideo Yamauchi.
> >>
> >> --- Steven Dake <sd...@redhat.com> wrote:
> >>
> >>> On 02/06/2011 09:16 PM, renayama19661...@ybb.ne.jp wrote:
> >>>> Hi Steven,
> >>>> Hi Dejan,
> >>>>
> >>>>>>>> This code never got a chance to run because on failed_to_recv
> >>>>>>>> the two sets (my_process_list and my_failed_list) are equal which
> >>>>>>>> makes the assert fail in memb_consensus_agreed():
> >>>>
> >>>> We are experiencing the same problem, and it troubles us, too.
> >>>>
> >>>> How did this discussion turn out?
> >>>>
> >>>> Best Regards,
> >>>> Hideo Yamauchi.
> >>>>
> >>>
> >>> Hideo,
> >>>
> >>> If you have a test case, I can make a patch for you to try.
> >>>
> >>> Regards
> >>> -steve
> >>>
> >>>>
> >>>> --- Dejan Muhamedagic <de...@suse.de> wrote:
> >>>>
> >>>>> nudge, nudge
> >>>>>
> >>>>> On Wed, Jan 05, 2011 at 02:05:55PM +0100, Dejan Muhamedagic wrote:
> >>>>>> On Tue, Jan 04, 2011 at 01:53:00PM -0700, Steven Dake wrote:
> >>>>>>> On 12/23/2010 06:14 AM, Dejan Muhamedagic wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> On Wed, Dec 01, 2010 at 05:30:44PM +0200, Vladislav Bogdanov wrote:
> >>>>>>>>> 01.12.2010 16:32, Dejan Muhamedagic wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Nov 23, 2010 at 12:53:42PM +0200, Vladislav Bogdanov wrote:
> >>>>>>>>>>> Hi Steven, hi all.
> >>>>>>>>>>>
> >>>>>>>>>>> I often see this assert on one of the nodes after I stop corosync
> >>>>>>>>>>> on another node in a newly set-up 4-node cluster.
> >>>>>>>>>>
> >>>>>>>>>> Does the assert happen on a node lost event? Or once new
> >>>>>>>>>> partition is formed?
> >>>>>>>>>
> >>>>>>>>> I first noticed it when I rebooted another node, just after the
> >>>>>>>>> console said that OpenAIS was stopped.
> >>>>>>>>>
> >>>>>>>>> Can't say right now exactly what event it followed; I'm actually
> >>>>>>>>> fighting several problems with corosync, pacemaker, NFS4 and
> >>>>>>>>> phantom uncorrectable ECC errors simultaneously, and I'm a bit
> >>>>>>>>> lost with all of them.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> #0  0x00007f51953e49a5 in raise () from /lib64/libc.so.6
> >>>>>>>>>>> #1  0x00007f51953e6185 in abort () from /lib64/libc.so.6
> >>>>>>>>>>> #2  0x00007f51953dd935 in __assert_fail () from /lib64/libc.so.6
> >>>>>>>>>>> #3  0x00007f5196176406 in memb_consensus_agreed
> >>>>>>>>>>> (instance=0x7f5196554010) at totemsrp.c:1194
> >>>>>>>>>>> #4  0x00007f519617b2f3 in memb_join_process 
> >>>>>>>>>>> (instance=0x7f5196554010,
> >>>>>>>>>>> memb_join=0x262f628) at totemsrp.c:3918
> >>>>>>>>>>> #5  0x00007f519617b619 in message_handler_memb_join
> >>>>>>>>>>> (instance=0x7f5196554010, msg=<value optimized out>, 
> >>>>>>>>>>> msg_len=<value
> >>>>>>>>>>> optimized out>, endian_conversion_needed=<value optimized out>)
> >>>>>>>>>>>     at totemsrp.c:4161
> >>>>>>>>>>> #6  0x00007f5196173ba7 in passive_mcast_recv 
> >>>>>>>>>>> (rrp_instance=0x2603030,
> >>>>>>>>>>> iface_no=0, context=<value optimized out>, msg=<value optimized 
> >>>>>>>>>>> out>,
> >>>>>>>>>>> msg_len=<value optimized out>) at totemrrp.c:720
> >>>>>>>>>>> #7  0x00007f5196172b44 in rrp_deliver_fn (context=<value 
> >>>>>>>>>>> optimized out>,
> >>>>>>>>>>> msg=0x262f628, msg_len=420) at totemrrp.c:1404
> >>>>>>>>>>> #8  0x00007f5196171a76 in net_deliver_fn (handle=<value optimized 
> >>>>>>>>>>> out>,
> >>>>>>>>>>> fd=<value optimized out>, revents=<value optimized out>, 
> >>>>>>>>>>> data=0x262ef80)
> >>>>>>>>>>> at totemudp.c:1244
> >>>>>>>>>>> #9  0x00007f519616d7f2 in poll_run (handle=4858364909567606784) at
> >>>>>>>>>>> coropoll.c:510
> >>>>>>>>>>> #10 0x0000000000406add in main (argc=<value optimized out>, 
> >>>>>>>>>>> argv=<value
> >>>>>>>>>>> optimized out>, envp=<value optimized out>) at main.c:1680
> >>>>>>>>>>>
> >>>>>>>>>>> Last fplay lines are:
> >>>>>>>>>>>
> >>>>>>>>>>> rec=[36124] Log Message=Delivering MCAST message with seq 1366 to
> >>>>>>>>>>> pending delivery queue
> >>>>>>>>>>> rec=[36125] Log Message=Delivering MCAST message with seq 1367 to
> >>>>>>>>>>> pending delivery queue
> >>>>>>>>>>> rec=[36126] Log Message=Received ringid(10.5.4.52:12660) seq 1366
> >>>>>>>>>>> rec=[36127] Log Message=Received ringid(10.5.4.52:12660) seq 1367
> >>>>>>>>>>> rec=[36128] Log Message=Received ringid(10.5.4.52:12660) seq 1366
> >>>>>>>>>>> rec=[36129] Log Message=Received ringid(10.5.4.52:12660) seq 1367
> >>>>>>>>>>> rec=[36130] Log Message=releasing messages up to and including 
> >>>>>>>>>>> 1367
> >>>>>>>>>>> rec=[36131] Log Message=FAILED TO RECEIVE
> >>>>>>>>>>> rec=[36132] Log Message=entering GATHER state from 6.
> >>>>>>>>>>> rec=[36133] Log Message=entering GATHER state from 0.
> >>>>>>>>>>> Finishing replay: records found [33993]
> >>>>>>>>>>>
> >>>>>>>>>>> What could be the reason for this? Bug, switches, memory errors?
> >>>>>>>>>>
> >>>>>>>>>> The assertion fails because corosync finds out that
> >>>>>>>>>> instance->my_proc_list and instance->my_failed_list are
> >>>>>>>>>> equal. That happens immediately after the "FAILED TO RECEIVE"
> >>>>>>>>>> message, which is issued after fail_recv_const token rotations
> >>>>>>>>>> have happened without any multicast packet being received
> >>>>>>>>>> (defaults to 50).
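> >>>>>>>>>>
> >>>>>>>>>> For reference, fail_recv_const is tunable in the totem section
> >>>>>>>>>> of corosync.conf; an illustrative snippet (the value shown is
> >>>>>>>>>> just the default mentioned above):
> >>>>>>>>>>
> >>>>>>>>>> totem {
> >>>>>>>>>>         ...
> >>>>>>>>>>         fail_recv_const: 50
> >>>>>>>>>> }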
> >>>>>>>>
> >>>>>>>> I took a look at the code and the protocol specification again
> >>>>>>>> and it seems that this assert is no longer valid since Steve patched
> >>>>>>>> the part dealing with the "FAILED TO RECEIVE" condition. The
> >>>>>>>> patch is from 2010-06-03 posted to the list here
> >>>>>>>> http://marc.info/?l=openais&m=127559807608484&w=2
> >>>>>>>>
> >>>>>>>> The last hunk of the patch contains this code (exec/totemsrp.c):
> >>>>>>>>
> >>>>>>>> 3933         if (memb_consensus_agreed (instance) &&
> >>>>>>>>                  instance->failed_to_recv == 1) {
> >>>>>>>> 3934                 instance->failed_to_recv = 0;
> >>>>>>>> 3935                 srp_addr_copy (&instance->my_proc_list[0],
> >>>>>>>> 3936                     &instance->my_id);
> >>>>>>>> 3937                 instance->my_proc_list_entries = 1;
> >>>>>>>> 3938                 instance->my_failed_list_entries = 0;
> >>>>>>>> 3939            
> >>>>>>>> 3940                 memb_state_commit_token_create (instance);
> >>>>>>>> 3941            
> >>>>>>>> 3942                 memb_state_commit_enter (instance);
> >>>>>>>> 3943                 return;
> >>>>>>>> 3944         }
> >>>>>>>>
> >>>>>>>> This code never got a chance to run because on failed_to_recv
> >>>>>>>> the two sets (my_process_list and my_failed_list) are equal which
> >>>>>>>> makes the assert fail in memb_consensus_agreed():
> >>>>>>>>
> >>>>>>>> 1185     memb_set_subtract (token_memb, &token_memb_entries,
> >>>>>>>> 1186         instance->my_proc_list, instance->my_proc_list_entries,
> >>>>>>>> 1187         instance->my_failed_list, 
> >>>>>>>> instance->my_failed_list_entries);
> >>>>>>>> ...
> >>>>>>>> 1195     assert (token_memb_entries >= 1);
> >>>>>>>>
> >>>>>>>> In other words, it's something like this:
> >>>>>>>>
> >>>>>>>>      if A:
> >>>>>>>>              if memb_consensus_agreed() and failed_to_recv:
> >>>>>>>>                      form a single node ring and try to recover
> >>>>>>>>
> >>>>>>>>      memb_consensus_agreed():
> >>>>>>>>              assert(!A)
> >>>>>>>>
> >>>>>>>> Steve, can you take a look and confirm whether this holds?
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>>
> >>>>>>>
> >>>>>>> Dejan,
> >>>>>>>
> >>>>>>> sorry for the delay in response - big backlog, which is mostly
> >>>>>>> cleared out now :)
> >>>>>>
> >>>>>> No problem.
> >>>>>>
> >>>>>>> The assert definitely isn't correct, but removing it without
> >>>>>>> addressing the contents of the proc and fail lists is also not
> >>>>>>> right.  That would cause the logic in the if statement at line 3933
> >>>>>>> not to be executed (because the first part of the if would evaluate
> >>>>>>> to false).
> >>>>>>
> >>>>>> Actually it wouldn't. The agreed variable is set to 1 and it
> >>>>>> is going to be returned unchanged.
> >>>>>>
> >>>>>>> I believe
> >>>>>>> what we should do is check the "failed_to_recv" value in
> >>>>>>> memb_consensus_agreed instead of at line 3933.
> >>>>>>>
> >>>>>>> The issue with this is memb_state_consensus_timeout_expired which also
> >>>>>>> executes some 'then' logic where we may not want to execute the
> >>>>>>> failed_to_recv logic.
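> >>>>>>>
> >>>>>>> An untested sketch of that idea (the early-return placement is my
> >>>>>>> assumption, and it does not yet deal with the
> >>>>>>> memb_state_consensus_timeout_expired complication):
> >>>>>>>
> >>>>>>> static int memb_consensus_agreed (
> >>>>>>>     struct totemsrp_instance *instance)
> >>>>>>> {
> >>>>>>> (snip)
> >>>>>>>     if (instance->failed_to_recv == 1) {
> >>>>>>>         /*
> >>>>>>>          * proc and fail lists are equal here, so the subtraction
> >>>>>>>          * below would leave an empty set; report agreement and
> >>>>>>>          * let the single-node recovery at the join call site run
> >>>>>>>          */
> >>>>>>>         return (1);
> >>>>>>>     }
> >>>>>>>     memb_set_subtract (token_memb, &token_memb_entries,
> >>>>>>>         instance->my_proc_list, instance->my_proc_list_entries,
> >>>>>>>         instance->my_failed_list, instance->my_failed_list_entries);
> >>>>>>> (snip)
> >>>>>>>     assert (token_memb_entries >= 1);
> >>>>>>> (snip)
> >>>>>>> }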
> >>>>>>
> >>>>>> Perhaps we should just
> >>>>>>
> >>>>>> 3933         if (instance->failed_to_recv == 1) {
> >>>>>>
> >>>>>> ? In the failed_to_recv case both proc and fail lists are equal, so
> >>>>>> checking memb_consensus_agreed won't make sense, right?
> >>>>>>
> >>>>>>> If anyone has a reliable reproducer and can forward it to me, I'll
> >>>>>>> test out a change to address this problem.  Really hesitant to
> >>>>>>> change anything in totemsrp without a test case for this problem -
> >>>>>>> it's almost perfect ;-)
> >>>>>>
> >>>>>> Since the tester upgraded the switch firmware they couldn't
> >>>>>> reproduce it anymore.
> >>>>>>
> >>>>>> Would compiling with these help?
> >>>>>>
> >>>>>> /*
> >>>>>>  * These can be used to test the error recovery algorithms
> >>>>>>  * #define TEST_DROP_ORF_TOKEN_PERCENTAGE 30
> >>>>>>  * #define TEST_DROP_COMMIT_TOKEN_PERCENTAGE 30
> >>>>>>  * #define TEST_DROP_MCAST_PERCENTAGE 50
> >>>>>>  * #define TEST_RECOVERY_MSG_COUNT 300
> >>>>>>  */
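> >>>>>>
> >>>>>> If I read the code right, the drop gates those defines enable look
> >>>>>> roughly like this (an illustrative sketch, not the actual source):
> >>>>>>
> >>>>>> #ifdef TEST_DROP_MCAST_PERCENTAGE
> >>>>>>     /*
> >>>>>>      * randomly discard a percentage of incoming multicasts to
> >>>>>>      * exercise the FAILED TO RECEIVE recovery path
> >>>>>>      */
> >>>>>>     if ((random () % 100) < TEST_DROP_MCAST_PERCENTAGE) {
> >>>>>>         return;
> >>>>>>     }
> >>>>>> #endif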
> >>>>>>
> >>>>>> Cheers,
> >>>>>>
> >>>>>> Dejan
> >>>>>>
> >>>>>>> Regards
> >>>>>>> -steve
> >>>>>>>
> >>>>>>>> Dejan
> 
> 

_______________________________________________
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/openais
