On 02/06/2011 09:16 PM, [email protected] wrote:
> Hi Steven,
> Hi Dejan,
> 
>>>>> This code never got a chance to run because on failed_to_recv
>>>>> the two sets (my_process_list and my_failed_list) are equal which
>>>>> makes the assert fail in memb_consensus_agreed():
> 
> The same problem occurs, and we are troubled, too. 
> 
> How did this argument turn out?
> 
> Best Regards,
> Hideo Yamauchi.
> 

Hideo,

If you have a test case, I can make a patch for you to try.

Regards
-steve

> 
> --- Dejan Muhamedagic <[email protected]> wrote:
> 
>> nudge, nudge
>>
>> On Wed, Jan 05, 2011 at 02:05:55PM +0100, Dejan Muhamedagic wrote:
>>> On Tue, Jan 04, 2011 at 01:53:00PM -0700, Steven Dake wrote:
>>>> On 12/23/2010 06:14 AM, Dejan Muhamedagic wrote:
>>>>> Hi,
>>>>>
>>>>> On Wed, Dec 01, 2010 at 05:30:44PM +0200, Vladislav Bogdanov wrote:
>>>>>> 01.12.2010 16:32, Dejan Muhamedagic wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On Tue, Nov 23, 2010 at 12:53:42PM +0200, Vladislav Bogdanov wrote:
>>>>>>>> Hi Steven, hi all.
>>>>>>>>
>>>>>>>> I often see this assert on one of nodes after I stop corosync on some
>>>>>>>> another node in newly-setup 4-node cluster.
>>>>>>>
>>>>>>> Does the assert happen on a node lost event? Or once new
>>>>>>> partition is formed?
>>>>>>
>>>>>> I first noticed it when I rebooted another node, just after console said
>>>>>> that OpenAIS is stopped.
>>>>>>
>>>>>> Can't say right now, what exactly event did it follow, I'm actually
>>>>>> fighting with several problems with corosync, pacemaker, NFS4 and
>>>>>> phantom uncorrectable ECC errors simultaneously and I'm a bit lost with
>>>>>> all of them.
>>>>>>
>>>>>>>
>>>>>>>> #0  0x00007f51953e49a5 in raise () from /lib64/libc.so.6
>>>>>>>> #1  0x00007f51953e6185 in abort () from /lib64/libc.so.6
>>>>>>>> #2  0x00007f51953dd935 in __assert_fail () from /lib64/libc.so.6
>>>>>>>> #3  0x00007f5196176406 in memb_consensus_agreed
>>>>>>>> (instance=0x7f5196554010) at totemsrp.c:1194
>>>>>>>> #4  0x00007f519617b2f3 in memb_join_process (instance=0x7f5196554010,
>>>>>>>> memb_join=0x262f628) at totemsrp.c:3918
>>>>>>>> #5  0x00007f519617b619 in message_handler_memb_join
>>>>>>>> (instance=0x7f5196554010, msg=<value optimized out>, msg_len=<value
>>>>>>>> optimized out>, endian_conversion_needed=<value optimized out>)
>>>>>>>>     at totemsrp.c:4161
>>>>>>>> #6  0x00007f5196173ba7 in passive_mcast_recv (rrp_instance=0x2603030,
>>>>>>>> iface_no=0, context=<value optimized out>, msg=<value optimized out>,
>>>>>>>> msg_len=<value optimized out>) at totemrrp.c:720
>>>>>>>> #7  0x00007f5196172b44 in rrp_deliver_fn (context=<value optimized 
>>>>>>>> out>,
>>>>>>>> msg=0x262f628, msg_len=420) at totemrrp.c:1404
>>>>>>>> #8  0x00007f5196171a76 in net_deliver_fn (handle=<value optimized out>,
>>>>>>>> fd=<value optimized out>, revents=<value optimized out>, 
>>>>>>>> data=0x262ef80)
>>>>>>>> at totemudp.c:1244
>>>>>>>> #9  0x00007f519616d7f2 in poll_run (handle=4858364909567606784) at
>>>>>>>> coropoll.c:510
>>>>>>>> #10 0x0000000000406add in main (argc=<value optimized out>, argv=<value
>>>>>>>> optimized out>, envp=<value optimized out>) at main.c:1680
>>>>>>>>
>>>>>>>> Last fplay lines are:
>>>>>>>>
>>>>>>>> rec=[36124] Log Message=Delivering MCAST message with seq 1366 to
>>>>>>>> pending delivery queue
>>>>>>>> rec=[36125] Log Message=Delivering MCAST message with seq 1367 to
>>>>>>>> pending delivery queue
>>>>>>>> rec=[36126] Log Message=Received ringid(10.5.4.52:12660) seq 1366
>>>>>>>> rec=[36127] Log Message=Received ringid(10.5.4.52:12660) seq 1367
>>>>>>>> rec=[36128] Log Message=Received ringid(10.5.4.52:12660) seq 1366
>>>>>>>> rec=[36129] Log Message=Received ringid(10.5.4.52:12660) seq 1367
>>>>>>>> rec=[36130] Log Message=releasing messages up to and including 1367
>>>>>>>> rec=[36131] Log Message=FAILED TO RECEIVE
>>>>>>>> rec=[36132] Log Message=entering GATHER state from 6.
>>>>>>>> rec=[36133] Log Message=entering GATHER state from 0.
>>>>>>>> Finishing replay: records found [33993]
>>>>>>>>
>>>>>>>> What could be the reason for this? Bug, switches, memory errors?
>>>>>>>
>>>>>>> The assertion fails because corosync finds out that
>>>>>>> instance->my_proc_list and instance->my_failed_list are
>>>>>>> equal. That happens immediately after the "FAILED TO RECEIVE"
>>>>>>> message which is issued when fail_recv_const token rotations
>>>>>>> happened without any multicast packet received (defaults to 50).
>>>>>
>>>>> I took a look at the code and the protocol specification again
>>>>> and it seems like that assert is not valid since Steve patched
>>>>> the part dealing with the "FAILED TO RECEIVE" condition. The
>>>>> patch is from 2010-06-03 posted to the list here
>>>>> http://marc.info/?l=openais&m=127559807608484&w=2
>>>>>
>>>>> The last hunk of the patch contains this code (exec/totemsrp.c):
>>>>>
>>>>> 3933         if (memb_consensus_agreed (instance) && 
>>>>> instance->failed_to_recv == 1) {     
>>
>>>>> 3934                 instance->failed_to_recv = 0;
>>>>> 3935                 srp_addr_copy (&instance->my_proc_list[0],
>>>>> 3936                     &instance->my_id);
>>>>> 3937                 instance->my_proc_list_entries = 1;
>>>>> 3938                 instance->my_failed_list_entries = 0;
>>>>> 3939            
>>>>> 3940                 memb_state_commit_token_create (instance);
>>>>> 3941            
>>>>> 3942                 memb_state_commit_enter (instance);
>>>>> 3943                 return;
>>>>> 3944         }
>>>>>
>>>>> This code never got a chance to run because on failed_to_recv
>>>>> the two sets (my_process_list and my_failed_list) are equal which
>>>>> makes the assert fail in memb_consensus_agreed():
>>>>>
>>>>> 1185     memb_set_subtract (token_memb, &token_memb_entries,
>>>>> 1186         instance->my_proc_list, instance->my_proc_list_entries,
>>>>> 1187         instance->my_failed_list, instance->my_failed_list_entries);
>>>>> ...
>>>>> 1195     assert (token_memb_entries >= 1);
>>>>>
>>>>> In other words, it's something like this:
>>>>>
>>>>>   if A:
>>>>>           if memb_consensus_agreed() and failed_to_recv:
>>>>>                   form a single node ring and try to recover
>>>>>
>>>>>   memb_consensus_agreed():
>>>>>           assert(!A)
>>>>>
>>>>> Steve, can you take a look and confirm that this holds.
>>>>>
>>>>> Cheers,
>>>>>
>>>>
>>>> Dejan,
>>>>
>>>> sorry for delay in response - big backlog which is mostly cleared out :)
>>>
>>> No problem.
>>>
>>>> The assert definitely isn't correct, but removing it without addressing
>>>> the contents of the proc and fail lists is also not right.  That would
>>>> cause the logic in the if statement at line 3933 not to be executed
>>>> (because the first part of the if would evaluate to false)
>>>
>>> Actually it wouldn't. The agreed variable is set to 1 and it
>>> is going to be returned unchanged.
>>>
>>>> I believe
>>>> what we should do is check the "failed_to_recv" value in
>>>> memb_consensus_agreed instead of at line 3933.
>>>>
>>>> The issue with this is memb_state_consensus_timeout_expired which also
>>>> executes some 'then' logic where we may not want to execute the
>>>> failed_to_recv logic.
>>>
>>> Perhaps we should just
>>>
>>> 3933         if (instance->failed_to_recv == 1) {
>>>
>>> ? In case failed_to_recv both proc and fail lists are equal so
>>> checking for memb_consensus_agreed won't make sense, right?
>>>
>>>> If anyone has a reliable reproducer and can forward to me, I'll test out
>>>> a change to address this problem.  Really hesitant to change anything in
>>>> totemsrp without a test case for this problem - its almost perfect ;-)
>>>
>>> Since the tester upgraded the switch firmware they couldn't
>>> reproduce it anymore.
>>>
>>> Would compiling with these help?
>>>
>>> /*
>>>  * These can be used to test the error recovery algorithms
>>>  * #define TEST_DROP_ORF_TOKEN_PERCENTAGE 30
>>>  * #define TEST_DROP_COMMIT_TOKEN_PERCENTAGE 30
>>>  * #define TEST_DROP_MCAST_PERCENTAGE 50
>>>  * #define TEST_RECOVERY_MSG_COUNT 300
>>>  */
>>>
>>> Cheers,
>>>
>>> Dejan
>>>
>>>> Regards
>>>> -steve
>>>>
>>>>> Dejan
>>>>> _______________________________________________
>>>>> Openais mailing list
>>>>> [email protected]
>>>>> https://lists.linux-foundation.org/mailman/listinfo/openais
>>>>
>>>> _______________________________________________
>>>> Openais mailing list
>>>> [email protected]
>>>> https://lists.linux-foundation.org/mailman/listinfo/openais
>>> _______________________________________________
>>> Openais mailing list
>>> [email protected]
>>> https://lists.linux-foundation.org/mailman/listinfo/openais
>> _______________________________________________
>> Openais mailing list
>> [email protected]
>> https://lists.linux-foundation.org/mailman/listinfo/openais
>>
> 
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais

Reply via email to