On 06/01/2010 10:26 AM, David Dillow wrote:
> On Tue, 2010-06-01 at 09:04 -0700, Steven Dake wrote:
>> The most I have tested physically is 48 nodes.  I can't offer any advice
>> on what to tune beyond that, other than to increase join and consensus
>> to much larger values than they are currently set.
>>
>> Some good options might be:
>> join: 150
>> token: 5000
>> consensus: 20000
>>
>> Note I am hesitant to think that corosync will work well in its current
>> form at a 70-node count.
>
> Ok, I'll give those a shot.
>
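For reference, those knobs live in the totem { } section of
/etc/corosync/corosync.conf and the timeouts are in milliseconds. A
minimal sketch using the values suggested above -- the interface settings
are placeholders for whatever your network actually uses, not
recommendations:

    totem {
            version: 2
            join: 150          # wait this long for join messages during
                               # membership formation
            token: 5000        # token loss timeout
            consensus: 20000   # wait this long for consensus before
                               # starting a new membership round

            interface {
                    ringnumber: 0
                    bindnetaddr: 192.168.1.0   # placeholder subnet
                    mcastaddr: 226.94.1.1      # placeholder mcast group
                    mcastport: 5405
            }
    }
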
>>> As for the segfault, it is the result of totempg_deliver_fn() being
>>> handed an encapsulated packet and then misinterpreting it. This was
>>> handed down from messages_deliver_to_app(), and based on the flow around
>>> deliver_messages_from_recovery_to_regular() I expect that it should not
>>> see encapsulated messages. Looking through the core dump, the
>>> multi-encapsulated message is from somewhat ancient ring instances: the
>>> current ringid seq is 38260, and the outer encapsulation is for seq
>>> 38204 with an inner encapsulation of seq 38124. It seems this node was
>>> last operational in ring 38204, and had entered the recovery state a
>>> number of times without landing in operational again prior to the crash.
>>
>> It is normal for these ring messages to be recovered, but perhaps there
>> is some error in how the recovery is functioning.
>
> Certainly, it doesn't look like there should ever be encapsulated
> messages on the regular ring, only the recovery ring. Somehow, we're
> getting messages on the regular ring with at least one, if not two
> levels of encapsulation.
>

There should never be an encapsulated message on the regular ring.  The 
ring id problem I spoke about later in this mail would explain why that 
encapsulated message would show up on the regular ring.
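
To make that invariant concrete: messages sent while the ring is in
recovery encapsulate regular-ring messages, and the encapsulation is
supposed to be stripped before anything reaches the regular delivery
path. A toy sketch of the expected check, with made-up names rather than
the actual totempg/totemsrp code:

    /* Hypothetical sketch of the invariant, not corosync source. */
    #include <assert.h>

    enum { MSG_REGULAR = 0, MSG_ENCAPSULATED = 1 };

    struct msg_header {
            int encapsulated;       /* set only for recovery-ring traffic */
            unsigned int ring_seq;  /* ring id seq the message belongs to */
    };

    static void deliver_to_app(const struct msg_header *hdr,
                               unsigned int current_ring_seq)
    {
            /* The crash above suggests this is what got violated: an
             * encapsulated message from ring seq 38204 (wrapping one
             * from 38124) reached the regular path at ring seq 38260. */
            assert(hdr->encapsulated == MSG_REGULAR);
            assert(hdr->ring_seq == current_ring_seq);
            /* ... hand the payload to the application ... */
    }

    int main(void)
    {
            struct msg_header ok = { MSG_REGULAR, 38260 };
            deliver_to_app(&ok, 38260);    /* fine */

            struct msg_header stale = { MSG_ENCAPSULATED, 38204 };
            deliver_to_app(&stale, 38260); /* trips the first assert */
            return 0;
    }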

> Also, should we be recovering messages from ringids that are not our
> immediate ancestor?
>
>> 1.2.2 contains several fixes related to lossy messaging and one segfault
>> in particular that occurs when recovery is interrupted by new memberships.
>
> Ok, I'll recheck the changes between the two.
>
>>> While working with pacemaker prior to focusing on corosync, I noticed on
>>> several occasions that corosync would get into a situation where all
>>> nodes of the cluster were considered members of the ring, but some nodes
>>> were working with sequence numbers that were several hundred behind
>>> everyone else and did not catch up. I have not seen this in a
>>> corosync-only test, but I suspect it may be related to the segfault
>>> above -- it only seemed to occur after a pass through the recovery state.
>>>
>>
>> I would expect that is normal on a lossy network (i.e., if there are
>> retransmits in your network).  With 48 nodes all sending messages, it is
>> possible for one node under high load to have lower seqids because it
>> hasn't yet received or processed those messages.
>
> The network rarely has retransmits -- and this was a stable
> configuration. I'm working a bit from memory here, as I concentrated on
> the segfault issue. I'll keep an eye out for this occurrence and see if
> I can collect more data.
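
For what it's worth, the benign case described above is just a gap
between a member's last processed sequence number and the ring's highest
one; the suspicious case is that gap never closing on a stable,
otherwise idle ring. A toy illustration with made-up numbers, not
corosync internals:

    #include <stdio.h>

    int main(void)
    {
            unsigned int ring_high_seq = 38260; /* highest seq on the ring */
            unsigned int node_seq      = 37900; /* this node's last
                                                   processed seq */

            /* Under load, trailing by a few hundred is expected... */
            printf("node trails the ring by %u messages\n",
                   ring_high_seq - node_seq);

            /* ...but on a stable ring the gap should shrink to zero; a
             * node stuck several hundred behind indefinitely points at a
             * delivery/recovery problem rather than load. */
            return 0;
    }
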
>
>>> Any suggestions on how to proceed to put this bug to bed?
>>>
>>
>> I would try 1.2.4 (pending).  1.2.2 and 1.2.3 exhibit problems with
>> logging on some platforms.
>
> Alrighty, will give that a go.
>
> Thanks!
> Dave
>
