Hi,

So... what about that TODO comment in tipc_link.c regarding the stronger seq# checking? :) Since I managed to stabilize my cluster I have to proceed with a software upgrade (deadlines :( ...), but I will be able to start looking into the link code sometime tomorrow evening. In the meantime, any ideas as to where/what to look at would be highly appreciated ;)
Regards,
Peter.

Jon Paul Maloy wrote:
> Hi,
> Your analysis makes sense, but it still doesn't explain why TIPC
> cannot handle this quite commonplace situation.
> Yesterday I forgot one essential detail: even State messages contain
> info to help the receiver detect a gap. The "next_sent" sequence
> number tells the receiver if it is out of sync with the sender, and
> gives it a chance to send a NACK (a State message with gap != 0).
> Since State packets clearly are received, otherwise the link would go
> down, there must be some bug in TIPC that causes the gap to be
> calculated wrongly, or not at all. Neither does it look like the
> receiver is sending a State message _immediately_ after a gap has
> occurred, which it should.
> So, I think we are looking for some serious bug within TIPC that
> completely cripples the retransmission protocol. We should try to
> backtrack and find out in which version it was introduced.
>
> ///jon
>
>
> --- Xpl++ <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> Some more info about my systems:
>> - all nodes that tend to drop packets are quite loaded, though very
>>   rarely one can see cpu #0 being 100% busy
>> - there are also a few multithreaded tasks that are bound to cpu #0
>>   and running in SCHED_RR. All of them use TIPC. None of them uses
>>   the maximum scheduler priority, and they use very little cpu time
>>   and do not tend to make any peaks
>> - there is one task that runs in SCHED_RR at maximum priority 99/RT
>>   (it really does a very, very important job), which uses around 1ms
>>   of cpu every 4 seconds, and it is explicitly bound to cpu #0
>> - all other tasks (mostly apache & php/perl) are free to run on any
>>   cpu
>> - all of these nodes also have considerable io load
>> - the kernel has irq balancing and pretty much all irqs are
>>   balanced, except for the nic irqs.
>>   They are always serviced by cpu #0
>> - to create the packet drop issue I have to mildly stress the node,
>>   which would normally mean a moment when apache tries to start some
>>   extra children; that also causes the number of simultaneously
>>   running php scripts to rise, while at the same time the incoming
>>   network traffic is also rising. The stress is preceded by a few
>>   seconds of high input packet rate, which may be causing even more
>>   stress on the scheduler and cpu starvation
>> - wireshark is dropping packets (surprisingly many, it seems), tipc
>>   is confused... and all of it is related to moments of general cpu
>>   starvation, with an even worse one on cpu #0
>>
>> Then it all started adding up...
>> I moved all non-SCHED_OTHER tasks to other cpus, as well as a few
>> other services. The result: 30% of the nodes showed between 5 and
>> 200 packets dropped over the whole stress routine, which did not
>> affect TIPC operation; nametables were in sync and all
>> communications seemed to work properly.
>> Though this solves my problems, it is still very unclear what may
>> have been happening in the kernel and in the tipc stack to cause
>> this bizarre behavior.
>> SMP systems alone are tricky, and when load and pseudo-realtime
>> tasks are added the situation seems to become really complicated.
>> One really interesting thing to note is that Opteron-based nodes
>> handle high load and cpu starvation much better than Xeon ones...
>> which only confirms an old observation of mine, that for some reason
>> (the design/architecture?) Opterons appear _much_ more
>> interactive/responsive than Xeons under heavy load.
>> Another note, this one on TIPC: the link window for 100mbit nets
>> should be at least 256 if one wants to do any serious communication
>> between a dozen or more nodes.
>> Also, for a gbit net, link windows above 1024 seem to really
>> confuse the stack when faced with a high output packet rate.
>>
>> Regards,
>> Peter Litov.
>>
>>
>> Martin Peylo wrote:
>>
>>> Hi,
>>>
>>> I'll try to help with the Wireshark side of this problem.
>>>
>>> On 3/4/08, Jon Maloy <[EMAIL PROTECTED]> wrote:
>>>
>>>> Strangely enough, node 1.1.12 continues to ack packets which we
>>>> don't see in wireshark (is it possible that wireshark can miss
>>>> packets?). It goes on acking packets up to the one with sequence
>>>> number 53967 (one of the "invisible" packets), but from there on
>>>> it stops.
>>>
>>> I've never encountered Wireshark missing packets so far. While it
>>> sounds as if it wouldn't be a problem with the TIPC dissector,
>>> could you please send me a trace file so I can definitely exclude
>>> this cause of defect? I've tried to get it from the link quoted in
>>> the mail from Jon, but it seems it was already removed.
>>>
>>>> [...]
>>>>
>>>> As a sum of this, I start to suspect your Ethernet driver. It
>>>> seems like it sometimes delivers packets to TIPC which it does
>>>> not deliver to Wireshark, and vice versa. This seems to happen
>>>> after a period of high traffic, and only with messages beyond a
>>>> certain size, since the State messages always go through.
>>>> Can you see any pattern in the direction the links go stale, with
>>>> reference to which driver you are using? (E.g., is there always
>>>> an e1000 driver involved on the receiving end in the stale
>>>> direction?) Does this happen when you only run one type of
>>>> driver?
>>>
>>> I've not yet gone that deep into packet capture, so I can't say
>>> much about that.
>>> Peter, could you send a mail to one of the Wireshark mailing lists
>>> describing the problem? Have you tried capturing other kinds of
>>> high traffic with less resource-hungry capture frontends?
>>>
>>> Best regards,
>>> Martin

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
tipc-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/tipc-discussion
