Hi
Attached is the script used to invoke dtrace. Note that the only place
a pid predicate is used is at entry to connect(2); from there, all system
and kernel functions are traced.
So the traces should cover both sides of the conversation, which also
explains a few things (note that the application being connected to does
not call accept(2); it only listens, in order to reduce noise).
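Roughly, the script is structured like this (a minimal sketch, not the
attachment itself; the ->/<- arrows in the traces suggest flowindent,
i.e. the -F flag):

    #!/usr/sbin/dtrace -Fs

    /* raise a global flag when the target process enters connect(2);
       $target is the pid given via dtrace -p */
    syscall::connect:entry
    /pid == $target/
    {
        tracing = 1;
    }

    /* while the flag is up, trace every kernel function entry and
       return on every thread, so both sides of the loopback
       conversation appear in the output */
    fbt:::entry,
    fbt:::return
    /tracing/
    {
    }

    /* drop the flag when the target's connect(2) returns */
    syscall::connect:return
    /pid == $target/
    {
        tracing = 0;
    }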
Regards
-------- Original Message --------
Subject: Re: [networking-discuss] connect(2) blocks for more than 250 ms
over loopback
From: Iwan Aucamp <[email protected]>
To: [email protected]
Cc: [email protected]
Date: Tue Sep 22 2009 15:17:01 GMT+0200 (SAST)
Hi
Thanks for the response, Brian. This is also the first thing I noticed,
but from what I understand of Solaris, a system call on a socket in
blocking mode is switched out of active context, and the reason it came
back into context here is that it received a signal which caused the
call to return. The reason I think it actually goes out of context is
that it is waiting for some part of the TCP handshake to complete.
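If the thread really does go off-CPU there, the sched provider can
confirm it and measure the sleep directly; for example (a sketch,
assuming dtrace is run with -p against the connecting process):

    /* record when the target thread is switched off a CPU */
    sched:::off-cpu
    /pid == $target/
    {
        self->off = timestamp;
    }

    /* when it runs again, print how long it was off-CPU */
    sched:::on-cpu
    /self->off/
    {
        printf("off-cpu for %d ms", (timestamp - self->off) / 1000000);
        self->off = 0;
    }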
The timestamps are indeed wallclock timestamps in %Y-%m-%d %H:%M:%S
format, followed by .<wallclock nanoseconds>, and then by :<delta since
the previous line>.
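The delta is just the difference between walltimestamp values of
successive probe firings, computed along these lines (a sketch; the
%Y-%m-%d rendering of the wallclock part is done in post-processing):

    fbt:::entry,
    fbt:::return
    {
        this->now = walltimestamp;           /* ns since the epoch */
        this->delta = this->now - last;      /* ns since previous line */
        printf(":+%d.%03dms",
            this->delta / 1000000,           /* whole milliseconds */
            (this->delta % 1000000) / 1000); /* 3-digit fraction (us) */
        last = this->now;
    }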
Will follow your advice regarding trace on other end and see if we
have some kind of support contract with sun.
Regards
-------- Original Message --------
Subject: Re: [networking-discuss] connect(2) blocks for more than 250
ms over loopback
From: Brian Ruthven - Solaris Network Sustaining - Sun UK
<[email protected]>
To: Iwan Aucamp <[email protected]>
Cc: [email protected]
Date: Tue Sep 22 2009 13:45:14 GMT+0200 (SAST)
I've only skimmed through this (actually, only searched for "250"!),
but this looks like an interesting part:
2009-09-21 15:08:06.673356800:+0.000ms ->(fbt:unix:1:resume:entry):()
2009-09-21 15:08:06.673356800:+0.000ms ->(fbt:genunix:1:savectx:entry):()
2009-09-21 15:08:06.673356800:+0.000ms ->(fbt:genunix:1:schedctl_save:entry):()
2009-09-21 15:08:06.673356800:+0.000ms <-(fbt:genunix:1:schedctl_save:return):(2890546872320)
2009-09-21 15:08:06.673356800:+0.000ms <-(fbt:genunix:1:savectx:return):(3298820544480)
2009-09-21 15:08:06.923373568:+250.017ms ->(fbt:genunix:1:restorectx:entry):()
2009-09-21 15:08:06.923373568:+0.000ms ->(fbt:genunix:1:schedctl_restore:entry):()
2009-09-21 15:08:06.923373568:+0.000ms <-(fbt:genunix:1:schedctl_restore:return):(2890546872320)
2009-09-21 15:08:06.923373568:+0.000ms <-(fbt:genunix:1:restorectx:return):(3298820544480)
2009-09-21 15:08:06.923373568:+0.000ms <-(fbt:unix:1:resume:return):(3298820544480)
You don't say what the timestamp values are, but I'm assuming they
are absolute (wall clock) time followed by a delta since the last fbt
probe.
If so, then 250ms was spent in resume() between the calls to savectx
and restorectx.
As best I can tell, the stack trace at this point looks something
like this:
resume()
cv_wait_sig_swap_core()
sowaitconnected()
sotpi_connect()
connect()
Although this is a network-related symptom, I suspect your problem is
deeper in the scheduler code.
Having said that, there is a comment above resume() (see
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/sun4/ml/swtch.s#45
for the OpenSolaris equivalent) which hints that resume() can spin
under the right conditions. You may need to enhance your DTrace script
to include CPU information (to confirm whether this thread stayed on
the same CPU), and you may also need to trace what the other app is
doing during the connect() call from the client.
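For example, printing the built-in cpu variable on each probe firing
would show any migrations (an illustrative sketch):

    fbt:::entry,
    fbt:::return
    {
        /* cpu is a built-in int: the id of the CPU the probe fired on */
        printf("cpu=%d", cpu);
    }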
You should continue this with your Solution Centre (through your
Solaris 10 support contract).
Regards,
Brian
Iwan Aucamp wrote:
Hi
On a T2000 node running Solaris 10 u5, connect(2) over loopback to an
application which is completely idle blocks for longer than 250 ms.
The system load average when this occurs is also below 2, and
another system (same OS, hardware and software) with much higher CPU
load also shows this problem.
In an attempt to figure out what is causing this, I ran dtrace with
fbt:: and syscall:connect:, extracting all fbt:: probes when a connect
fails (the application calling connect times out after 250 ms using an
alarm, so connect fails with EINTR).
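One way to capture fbt data only for the failing case is speculative
tracing, e.g. (a sketch, not necessarily how the attached script does
it; EINTR is written as its numeric value because the symbolic name is
not predefined in D):

    /* start a speculation when the target enters connect(2);
       $target is the pid given via dtrace -p */
    syscall::connect:entry
    /pid == $target/
    {
        self->spec = speculation();
    }

    /* record fbt activity into the speculative buffer */
    fbt:::entry,
    fbt:::return
    /self->spec/
    {
        speculate(self->spec);
        printf("%s:%s", probefunc, probename);
    }

    /* connect failed with EINTR: keep the data */
    syscall::connect:return
    /self->spec && errno == 4/    /* 4 == EINTR on Solaris */
    {
        commit(self->spec);
        self->spec = 0;
    }

    /* connect succeeded (or failed differently): throw it away */
    syscall::connect:return
    /self->spec/
    {
        discard(self->spec);
        self->spec = 0;
    }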
The results of this are attached:

    connect.ok.verbose      : successful connect, verbose, for reference
    connect.ok              : successful connect, for reference (can be
                              used with diff against connect.timeout)
    connect.timeout.verbose : timed-out connect, verbose
    connect.timeout         : timed-out connect (can be used with diff
                              against connect.ok)
The major difference starts at:

connect.timeout:
    246: -> (fbt:ip:1:ip_wput_local:entry)
    247: -> (fbt:ip:1:ip_fanout_tcp:entry)
    250: -> (fbt:ip:1:squeue_enter_nodrain:entry)

connect.ok:
    240: -> (fbt:ip:1:ip_wput_local:entry)
    241: -> (fbt:ip:1:ip_fanout_tcp:entry)
    244: -> (fbt:ip:1:squeue_enter_nodrain:entry)

From here connect.timeout goes to fbt:genunix:timeout and
fbt:genunix:timeout_common.
As far as I can figure, a TCP SYN packet is sent over IP but a TCP
SYN-ACK never comes back.
Does this seem like the correct interpretation, and does anybody have
any ideas regarding this?
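To check whether the listening side ever processes the SYN, one could
watch the IP functions above system-wide, on every thread (a sketch
using only function names that appear in the traces):

    fbt:ip:ip_wput_local:entry,
    fbt:ip:ip_fanout_tcp:entry,
    fbt:ip:squeue_enter_nodrain:entry
    {
        /* show when the loopback path runs and in which process */
        printf("%d %s pid=%d %s", timestamp, probefunc, pid, execname);
    }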
Regards
------------------------------------------------------------------------
_______________________________________________
networking-discuss mailing list
[email protected]