Re: rdump stuck in sbwait state (RELENG_7)

Robert Watson Mon, 05 Jan 2009 06:17:28 -0800

On Mon, 5 Jan 2009, Terry Kennedy wrote:

I may have missed this earlier in the thread, but I don't see a kernel stack trace of the stuck thread/process. Could you grab one using procstat -k, DDB, or KGDB? I'd like to confirm that the 'sbwait' really reflects waiting to send, rather than waiting to receive, which (for better or worse) uses the same wmesg. procstat -k may be the simplest of the above to do if your system is reasonably recent.
I didn't post that earlier as no-one had asked for it 8-)


Indeed :-).

The system is current as of December 29th. Here's the relevant info:


Could I ask you to also send me procstat -f output?

More below the quote.

(0:10) test4:/sysprog/terry# uname -a
FreeBSD test4.tmk.com 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #0: Mon Dec 29 11:48:04 EST 2008 [email protected]:/usr/obj/usr/src/sys/PE1550 i386
(0:11) test4:/sysprog/terry# ps -axwww | grep dump
UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN STAT  TT       TIME COMMAND
  0  4436  4411   0   8  0 35896 34552 wait   I+    p1    0:00.70 /sbin/rdump 0uLa -b 64 -C 32 -f server /usr (rdump)
  0  4439  4436   0   4  0 35896 34784 sbwait I+    p1    0:03.05 rdump: /dev/amrd0s1f: pass 4: 18.48% done, finished in 0:17 at Sat Jan 3 21:02:05 2009 (rdump)
  0  4440  4439   0  20  0 35896 34624 pause  I+    p1    0:05.26 /sbin/rdump 0uLa -b 64 -C 32 -f server /usr (rdump)
  0  4441  4439   0  20  0 35896 34624 pause  I+    p1    0:05.26 /sbin/rdump 0uLa -b 64 -C 32 -f server /usr (rdump)
  0  4442  4439   0   4  0 35896 34624 sbwait I+    p1    0:05.26 /sbin/rdump 0uLa -b 64 -C 32 -f server /usr (rdump)
(0:12) test4:/sysprog/terry# procstat -k 4436
  PID    TID COMM   TDNAME KSTACK
 4436 100115 rdump  -      mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep kern_wait wait4 syscall Xint0x80_syscall
(0:13) test4:/sysprog/terry# procstat -k 4439
  PID    TID COMM   TDNAME KSTACK
 4439 100127 rdump  -      mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep sbwait soreceive_generic soreceive soo_read dofileread kern_readv read syscall Xint0x80_syscall
(0:14) test4:/sysprog/terry# procstat -k 4440
  PID    TID COMM   TDNAME KSTACK
 4440 100131 rdump  -      mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep kern_sigsuspend sigsuspend syscall Xint0x80_syscall
(0:15) test4:/sysprog/terry# procstat -k 4441
  PID    TID COMM   TDNAME KSTACK
 4441 100105 rdump  -      mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep kern_sigsuspend sigsuspend syscall Xint0x80_syscall
(0:16) test4:/sysprog/terry# procstat -k 4442
  PID    TID COMM   TDNAME KSTACK
 4442 100135 rdump  -      mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep sbwait soreceive_generic soreceive soo_read dofileread kern_readv read syscall Xint0x80_syscall

As I understand it, the processes in sbwait state are waiting to receive. That would seem to indicate that they don't see the ACKs from the other end, despite the tcpdump showing that they were received.
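
Since the send and receive paths sleep on the same 'sbwait' wmesg, the direction has to be read off the frames that follow sbwait in the procstat -k KSTACK column. The following is only a rough sketch of that check, not a tool that exists anywhere: classify_sbwait is a hypothetical helper, and it assumes the calling socket routine is named soreceive* or sosend* as in the traces above (the sosend stack in the usage note is made up for illustration).

```python
def classify_sbwait(kstack: str) -> str:
    """Classify one KSTACK string as copied from `procstat -k` output.

    Returns "receive", "send", "unknown", or "not-in-sbwait".
    """
    frames = kstack.split()
    if "sbwait" not in frames:
        return "not-in-sbwait"
    # procstat -k lists frames innermost first, so whatever called
    # sbwait appears after it in the list.
    callers = frames[frames.index("sbwait") + 1:]
    # Assumption: the receive path goes through soreceive*() and the
    # send path through sosend*().
    if any(f.startswith("soreceive") for f in callers):
        return "receive"
    if any(f.startswith("sosend") for f in callers):
        return "send"
    return "unknown"


# KSTACK of PID 4439, taken from the transcript above.
STACK_4439 = (
    "mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep "
    "sbwait soreceive_generic soreceive soo_read dofileread kern_readv "
    "read syscall Xint0x80_syscall"
)

if __name__ == "__main__":
    print(classify_sbwait(STACK_4439))
```

Fed the stack of PID 4439 this reports "receive"; a stack that passed through sosend_generic instead would report "send".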

In general, being blocked in soreceive() means that the application at the other end hasn't sent data, or the other end hasn't received or correctly processed ACKs from the local end, so isn't sending more data that it has queued up. The condition you describe sounds more like what would happen in a sender: that it has data to send, but the remote side hasn't ACK'd sufficiently to send it all. If you have kgdb handy, it would be useful to look at *so and *so->so_domain in the soreceive_generic frame of proc 4439. If it's an inet socket, we'd like to see *(struct inpcb *)so->so_pcb, and if it's a TCP socket, *(struct tcpcb *)((struct inpcb *)so->so_pcb)->inp_ppcb.
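
The first case above (a reader waiting because the peer has sent nothing) can be mimicked at user level with a short sketch. This is only an analogue under stated assumptions: it uses a local socketpair() rather than the TCP connection rdump has, the function names are invented for illustration, and a receive timeout stands in for the indefinite sleep so that the example terminates.

```python
import socket


def recv_blocks_without_sender(timeout_s: float = 0.2) -> bool:
    """Return True if recv() is still waiting after timeout_s seconds."""
    a, b = socket.socketpair()
    try:
        a.settimeout(timeout_s)
        try:
            a.recv(1)  # peer b never sends, so there is nothing to read
        except socket.timeout:
            return True
        return False
    finally:
        a.close()
        b.close()


def recv_returns_after_send() -> bytes:
    """Return the byte received once the peer has actually sent data."""
    a, b = socket.socketpair()
    try:
        a.settimeout(1.0)
        b.sendall(b"x")
        return a.recv(1)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(recv_blocks_without_sender())
    print(recv_returns_after_send())
```

The reader only makes progress once the other end sends, which is why a receiver stuck this way points at the peer (or at ACK processing on the peer) rather than at the local send path.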


Robert N M Watson
Computer Laboratory
University of Cambridge
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[email protected]"
