Re: FBSD 10.0-S (r261289M) under XenServer 6.2 - Stuck sshd in urdlck?
On 17/03/14 19:44, Karl Pielorz wrote:
> --On 17 March 2014 18:17:53 +0100 Roger Pau Monné roger@citrix.com wrote:
>
>>> Anyone know what 'urdlck' is?
>>
>> It seems like the process is stuck while trying to acquire a rw mutex in read mode. Could you obtain a backtrace of the process with gdb?
>
> Ok, I think I did this right - let me know if I've not...
>
> # gdb /usr/sbin/sshd 5325
> ...
> Attaching to program: /usr/sbin/sshd, process 5325
> warning: current_sos: Can't read pathname for load map: Bad address
> [repeated several times]
> [lots of reading symbols from - 'no debugging symbols found' output]
> ...
> [New Thread 804006400 (LWP 100184/sshd)]
> [a few reading symbols - 'no debugging symbols found' output]
> Loaded symbols for /libexec/ld-elf.so.1
> [Switching to Thread 804006400 (LWP 100184/sshd)]
> 0x0008038eb89c in __error () from /lib/libthr.so.3
> (gdb) bt
> #0 0x0008038eb89c in __error () from /lib/libthr.so.3
> #1 0x0008038e921c in pthread_timedjoin_np () from /lib/libthr.so.3
> #2 0x00080064f9a2 in _rtld_get_stack_prot () from /libexec/ld-elf.so.1
> #3 0x0008006498c9 in r_debug_state () from /libexec/ld-elf.so.1
> #4 0x0008006470cd in .text () from /libexec/ld-elf.so.1
> #5 0x0246 in ?? ()
> #6 0x in ?? ()
>
>> Also a kernel-space dump might be useful, could you also run procstat -k pid?
>
> procstat output is:
>
> # procstat -k 5334
> PID TID COMM TDNAME KSTACK
> 5334 100183 sshd - mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_rw_rdlock __umtx_op_rw_rdlock amd64_syscall Xfast_syscall
>
> If you can briefly tell me how to do the kernel-space dump? Do I panic the machine (i.e. cause a crash-dump?) somehow?

The output of vmstat -ai might also be helpful to assure that event timers are working correctly.

Also, does this VM have some PCI-passthrough? Did you migrate it? The xl configuration file used to create the domain would also be interesting.

Also, could you try the same workload with a pristine GENERIC kernel?

Roger.
___ freebsd-xen@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-xen To unsubscribe, send any mail to freebsd-xen-unsubscr...@freebsd.org
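Editorial aside on the `procstat -k` line quoted above: it can be read mechanically. The following is only a rough Python sketch. It assumes the fields are PID, TID, COMM, TDNAME followed by the kernel stack frames with the innermost frame first, and that COMM and TDNAME contain no spaces. The helper name `parse_kstack` is hypothetical.

```python
# One line of `procstat -k` output, copied from the message above
# (with the PID/TID header and the "-" thread name separated out).
KSTACK_LINE = (
    "5334 100183 sshd - mi_switch sleepq_catch_signals sleepq_wait_sig "
    "_sleep umtxq_sleep do_rw_rdlock __umtx_op_rw_rdlock "
    "amd64_syscall Xfast_syscall"
)

def parse_kstack(line):
    """Split a procstat -k line into its fixed fields and the frame list.

    Assumption: COMM and TDNAME contain no whitespace.
    """
    pid, tid, comm, tdname, *frames = line.split()
    return {
        "pid": int(pid),
        "tid": int(tid),
        "comm": comm,
        "tdname": tdname,
        "frames": frames,  # innermost frame first, syscall entry last
    }

info = parse_kstack(KSTACK_LINE)

# The thread entered the kernel via a syscall and is sleeping inside
# the userland rw-lock read path.
print(info["comm"], "sleeping via:", " <- ".join(info["frames"][:6]))
print("in rw read-lock path:", "do_rw_rdlock" in info["frames"])
```

Read this way, the stack says the thread made a `__umtx_op_rw_rdlock` request and went to sleep in `umtxq_sleep`, which matches the "stuck acquiring a rw lock in read mode" reading given in the thread.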
Re: FBSD 10.0-S (r261289M) under XenServer 6.2 - Stuck sshd in urdlck?
On 18/03/2014, at 06:47, Karl Pielorz kpielorz_...@tdx.co.uk wrote:

> --On 18 March 2014 09:44 +0100 Roger Pau Monné roger@citrix.com wrote:
>
>> The output of vmstat -ai might also be helpful to assure that event timers are working correctly.
>
> Ok, that's at the bottom of this email.
>
>> Also, does this VM have some PCI-passthrough? Did you migrate it? The xl configuration file used to create the domain would also be interesting.
>
> VM has no passthrough (this is a completely separate system to the one I was doing the passthrough work on), and was created 'from new' - from XenCenter I did VM -> New VM, using the 'Other install media' template with 2 vCPUs and 2048MB of RAM - and fed it the FreeBSD 10.0-R amd64 install ISO. Single GPT partition, ufs - and a small (2GB) swap partition.
>
> Once the system was running I installed subversion, then did a svn checkout of the 'stable' source - rebuilt the world/kernel, installed the kernel/world - and did the usual mergemaster updates.
>
>> Also, could you try the same workload with a pristine GENERIC kernel?
>
> Ok - I'll leave this VM 'as-is'. If I get time I'll set up two new VMs - a stock 10.0-R with stock GENERIC, and another one using the r261289M source from this machine, but with a stock GENERIC kernel (the only mods I'd made were the 'NO_ADAPTIVE_' changes as recommended by the xen man page).
>
> It may be a while before I know if sshd is going to get 'stuck' again - but I'll keep an eye on them.
>
> -Karl

This problem does not seem to be caused by virtualization; FUG-BR is discussing the same sshd issue on FreeBSD 10.
Re: FBSD 10.0-S (r261289M) under XenServer 6.2 - Stuck sshd in urdlck?
--On 18 March 2014 07:14 -0300 Tiago Ribeiro sha...@gmail.com wrote:

> This problem does not seem to be caused by virtualization; FUG-BR is discussing the same sshd issue on FreeBSD 10.

Hmmm, that's interesting... None of our other 10.x boxes (bare metal) have had this issue yet - but to be fair they're usually busy and at the moment get restarted regularly - so that might be why we've not seen it on those.

FUG-BR = Grupo Brasileiro de Usuarios de FreeBSD? Do you know if they're going to raise the issue on the FreeBSD lists (if they haven't already)?

Thanks,

-Karl
Re: FBSD 10.0-S (r261289M) under XenServer 6.2 - Stuck sshd in urdlck?
--On 18 March 2014 08:42:00 -0300 Tiago Ribeiro sha...@gmail.com wrote:

> I think some people reported this error on another list. Here is part of an email sent to that list:
>
> # ps afx
> 15961 - Is 0:00.02 sshd: wendell [priv] (sshd)
> 15962 - Z 0:00.00 defunct
>
> # uname -a
> FreeBSD bacula 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014 r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64

Thanks, I found the thread on -stable using the above and have posted to it (it didn't mention urdlck - which is probably why I missed it the first time round with Google et al).

Regards and thanks for the replies,

-Karl
FBSD 10.0-S (r261289M) under XenServer 6.2 - Stuck sshd in urdlck?
Hi,

I set up a VM a while ago under XenServer 6.2 - it's an amd64 FBSD 10.0-S box (based on r261289) - it's running under Xen PVHVM.

I set it up - and left it for a while (46 days). I went to ssh to it today, and got:

ssh_exchange_identification: Connection closed by remote host

Getting on to the box's console, there are lots of what look to be 'stuck' sshd processes:

0 3933 895 0 20 0 84868 6944 urdlck Is - 0:00.01 sshd: unknown [priv] (sshd)
22 3934 3933 0 20 0 0 0 - Z - 0:00.00 defunct
0 3935 3933 0 20 0 84868 6952 sbwait I - 0:00.00 sshd: unknown [pam] (sshd)
0 4338 895 0 20 0 84868 6944 urdlck Is - 0:00.01 sshd: unknown [priv] (sshd)
22 4339 4338 0 20 0 0 0 - Z - 0:00.00 defunct

Anyone know what 'urdlck' is?

There are 126 of these processes, e.g.

5446 - I 0:00.00 sshd: unknown [pam] (sshd)
5450 - Is 0:00.01 sshd: unknown [priv] (sshd)
5452 - I 0:00.00 sshd: unknown [pam] (sshd)
5453 - Is 0:00.01 sshd: unknown [priv] (sshd)

Bearing in mind the box is firewalled from ssh access, and no one (apart from me today) has attempted to get onto the box with ssh - this is a little concerning.

Roughly one in every 5-10 connection attempts will actually 'connect' - the rest just result in the 'key exchange' error.

Kernel is GENERIC with:

options NO_ADAPTIVE_MUTEXES
options NO_ADAPTIVE_RWLOCKS
options NO_ADAPTIVE_SX

There's nothing logged in dmesg or syslog.

Is there anything worth doing to this VM before I restart it - i.e. to try and figure out what's happened / troubleshoot?

-Karl
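Editorial aside: with 126 processes to sort through, tallying the listing by wait channel makes the pattern easier to see. The Python sketch below is illustrative only. It assumes the long listing follows the FreeBSD `ps -l` column order (UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND), so the wait channel is the ninth field, and it writes the two zero columns of the defunct entries as separate fields. The function name `wait_channels` is hypothetical.

```python
from collections import Counter

# Sample lines taken from the listing in the message above.
SAMPLE = """\
0 3933 895 0 20 0 84868 6944 urdlck Is - 0:00.01 sshd: unknown [priv] (sshd)
22 3934 3933 0 20 0 0 0 - Z - 0:00.00 defunct
0 3935 3933 0 20 0 84868 6952 sbwait I - 0:00.00 sshd: unknown [pam] (sshd)
0 4338 895 0 20 0 84868 6944 urdlck Is - 0:00.01 sshd: unknown [priv] (sshd)
22 4339 4338 0 20 0 0 0 - Z - 0:00.00 defunct
"""

def wait_channels(ps_text):
    """Count processes per wait channel (assumed to be the 9th column)."""
    counts = Counter()
    for line in ps_text.splitlines():
        # Split into 12 fixed columns plus the free-form command.
        fields = line.split(None, 12)
        if len(fields) < 13:
            continue  # header, blank, or truncated line
        counts[fields[8]] += 1
    return counts

counts = wait_channels(SAMPLE)
for channel, n in counts.most_common():
    print(f"{channel:8s} {n}")
```

On the live box the same tally could be fed from the real `ps` output instead of the embedded sample. A large `urdlck` count next to an equal number of defunct children would match the pairing visible in the listing.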
Re: FBSD 10.0-S (r261289M) under XenServer 6.2 - Stuck sshd in urdlck?
--On 17 March 2014 18:17:53 +0100 Roger Pau Monné roger@citrix.com wrote:

>> Anyone know what 'urdlck' is?
>
> It seems like the process is stuck while trying to acquire a rw mutex in read mode. Could you obtain a backtrace of the process with gdb?

Ok, I think I did this right - let me know if I've not...

# gdb /usr/sbin/sshd 5325
...
Attaching to program: /usr/sbin/sshd, process 5325
warning: current_sos: Can't read pathname for load map: Bad address
[repeated several times]
[lots of reading symbols from - 'no debugging symbols found' output]
...
[New Thread 804006400 (LWP 100184/sshd)]
[a few reading symbols - 'no debugging symbols found' output]
Loaded symbols for /libexec/ld-elf.so.1
[Switching to Thread 804006400 (LWP 100184/sshd)]
0x0008038eb89c in __error () from /lib/libthr.so.3
(gdb) bt
#0 0x0008038eb89c in __error () from /lib/libthr.so.3
#1 0x0008038e921c in pthread_timedjoin_np () from /lib/libthr.so.3
#2 0x00080064f9a2 in _rtld_get_stack_prot () from /libexec/ld-elf.so.1
#3 0x0008006498c9 in r_debug_state () from /libexec/ld-elf.so.1
#4 0x0008006470cd in .text () from /libexec/ld-elf.so.1
#5 0x0246 in ?? ()
#6 0x in ?? ()

> Also a kernel-space dump might be useful, could you also run procstat -k pid?

procstat output is:

# procstat -k 5334
PID TID COMM TDNAME KSTACK
5334 100183 sshd - mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_rw_rdlock __umtx_op_rw_rdlock amd64_syscall Xfast_syscall

If you can briefly tell me how to do the kernel-space dump? Do I panic the machine (i.e. cause a crash-dump?) somehow?

Cheers, thanks for your reply,

-Karl
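Editorial aside: to make "stuck while trying to acquire a rw lock in read mode" concrete, here is a toy reader/writer lock in Python. It is a sketch of the general idea only, not how libthr or the kernel umtx code implement it, and the class `ToyRWLock` and its methods are hypothetical. A reader cannot proceed while the lock is held for writing. The stuck sshd has no timeout and sleeps indefinitely, whereas this demo uses a short timeout so that it terminates.

```python
import threading

class ToyRWLock:
    """Minimal reader/writer lock, for illustration only."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self, timeout=None):
        """Wait until no writer holds the lock; return False on timeout."""
        with self._cond:
            ok = self._cond.wait_for(lambda: not self._writer, timeout=timeout)
            if ok:
                self._readers += 1
            return ok

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        """Wait until there is no writer and no active reader."""
        with self._cond:
            self._cond.wait_for(lambda: not self._writer and self._readers == 0)
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

lock = ToyRWLock()

# Something takes the lock for writing and does not release it...
lock.acquire_write()
# ...so a would-be reader cannot get in and gives up after the timeout.
blocked = not lock.acquire_read(timeout=0.2)

# Once the writer lets go, the read lock is granted immediately.
lock.release_write()
recovered = lock.acquire_read(timeout=0.2)
lock.release_read()

print("reader blocked while write-held:", blocked)
print("reader succeeded after release:", recovered)
```

If whatever holds the lock in write mode never releases it, every later reader waits forever, which would look like the pile-up of sleeping sshd processes described in this thread. Which lock is involved, and what holds it, is not established by the output posted here.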