FBSD 10.0-S (r261289M) under XenServer 6.2 - Stuck sshd in urdlck?

2014-03-17 Thread Karl Pielorz


Hi,

I set up a VM a while ago under XenServer 6.2 - it's an amd64 FBSD 10.0-S 
box (based on r261289) - it's running under Xen PVHVM.


I set it up - and left it for a while (46 days). I went to ssh to it today, 
and got:



ssh_exchange_identification: Connection closed by remote host


Getting on to the box's console, there are lots of what look to be 'stuck' 
sshd processes:


 0  3933  895 0  20  0 84868 6944 urdlck Is -  0:00.01 sshd: unknown [priv] (sshd)
22  3934 3933 0  20  0     0    0 -      Z  -  0:00.00 <defunct>
 0  3935 3933 0  20  0 84868 6952 sbwait I  -  0:00.00 sshd: unknown [pam] (sshd)
 0  4338  895 0  20  0 84868 6944 urdlck Is -  0:00.01 sshd: unknown [priv] (sshd)
22  4339 4338 0  20  0     0    0 -      Z  -  0:00.00 <defunct>


Anyone know what 'urdlck' is?

There's 126 of these processes, e.g.

5446  -  I    0:00.00 sshd: unknown [pam] (sshd)
5450  -  Is   0:00.01 sshd: unknown [priv] (sshd)
5452  -  I    0:00.00 sshd: unknown [pam] (sshd)
5453  -  Is   0:00.01 sshd: unknown [priv] (sshd)
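
A count like the one above can be reproduced from a saved process listing by tallying the wait channel column. What follows is only a minimal Python sketch under stated assumptions: it assumes the long listing came from something like `ps -axl` with the FreeBSD column order UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND. The function name count_wait_channels and the sample lines (adapted from this message) are illustrative, not part of any existing tool.

```python
from collections import Counter


def count_wait_channels(ps_lines):
    """Tally processes by wait channel (MWCHAN) from `ps -axl`-style lines.

    Assumption: each process is on one line with the column order
    UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND.
    """
    counts = Counter()
    for line in ps_lines:
        fields = line.split()
        # Skip blank lines, the header, and wrapped fragments such as "(sshd)".
        if len(fields) < 13 or not fields[0].isdigit():
            continue
        counts[fields[8]] += 1  # ninth column is MWCHAN
    return counts


# Sample lines adapted from the listing in this message.
sample = [
    "0  3933  895 0  20  0 84868 6944 urdlck Is -  0:00.01 sshd: unknown [priv] (sshd)",
    "22  3934 3933 0  20  0 0 0 -  Z  -  0:00.00 <defunct>",
    "0  3935 3933 0  20  0 84868 6952 sbwait I  -  0:00.00 sshd: unknown [pam] (sshd)",
    "0  4338  895 0  20  0 84868 6944 urdlck Is -  0:00.01 sshd: unknown [priv] (sshd)",
    "22  4339 4338 0  20  0 0 0 -  Z  -  0:00.00 <defunct>",
    "(sshd)",
]

if __name__ == "__main__":
    for channel, n in count_wait_channels(sample).most_common():
        print(channel, n)
```

Feeding it the full listing from the console would show how many of the sshd processes are sleeping in urdlck as opposed to sbwait or sitting as zombies.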


Bearing in mind the box is firewalled from ssh access, and no one (apart 
from me today) has attempted to get onto the box with ssh - this is a 
little concerning. Only about one in every 5-10 connection attempts 
actually connects - the rest just result in the 
'ssh_exchange_identification' error shown above.


Kernel is GENERIC with:

  options NO_ADAPTIVE_MUTEXES
  options NO_ADAPTIVE_RWLOCKS
  options NO_ADAPTIVE_SX

There's nothing logged in dmesg, or syslog.

Is there anything worth doing to this VM before I restart it - i.e. to try 
and figure out what's happened, or to troubleshoot it?


-Karl

___
freebsd-xen@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-xen
To unsubscribe, send any mail to freebsd-xen-unsubscr...@freebsd.org


Re: FBSD 10.0-S (r261289M) under XenServer 6.2 - Stuck sshd in urdlck?

2014-03-17 Thread Karl Pielorz



--On 17 March 2014 18:17:53 +0100 Roger Pau Monné roger@citrix.com 
wrote:



>> Anyone know what 'urdlck' is?
>
> It seems like the process is stuck while trying to acquire a rw mutex in
> read mode. Could you obtain a backtrace of the process with gdb?


Ok, I think I did this right - let me know if I've not...

# gdb /usr/sbin/sshd 5325
...
Attaching to program: /usr/sbin/sshd, process 5325

warning: current_sos: Can't read pathname for load map: Bad address
[repeated several times]
[lots of reading symbols from - 'no debugging symbols found' output]
...
[New Thread 804006400 (LWP 100184/sshd)]
[a few reading symbols - 'no debugging symbols found' output]
Loaded symbols for /libexec/ld-elf.so.1
[Switching to Thread 804006400 (LWP 100184/sshd)]
0x00000008038eb89c in __error () from /lib/libthr.so.3
(gdb) bt
#0  0x00000008038eb89c in __error () from /lib/libthr.so.3
#1  0x00000008038e921c in pthread_timedjoin_np () from /lib/libthr.so.3
#2  0x000000080064f9a2 in _rtld_get_stack_prot () from /libexec/ld-elf.so.1
#3  0x00000008006498c9 in r_debug_state () from /libexec/ld-elf.so.1
#4  0x00000008006470cd in .text () from /libexec/ld-elf.so.1
#5  0x0000000000000246 in ?? ()
#6  0x0000000000000000 in ?? ()



> Also a kernel-space dump might be useful, could you also run
> procstat -k <pid>?


procstat output is:


# procstat -k 5334
  PID    TID COMM             TDNAME           KSTACK
 5334 100183 sshd             -                mi_switch 
sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_rw_rdlock 
__umtx_op_rw_rdlock amd64_syscall Xfast_syscall
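
The do_rw_rdlock and __umtx_op_rw_rdlock frames in that stack look consistent with the 'urdlck' wait channel and with a userland rwlock being taken in read mode. To check many PIDs the same way, the output could be parsed mechanically. The following is a rough Python sketch only: it assumes each thread appears on a single unwrapped line with the columns PID TID COMM TDNAME KSTACK, and the helper names parse_procstat_k and blocked_on_umtx_rdlock are made up for illustration.

```python
def parse_procstat_k(output):
    """Parse `procstat -k <pid>` text into a list of per-thread dicts.

    Assumption: one thread per line, columns PID TID COMM TDNAME KSTACK,
    with the kernel stack frames separated by whitespace.
    """
    threads = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the header line and anything that does not start with a PID.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        threads.append({
            "pid": int(fields[0]),
            "tid": int(fields[1]),
            "comm": fields[2],
            "tdname": fields[3],
            "kstack": fields[4:],
        })
    return threads


def blocked_on_umtx_rdlock(thread):
    """True if the thread's kernel stack includes the umtx rwlock read path."""
    return "do_rw_rdlock" in thread["kstack"]


# Sample taken from the procstat output in this message, joined onto one line.
sample_output = (
    "  PID    TID COMM             TDNAME           KSTACK\n"
    " 5334 100183 sshd             -                mi_switch "
    "sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_rw_rdlock "
    "__umtx_op_rw_rdlock amd64_syscall Xfast_syscall\n"
)

if __name__ == "__main__":
    for t in parse_procstat_k(sample_output):
        print(t["pid"], t["tid"], blocked_on_umtx_rdlock(t))
```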



Could you briefly tell me how to do the kernel-space dump? Do I panic the 
machine (i.e. cause a crash-dump) somehow?


Cheers & thanks for your reply,

-Karl
