On 05/11/10 01:47 AM, MichaelHoy wrote:
Same thing happened again, I'm sorry to say.
This doesn't seem like anything that would be related to smbd.
Why do I suspect smbd? Simply because providing access to disk via CIFS is the
significant activity; agreed, it's very circumstantial.
The smb service is just a pipeline between the network stack
and the file system; it doesn't access the network interfaces
or the disks directly. The network stack communicates with
the network interface drivers and the file system communicates
with the storage subsystem, which in turn communicates with
disk drivers.
A bug in the smb service might cause the service to stop working
or perhaps crash the operating system but it is really unlikely
to lock up the entire system.
For assistance with a system lockup, such as you've described,
it would be better to ask for debug assistance on some of the
other forums - perhaps storage-discuss or a network related list.
Alan
Server hung and powering off was the only option.
That's unfortunate. Are there any core files in /, /root or /var/tmp or clues
in /var/adm/messages.*?
I've checked and there are no cores.
I've checked the messages log and all those in /var/svc/log but they don't
point me to anything obvious.
I had a number of ssh sessions running.
A curious thing was all were unresponsive except the one running top which was
happily chugging away and totting up usage.
TOP output starts --------------
last pid: 2863; load avg: 0.00, 0.00, 0.01; up 5+05:07:06 17:28:20
70 processes: 69 sleeping, 1 on cpu
CPU states: 99.9% idle, 0.0% user, 0.1% kernel, 0.0% iowait, 0.0% swap
Kernel: 243 ctxsw, 1 trap, 487 intr, 209 syscall, 1 flt
Memory: 4094M phys mem, 429M free mem, 2046M total swap, 2046M free swap
   PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
   570 root        3  59    0   72M   41M sleep    9:46  0.03% Xorg
  1390 gdm         1  59    0  136M   20M sleep    6:19  0.01% gdm-simple-gree
   537 root        1  59    0   12M 6148K sleep    2:10  0.01% intrd
  2183 root        1  59    0 4092K 2616K cpu/2    1:02  0.01% top
    11 root       18  59    0   13M   12M sleep    0:12  0.00% svc.configd
   461 root      426  59    0   21M   14M sleep   18.6H  0.00% smbd
  1384 gdm         2  59    0   98M   37M sleep    1:29  0.00% gnome-settings-
   214 root       56  59    0   12M 5056K sleep    0:12  0.00% nscd
   759 it019       1  59    0   16M 5820K sleep    0:15  0.00% sshd
   400 daemon     22  59    0   21M   11M sleep    0:25  0.00% idmapd
   626 root        1  59    0 6436K 2104K sleep    0:08  0.00% sendmail
   168 root        6  59    0   17M 6020K sleep    0:10  0.00% devfsadm
   479 noaccess    1  59    0 2620K 1524K sleep    0:04  0.00% mdnsd
   477 root       20  59    0   30M   14M sleep    0:33  0.00% fmd
   436 root        4  59    0 9512K 8040K sleep    0:08  0.00% hald
   550 root        1  59    0 3464K 1964K sleep    0:06  0.00% hald-addon-acpi
     9 root       14  59    0   18M   10M sleep    0:05  0.00% svc.startd
last pid: 2863; load avg: 0.00, 0.00, 0.00; up 5+05:11:36 17:32:50
TOP output ends --------------
I've configured the system to log to a syslog server to help establish the
sequence of events. As far as I can tell, all was well until 16:16:22 on
10/05/2010.
Up until that time a daemon.debug dialogue is logged between the server, DNS
and AD, whereby DNS is queried (Found _ldap_tcp...),
time is synchronised with an AD server (it alternates between which one it
chooses) and a reference is logged to each of
smbrdr_ntcreatex: 14 \srvsvc and SmbRdrNtCreate: fid=6 (a new fid appears to be
created at each invocation, up to a max of 49152).
This dialogue happens approx every 10 minutes.
After this time, two new entries appear at the same time - smbrdr_ntcreatex: 18
\netlogon - and thereafter no DNS activity is logged, and the fid for
SmbRdrNtCreate seems to jump from 6, when the issue appears
to have occurred, to 16386 within 90 minutes.
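For anyone wanting to look for the same pattern in their own syslog output,
here is a minimal Python sketch that pulls the fid values out of
SmbRdrNtCreate entries and flags large jumps. Only the "SmbRdrNtCreate: fid=N"
fragment is taken from the log described above; the rest of each sample line,
the function names and the jump threshold are assumptions for illustration.

```python
import re

# Matches the "SmbRdrNtCreate: fid=<n>" fragment quoted in this message.
FID_RE = re.compile(r"SmbRdrNtCreate: fid=(\d+)")


def extract_fids(lines):
    """Return the fid values, in order, from lines mentioning SmbRdrNtCreate."""
    fids = []
    for line in lines:
        match = FID_RE.search(line)
        if match:
            fids.append(int(match.group(1)))
    return fids


def find_jumps(fids, max_step=1):
    """Return (previous, current) pairs where the fid rose by more than max_step."""
    return [(a, b) for a, b in zip(fids, fids[1:]) if b - a > max_step]


if __name__ == "__main__":
    # Hypothetical sample lines; a real syslog line will carry a different prefix.
    sample = [
        "daemon.debug smbd: SmbRdrNtCreate: fid=5",
        "daemon.debug smbd: smbrdr_ntcreatex: 14 \\srvsvc",
        "daemon.debug smbd: SmbRdrNtCreate: fid=6",
        "daemon.debug smbd: SmbRdrNtCreate: fid=16386",
    ]
    for prev, cur in find_jumps(extract_fids(sample)):
        print(f"fid jumped from {prev} to {cur}")
```

Raising max_step would ignore small gaps and report only large discontinuities
such as the 6 to 16386 jump.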
Any assistance would be appreciated.
Thanks in advance. Michael.