Volker Lendecke wrote:
On Mon, Apr 21, 2008 at 09:13:28AM -0500, James A. Dinkel wrote:
Anyway, the server will be fine and snappy for a week or so, then out of
the blue, nobody can connect. Top shows a few smbd processes maxing out
the cpu and the load (which is usually < 1.0) gradually climbs up to 10,
I've seen this only when something like connections.tdb
became corrupt. With CentOS this is not likely, but reiserfs
did that to me fairly often. What filesystem are your tdbs
residing on? Maybe some other kernel-level problem like a
problematic driver in the path to the hard disk?
Volker
I have seen this once on a CentOS-4.5-x86_64 box; IIRC, there was an
issue with the Intel e1000 kernel module that caused a high number of
connection resets, but the RSTs never made it back, so the connections
would just time out
while the client started a new connection. Then again, this box was
using reiserfs to hold the tdbs, so it might have just been the fsck
that ran when I rebooted after applying the kernel module update that
fixed it... Anyway, what I was seeing was a consistently high number
(several hundred) of queued packets for the sendQ across a dozen or so
connections, and groups of reset connections all happening at the same
time. The load went up slowly for about a day, and then rocketed to
well over 100 when a client was reset with a stuck locked file.
FWIW, this was an SMP Xeon box w/ integrated Intel E1000s and the
(mostly) stock 2.6.9-12(?) RHEL kernel. I had found that Intel did have
a patch for an issue very similar to what I was seeing, and after
applying it, everything was happy again.
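
For anyone wanting to check for the same symptom (connections with a
persistently large send queue), here is a minimal sketch in Python. It
assumes the usual Linux `netstat -tn` column layout (Proto, Recv-Q,
Send-Q, Local Address, Foreign Address, State). Note that on Linux the
Send-Q column counts unacknowledged bytes rather than packets. The
function name and the threshold of 100 are illustrative assumptions,
not anything taken from Samba or from the original report.

```python
def stuck_send_queues(netstat_output, threshold=100):
    """Return (foreign_address, send_q) pairs whose Send-Q exceeds threshold.

    netstat_output is the text printed by `netstat -tn`; the column
    layout is an assumption (proto, recv-q, send-q, local, foreign, state).
    """
    stuck = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip banner/header lines and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        try:
            send_q = int(fields[2])
        except ValueError:
            continue  # unexpected column content; ignore the row
        if send_q > threshold:
            stuck.append((fields[4], send_q))
    return stuck


# Hypothetical sample output, made up purely for illustration.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.1:445            10.0.0.5:1034           ESTABLISHED
tcp        0    350 10.0.0.1:445            10.0.0.6:2201           ESTABLISHED
"""

for peer, queued in stuck_send_queues(SAMPLE):
    print(peer, queued)
```

On a live box the text could come from running `netstat -tn` and
passing its output to the function. Sampling a few times over several
minutes would show whether the same peers stay stuck, which is the
pattern described above.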