On Thu, Nov 13, 2008 at 03:12:35PM -0500, [EMAIL PROTECTED] wrote:
>
> I go to the subdirectory, via linux console, where the suspect file is
> located and ls the directory.  9 files.  ls -al gets Killed. After ls -al
> filename for each of the 9 files, I determine that 5 of these files are badly
> corrupt.  I perform an experiment.  Tell everyone to leave these files alone,
> reboot the server and it runs happily for an hour.  Load is .05 average.  I
> ask one user to attempt to open one of the corrupt files, and instantly all
> 50 smbd daemons go to uninterruptible sleep and every WinXP client instantly
> re-establishes its smbd session with the server and these (all 50) smbd
> sessions also die and go to heaven.  This cycle continues rapidly sending the
> load sky high with no cpu utilization to speak of.

Uninterruptible sleep == kernel problem.

> Questions that remain:
> 1.  Why do all client smbd daemons have to die if only one of them ran into
> trouble?

Once you have processes going into an uninterruptible state the
system is dead. It might not have stopped moving yet, but it's dead.
You have a kernel/filesystem issue you need to resolve. My guess
is a bad disk.

> 2.  How do files get in a state that they can't be viewed or managed?  virus,
> lack of sunspots?

Bad disk, probably.

> 3.  Why did the fsck say that the filesystem was fine, when obviously it
> isn't?

Kernel bug?

> 4.  How to delete these poison files?

Back up the filesystem without them, reformat, restore.

Did you have hard disk hardware error reporting turned on?

It's not reasonable to expect smbd to survive errors of this
magnitude, I'm afraid. Once processes start going into an
uninterruptible state there's no way for user space code
to recover.

Jeremy.
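
To check which processes are stuck in uninterruptible sleep, one option is to read the state field from /proc directly. On Linux, tasks in state "D" count toward the load average, which would fit the high load with almost no CPU use described above. This is only a minimal sketch, assuming a Linux-style /proc; the function names parse_stat_line and uninterruptible_processes are made up for illustration and are not part of Samba or any standard tool.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the text of one /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx])
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process whose state is 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited mid-scan, or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

If this prints a growing list of smbd entries right after someone touches one of the bad files, that points at the kernel or disk layer rather than at smbd itself.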
