> Shouldn't kill -9 kill a process no matter what?

Ordinarily, this is true.  However, it is possible for a process to get 
stuck in a state where even kill -9 can't kill it.  I have seen this 
happen where a disk device (for example) fails in a spectacular way.  A 
process that was blocked on a read from that device can get stuck in 
uninterruptible disk wait (ps shows it with a "D" in the process status 
column).  Signals, including SIGKILL, are only acted on once the process 
comes back out of that wait, so if the I/O never completes, the kill 
never takes effect.  In fact, it recently happened to me...

I have FreeBSD on my laptop, and FreeBSD on another machine here at 
home.  On the server machine, I keep a copy of the FreeBSD CVS 
repository, which I typically mount via NFS onto my other client 
machines.  I often suspend the laptop and take it to other locations 
(when you're a consultant, you're often expected to provide your own 
hardware).  I resume, work, and suspend on a regular basis.

Now, at some point, I suspended while this NFS filesystem was still 
mounted.  Reaching a remote location, I discovered that the filesystem 
was not available (obviously) and decided to unmount it.  Much to my 
dismay, umount got itself into a D state trying to unmount the drive 
(presumably there were buffers waiting to be flushed or some such).  
Ultimately, I had to reboot the laptop (no big deal, I'm forced to 
reboot periodically when the battery freaks out on me) to get rid of 
the stuck umount process.

Note that this is totally different from a "Z" state or zombie process, 
which might be what you were looking at.  Did your unkillable processes 
have a Z in the state column?  If so, they were actually really dead, 
but still holding a process table slot because they had not been 
"waited" for yet.  In Unix, when a process exits, it generally has some 
type of exit status to report (usually zero, but occasionally some 
other number, see /usr/include/sysexits.h for some examples).  It is 
the responsibility of the parent process to "wait" (by calling the wait 
system call) for these deceased processes and retrieve their exit 
status.  Normally, your shell does this automatically for the processes 
it spawns, but occasionally a parent process fails to wait for its 
children (sloppy coding, generally) and these "zombie" processes hang 
around in the process table.
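
To make the zombie business concrete, here is a minimal sketch in Python 
(my choice of language for illustration; the names pid_exists and 
zombie_demo are made up).  The child exits right away, the parent 
deliberately delays its wait, and in between the child's process table 
entry is still there even though the child is dead.  The 0.2 second 
sleep is an arbitrary assumption to give the child time to exit.

```python
import os
import time

def pid_exists(pid):
    """Return True if the process table still has an entry for pid."""
    try:
        os.kill(pid, 0)          # signal 0 sends nothing, it only checks
    except ProcessLookupError:
        return False
    return True

def zombie_demo(exit_code=3):
    """Fork a child, let it die unreaped, then wait for it.

    Returns (entry_present_before_wait, exit_status, entry_present_after_wait).
    """
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)              # child: exit immediately
    time.sleep(0.2)                      # parent: child dies, nobody waits yet
    before = pid_exists(pid)             # dead child still holds its slot
    _, status = os.waitpid(pid, 0)       # reap it and collect the exit status
    after = pid_exists(pid)              # slot has been released
    return before, os.WEXITSTATUS(status), after
```

If you run ps from another terminal during the sleep, the child should 
show up with a Z in the state column until the waitpid call reaps it.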

So how do you get rid of zombies without rebooting?  ps -l will show 
you the parent process id of each process, and you can then hunt down 
the parent that is refusing to wait for its children and kill it.  
Once a process that had children has been killed, those children are 
"inherited" by init (process number one and the ancestor of every user 
process), which waits for them and so clears the zombies out of the 
process table.  Ultimately, init has the responsibility to wait for 
everything that gets orphaned on the system.
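
The hunt for the guilty parent can be sketched as a small Python helper.  
This is only an illustration: zombie_parents is a made-up name, the 
sample text is invented, and it assumes ps output with pid, ppid, and 
state as the first three columns under a one-line header.

```python
from collections import defaultdict

def zombie_parents(ps_output):
    """Map each parent pid to the list of its zombie children."""
    parents = defaultdict(list)
    for line in ps_output.splitlines()[1:]:      # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue                             # ignore blank or short lines
        pid, ppid, stat = fields[0], fields[1], fields[2]
        if stat.startswith("Z"):                 # Z marks a zombie
            parents[int(ppid)].append(int(pid))
    return dict(parents)

# Invented sample in the shape of `ps -A -o pid,ppid,stat` output.
SAMPLE = """\
  PID  PPID STAT
    1     0 Ss
  412     1 S
  977   412 Z
  978   412 Z+
 1033     1 D
"""
```

On a live system you could feed it the output of something like 
ps -A -o pid,ppid,stat (check your ps man page, since the flags and 
column keywords vary between systems).  The keys of the result are the 
processes worth a closer look before you reach for kill.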

I hope this helps some people understand a little about how child 
processes are reaped and what to do about the processes that just won't 
die.

        -jan-
-- 
Jan L. Peterson
Unemployed "Computer Facilitator"
http://www.peterson.ath.cx/~jlp/resume.html



____________________
BYU Unix Users Group 
http://uug.byu.edu/ 
___________________________________________________________________
List Info: http://uug.byu.edu/cgi-bin/mailman/listinfo/uug-list
