killing uninterruptible sleep process

2002-07-22 Thread Sagi Bashari

Hi

I logged in to one of my servers today and noticed that the load average was
very high. After investigating the problem I found two processes in D state
(uninterruptible sleep):

root 24344  0.0  0.2  1532  584 ?  DN   Jul21   0:01
    /usr/bin/updatedb -f NFS,SMBFS,NCPFS,PROC,DEVPTS -e
    /tmp,/var/tmp,/usr/tmp,/afs,/net
root 25899  0.0  0.2  1532  584 ?  DN   06:26   0:01
    /usr/bin/updatedb -f NFS,SMBFS,NCPFS,PROC,DEVPTS -e
    /tmp,/var/tmp,/usr/tmp,/afs,/net


Their parent, pid 1532, is already dead.

My guess is that an SMB share I had mounted on this box died after its host
rebooted. I cannot unmount that share because umount says that the device is
busy.

Is there any way to kill the dead updatedb processes and umount the dead
shares without rebooting the system?
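
For anyone diagnosing a similar situation, here is a minimal sketch of how
processes in state D can be listed by reading /proc directly instead of
scanning ps output by eye. It assumes a Linux /proc layout in which
/proc/<pid>/stat starts with "pid (comm) state"; the function names
parse_state and find_d_state are made up for this illustration and are not
part of any existing tool.

```python
import os


def parse_state(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    left, _, right = stat_text.rpartition(")")
    pid_text, _, comm = left.partition("(")
    return int(pid_text), comm, right.split()[0]


def find_d_state(proc_root="/proc"):
    """Yield (pid, comm) for every process currently in state D."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were scanning
        if state == "D":
            yield pid, comm
```

Calling list(find_d_state()) on the affected machine should return the two
updatedb pids shown above, provided they are still stuck. Reading the stat
file does not touch the dead share, so the scan itself should not hang.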

Sagi




=
To unsubscribe, send mail to [EMAIL PROTECTED] with
the word unsubscribe in the message body, e.g., run the command
echo unsubscribe | mail [EMAIL PROTECTED]




Re: killing uninterruptible sleep process

2002-07-22 Thread guy keren


On Mon, 22 Jul 2002, Sagi Bashari wrote:

> I logged in to one of my servers today and noticed that the load average was
> very high. After investigating the problem I found two processes in D state
> (uninterruptible sleep):
>
> root 24344  0.0  0.2  1532  584 ?  DN   Jul21   0:01
>     /usr/bin/updatedb -f NFS,SMBFS,NCPFS,PROC,DEVPTS -e
>     /tmp,/var/tmp,/usr/tmp,/afs,/net
> root 25899  0.0  0.2  1532  584 ?  DN   06:26   0:01
>     /usr/bin/updatedb -f NFS,SMBFS,NCPFS,PROC,DEVPTS -e
>     /tmp,/var/tmp,/usr/tmp,/afs,/net
>
>
> Their parent, pid 1532, is already dead.
>
> My guess is that an SMB share I had mounted on this box died after its host
> rebooted. I cannot unmount that share because umount says that the device is
> busy.
>
> Is there any way to kill the dead updatedb processes and umount the dead
> shares without rebooting the system?

no. that's why it's called 'uninterruptible sleep'. 'kill -9' won't help
either in such cases - they'll go away only when you reboot the machine
(and that reboot could get stuck while trying to umount the file systems,
forcing you to reset the machine). this is an annoying problem, indeed.
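
As an illustration of why 'kill -9' appears to do nothing: the signal is
recorded as pending, but it is only acted on once the process leaves the D
state. The sketch below decodes the pending-signal masks from
/proc/<pid>/status. It assumes the SigPnd and ShdPnd fields found on current
Linux kernels (older kernels may expose only SigPnd), and the helper names
pending_signals and read_pending are hypothetical.

```python
def pending_signals(mask_hex):
    """Decode a SigPnd/ShdPnd hex mask into a sorted list of signal numbers."""
    mask = int(mask_hex, 16)
    # Bit n-1 of the mask corresponds to signal number n.
    return [n + 1 for n in range(mask.bit_length()) if mask >> n & 1]


def read_pending(pid, proc_root="/proc"):
    """Return the pending signals for pid from both per-thread and shared masks."""
    found = set()
    with open(f"{proc_root}/{pid}/status") as f:
        for line in f:
            key, _, value = line.partition(":")
            if key in ("SigPnd", "ShdPnd"):
                found.update(pending_signals(value.strip()))
    return sorted(found)
```

If read_pending(24344) includes 9 after a 'kill -9 24344', the kill was
queued and the process simply has not returned from its sleep to notice it.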

-- 
guy

For world domination - press 1,
 or dial 0, and please hold, for the creator. -- nob o. dy






Re: Why do we need uninterruptible sleep? (was: Re: killing uninterruptible sleep process)

2002-07-22 Thread Nadav Har'El

On Mon, Jul 22, 2002, Omer Zak wrote about "Why do we need uninterruptible
sleep? (was: Re: killing uninterruptible sleep process)":
> Why then there is such a thing as uninterruptible sleep in Unix/Linux?
> What useful purpose (besides full Posix conformance) does this serve?

As far as I know, processes are put into uninterruptible sleep (D) state when
it is either too complicated, or the programmer was too lazy, to decide what
to do if the process were killed during the sleep.

The quintessential case is a process that is waiting for a disk page to be
copied to its memory, or for one of its virtual-memory pages to be swapped
into core from disk. What would (or should) happen if the process were
killed at this point? When the disk hardware is finally ready to fetch the
page, the process is no longer there and its memory pages have gone the way
of the dodo. It is simpler if the process simply cannot be killed at this
point.

However, a good kernel design should use the D state sparingly, and only for
very short-term operations that are sure to succeed (such as fetching a page
from disk). A bad design, on the other hand, would use it in many places
where a process being killed might complicate the programmer's life, such as
when waiting for RPC replies (as in NFS) or the SMB stuff you may have
noticed. In my opinion, all the cases that might leave processes stuck in
the D state for a long time should be considered bugs in the kernel and
should be fixed.

-- 
Nadav Har'El                        | Monday, Jul 22 2002, 13 Av 5762
[EMAIL PROTECTED]                   |-
Phone: +972-53-245868, ICQ 13349191 | Entropy: Not just a fad, it's the future!
http://nadav.harel.org.il           |
