Re: [Suspend-devel] Fwd: Kernel Oops with 2.6.23
On Mon 2007-12-17 21:45:54, Ritesh Raj Sarraf wrote: > On Tuesday 04 December 2007, Pavel Machek wrote: > > On Mon 2007-12-03 06:01:26, Ritesh Raj Sarraf wrote: > > > On Sunday 02 December 2007, Pavel Machek wrote: > > > > killall -9 pulseaudio. If pulseaudio is not dead within 60 seconds, > > > > you hit a kernel bug. If it needs suspend to be reproduced, you > > > > probably have a suspend bug. > > > > > > Hi Pavel, > > > > > > Something similar to this are multiple cases where the kernel is not able > > > to kill a process at all. > > > > > > A good example is an application pumping IO to a multipathed device. When > > > all the paths to the multipathed devices go down, and you'd like to kill > > > the process, there is no way left to do it. In fact, a reboot also > > > doesn't work in such cases. Reboot gets hung in midway trying to kill the > > > process. The user is left to do a hard reset of the machine. > > > > > > In situations like these, the processes go into D state. > > > > > > Here's what the manpage of ps says: > > > > > > PROCESS STATE CODES > > >Here are the different values that the s, stat and state output > > > specifiers (header "STAT" or "S") > > >will display to describe the state of a process. > > >DUninterruptible sleep (usually IO) > > > > > > Does it mean that processes in D state are excluded by the kernel from > > > being killed ? Or is it still a kernel bug ? > > > > Still a kernel bug. Processes should not stay in D state for long. > > Pavel > > Hi Pavel, > > Sometime back we discussed about 'D' state processes which are not killed by > the kernel by any signal. > > Here's a bugzilla detailing the symptom. > https://bugzilla.redhat.com/show_bug.cgi?id=419581 > [I/O Processes don't get killed when all the paths to the LUN are down] > > It is still being assumed as working as designed. This is borderline. I guess multipath should use new TASK_KILLABLE infrastructure, but I do not expect RedHat to backport that. Feel free to implement that yourself, it should be quite easy. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Suspend-devel] Fwd: Kernel Oops with 2.6.23
On Mon 2007-12-17 21:45:54, Ritesh Raj Sarraf wrote: On Tuesday 04 December 2007, Pavel Machek wrote: On Mon 2007-12-03 06:01:26, Ritesh Raj Sarraf wrote: On Sunday 02 December 2007, Pavel Machek wrote: killall -9 pulseaudio. If pulseaudio is not dead within 60 seconds, you hit a kernel bug. If it needs suspend to be reproduced, you probably have a suspend bug. Hi Pavel, Something similar to this are multiple cases where the kernel is not able to kill a process at all. A good example is an application pumping IO to a multipathed device. When all the paths to the multipathed devices go down, and you'd like to kill the process, there is no way left to do it. In fact, a reboot also doesn't work in such cases. Reboot gets hung in midway trying to kill the process. The user is left to do a hard reset of the machine. In situations like these, the processes go into D state. Here's what the manpage of ps says: PROCESS STATE CODES Here are the different values that the s, stat and state output specifiers (header STAT or S) will display to describe the state of a process. DUninterruptible sleep (usually IO) Does it mean that processes in D state are excluded by the kernel from being killed ? Or is it still a kernel bug ? Still a kernel bug. Processes should not stay in D state for long. Pavel Hi Pavel, Sometime back we discussed about 'D' state processes which are not killed by the kernel by any signal. Here's a bugzilla detailing the symptom. https://bugzilla.redhat.com/show_bug.cgi?id=419581 [I/O Processes don't get killed when all the paths to the LUN are down] It is still being assumed as working as designed. This is borderline. I guess multipath should use new TASK_KILLABLE infrastructure, but I do not expect RedHat to backport that. Feel free to implement that yourself, it should be quite easy. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Suspend-devel] Fwd: Kernel Oops with 2.6.23
On Tuesday 04 December 2007, Pavel Machek wrote: > On Mon 2007-12-03 06:01:26, Ritesh Raj Sarraf wrote: > > On Sunday 02 December 2007, Pavel Machek wrote: > > > killall -9 pulseaudio. If pulseaudio is not dead within 60 seconds, > > > you hit a kernel bug. If it needs suspend to be reproduced, you > > > probably have a suspend bug. > > > > Hi Pavel, > > > > Something similar to this are multiple cases where the kernel is not able > > to kill a process at all. > > > > A good example is an application pumping IO to a multipathed device. When > > all the paths to the multipathed devices go down, and you'd like to kill > > the process, there is no way left to do it. In fact, a reboot also > > doesn't work in such cases. Reboot gets hung in midway trying to kill the > > process. The user is left to do a hard reset of the machine. > > > > In situations like these, the processes go into D state. > > > > Here's what the manpage of ps says: > > > > PROCESS STATE CODES > >Here are the different values that the s, stat and state output > > specifiers (header "STAT" or "S") > >will display to describe the state of a process. > >DUninterruptible sleep (usually IO) > > > > Does it mean that processes in D state are excluded by the kernel from > > being killed ? Or is it still a kernel bug ? > > Still a kernel bug. Processes should not stay in D state for long. > Pavel Hi Pavel, Sometime back we discussed about 'D' state processes which are not killed by the kernel by any signal. Here's a bugzilla detailing the symptom. https://bugzilla.redhat.com/show_bug.cgi?id=419581 [I/O Processes don't get killed when all the paths to the LUN are down] It is still being assumed as working as designed. Ritesh -- Ritesh Raj Sarraf RESEARCHUT - http://www.researchut.com "Necessity is the mother of invention." signature.asc Description: This is a digitally signed message part.
Re: [Suspend-devel] Fwd: Kernel Oops with 2.6.23
On Tuesday 04 December 2007, Pavel Machek wrote: On Mon 2007-12-03 06:01:26, Ritesh Raj Sarraf wrote: On Sunday 02 December 2007, Pavel Machek wrote: killall -9 pulseaudio. If pulseaudio is not dead within 60 seconds, you hit a kernel bug. If it needs suspend to be reproduced, you probably have a suspend bug. Hi Pavel, Something similar to this are multiple cases where the kernel is not able to kill a process at all. A good example is an application pumping IO to a multipathed device. When all the paths to the multipathed devices go down, and you'd like to kill the process, there is no way left to do it. In fact, a reboot also doesn't work in such cases. Reboot gets hung in midway trying to kill the process. The user is left to do a hard reset of the machine. In situations like these, the processes go into D state. Here's what the manpage of ps says: PROCESS STATE CODES Here are the different values that the s, stat and state output specifiers (header STAT or S) will display to describe the state of a process. DUninterruptible sleep (usually IO) Does it mean that processes in D state are excluded by the kernel from being killed ? Or is it still a kernel bug ? Still a kernel bug. Processes should not stay in D state for long. Pavel Hi Pavel, Sometime back we discussed about 'D' state processes which are not killed by the kernel by any signal. Here's a bugzilla detailing the symptom. https://bugzilla.redhat.com/show_bug.cgi?id=419581 [I/O Processes don't get killed when all the paths to the LUN are down] It is still being assumed as working as designed. Ritesh -- Ritesh Raj Sarraf RESEARCHUT - http://www.researchut.com Necessity is the mother of invention. signature.asc Description: This is a digitally signed message part.