On Wed, 14 May 2008 13:30:25 +0100 Graeme Gregory <[EMAIL PROTECTED]> babbled:
> On Wed, May 14, 2008 at 10:12:00PM +1000, Carsten Haitzler wrote:
> > On Wed, 14 May 2008 13:06:01 +0100 Andy Green <[EMAIL PROTECTED]> babbled:
> >
> > > -----BEGIN PGP SIGNED MESSAGE-----
> > > Hash: SHA1
> > >
> > > Somebody in the thread at some point said:
> > >
> > > | WNOHANG return immediately if no child has exited.
> > > |
> > > | so under no circumstances should this ever hang... but oooh. it does.
> > > | now interestingly i attached to apm to see what it was doing.. and lo-and-behold,
> > > | it woke up and continued to execute then exited with sh reaping the child then
> > > | e reaping the sh and e waking up again:
> > >
> > > There's some process freezing step as part of entering suspend, I guess
> > > it is to do with that. FWIW echo mem > /sys/power/state also the echo
> > > never returns until it comes back in resume.
> >
> > sure - but the system never suspended - it stayed alive. that's why i could
> > debug :) the problem is a sigchld has been issued for a process that hasn't
> > fully exited and waitpid() is blocking even with WNOHANG. it should never block
> > - ever. doesn't matter what the child is doing. :) never hang. ever! :) the
> > problem is the freeze of the apm process propagates to all its parents -
> > when i do know that ecore (the lib for e handling this) is carefully
> > written to avoid such hangs. :)
>
> This sounds like the bug I found years ago, when more than one program
> opens /dev/apm_bios when apm -s is called they all lock up until you
> kill them all until only apmd is left. Then suddenly stuff starts
> working again.
>
> This is the reason I originally wrote the
>
> org.openmoko.dev/packages/xorg-xserver/xserver-kdrive/disable-apm.patch
>
> To stop xserver causing this fault.

shouldn't we look at fixing apm itself instead of working around it? :) also
this is a pretty major issue here, as it means suspend/resume is very liable
to hit this bug and thus have problems. the apm process itself getting hung in
such a way that all parent processes also hang in waitpid(), even when they
call it with WNOHANG... this is bad... :(

-- 
Carsten Haitzler (The Rasterman) <[EMAIL PROTECTED]>
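
for reference, a minimal sketch of the non-blocking reap pattern discussed in this thread. this is not ecore's actual code; the function names reap_exited_children() and spawn_sleeping_child() are made up for illustration. it only shows the documented behaviour: with WNOHANG, waitpid() should return 0 right away while a child is still running, and the child's pid once it has exited.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reap every child that has already exited, without blocking.
 * Returns the number of children reaped, or -1 on an unexpected error.
 * (Hypothetical helper, not taken from ecore.) */
int reap_exited_children(void)
{
    int reaped = 0;

    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, WNOHANG);

        if (pid > 0) {          /* one child collected, look for more */
            reaped++;
            continue;
        }
        if (pid == 0)           /* children exist, but none has exited yet */
            return reaped;
        if (errno == EINTR)     /* interrupted by a signal, retry */
            continue;
        if (errno == ECHILD)    /* no children left at all */
            return reaped;
        return -1;              /* anything else is unexpected */
    }
}

/* Hypothetical demo helper: fork a child that sleeps, then exits.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t spawn_sleeping_child(unsigned int seconds)
{
    pid_t pid = fork();

    if (pid == 0) {
        sleep(seconds);
        _exit(0);
    }
    return pid;
}
```

typically something like reap_exited_children() would be called from the main loop after a SIGCHLD has been noted. the expectation is that it returns immediately in every case, which is exactly what is not happening with the hung apm child described above.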
