On Wed, 28 Sep 2011 10:18:55 +0200 Rob van der Heij wrote:

> Including the "D" in the count as "competing for CPU resources" is
> motivated by the assumption they will soon be "R" again.

Which is all fine *if* everyone plays by the "rules". Short-lived "D" is fine - say from an interrupt handler, as it was (presumably) designed for. Unfortunately it got loose in the developer world.

If code marks a thread as being in uninterruptible sleep, it is exactly that: it accepts no signal *at all*. And unlike zombies, such threads never get reaped if the parent task goes away.

Imagine a sysadmin rings and says his Bozo AG system (which runs all of Prod) has just stopped and issued a message saying the loadavg is above (say) 250% of the number of engines.

Running tasks - 2; engines 4; loadavg 11; CPU% ~0 !!!

Ummmm. Adding a couple of virtual CPs gets the ratio down, but what do you do when in 2 hours' time the loadavg is 20, then later 100? This is just a terrible metric to base decisions on.

A re-boot is required to clear a situation like the above - and it happens.

Shane ...
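
To put numbers on the scenario above, here is a minimal sketch in Python. It assumes the usual Linux behaviour that both "R" and "D" tasks feed the loadavg, and that the load has been steady long enough for the average to settle, in which case roughly nine tasks stuck in "D" would account for a loadavg of 11 with only 2 running. The function names and the 2.5 ratio are made up for illustration; read_task_states() only looks at the main thread of each process in /proc/<pid>/stat, so it is an approximation of what the kernel counts.

```python
import os
from collections import Counter


def load_contributors(states):
    """Count tasks that feed the loadavg: runnable (R) plus uninterruptible (D)."""
    tally = Counter(states)
    return tally["R"] + tally["D"]


def over_threshold(loadavg, engines, ratio=2.5):
    """True when loadavg exceeds ratio * engines (250% in the example)."""
    return loadavg > ratio * engines


def read_task_states(proc="/proc"):
    """Collect the state letter of each process from /proc/<pid>/stat (Linux only)."""
    states = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                data = f.read()
        except OSError:
            continue  # process went away while we were scanning
        # The state letter is the first field after the ")" closing the command name.
        fields = data.rsplit(")", 1)[-1].split()
        if fields:
            states.append(fields[0])
    return states


if __name__ == "__main__":
    # Hypothetical snapshot matching the example: 2 running, 9 stuck in D.
    snapshot = ["R"] * 2 + ["D"] * 9 + ["S"] * 40
    print("tasks counted toward loadavg:", load_contributors(snapshot))
    print("alarm with 4 engines:", over_threshold(11, 4))
    print("alarm with 6 engines:", over_threshold(11, 6))
```

With 4 engines the limit is 10, so a loadavg of 11 trips the alarm; with 6 engines the limit is 15 and the alarm goes quiet, although nothing about the stuck tasks has changed. That is the whole problem with the ratio.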
