Re: Zombie processes

2010-01-03 Thread Shachar Shemesh

sammy ominsky wrote:

Hi all,

I have one server that is constantly getting overrun by zombies!  Nagios alerts me that:

** NAGIOS ALERT ** PROBLEM with Zombie Processes on Hardware *** 
(***.***.***.***).  Service is CRITICAL as of Sun Jan 3 15:17:10 UTC 2010.  The 
additional information available is: PROCS CRITICAL: 23 processes with STATE = Z

ps shows me it's mostly one process this time, other times it's others:

19279 ?        Z      0:00 [playrecording.p] <defunct>
  
Use pstree and check who the zombies' parent is. If it is the same 
process for almost all of them, this is likely a software bug in 
playrecording.p (or whatever the parent is). If it is process ID 1, then 
you have some other problem (probably in the kernel).
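
As a rough sketch of the same check without pstree, the following Python
groups zombies by parent PID by reading /proc/<pid>/stat directly. It
assumes Linux with /proc mounted. The function names parse_stat and
zombies_by_parent are made up for this illustration and do not come from
any tool mentioned in this thread.

```python
import os
from collections import defaultdict


def parse_stat(text):
    """Parse one /proc/<pid>/stat line into (pid, comm, state, ppid)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left, _, rest = text.partition("(")
    comm, _, tail = rest.rpartition(")")
    fields = tail.split()
    return int(left), comm, fields[0], int(fields[1])


def zombies_by_parent(proc_root="/proc"):
    """Return {ppid: [(pid, comm), ...]} for every process in state Z."""
    groups = defaultdict(list)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state, ppid = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process vanished or line was unreadable; skip it
        if state == "Z":
            groups[ppid].append((pid, comm))
    return dict(groups)
```

Calling zombies_by_parent() and printing the result shows at a glance
whether one PPID accounts for nearly all the zombies, which is the case
described above as a likely bug in that parent.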


Shachar

--
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com

___
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il


Re: Zombie processes

2010-01-03 Thread Raz
look for open descriptors with lsof.
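
For anyone without lsof at hand, here is a minimal sketch of a similar
look at a process's descriptors, read from /proc/<pid>/fd. It assumes
Linux with /proc mounted and enough permission to read the target
process. open_descriptors is a hypothetical helper name, not part of
lsof. A zombie has already closed its own files, so the parent is
presumably the process worth inspecting.

```python
import os


def open_descriptors(pid):
    """Map fd number -> target path for a process, from /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink pointing at the open file, pipe or socket.
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
    return result
```

Pipes and sockets show up as targets such as pipe:[...] or socket:[...],
which can hint at what the parent is still connected to.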



Re: Zombie processes

2010-01-03 Thread sammy ominsky
On 03/01/2010, at 18:22, Raz wrote:

 look for open descriptors with lsof.

Thanks!  I've pretty much got it pegged as a problem with playrecording.php, 
but I haven't found the reason yet.  Going to assign it to one of my staff 
coders to investigate.  The sysadmins were sadly clueless :)

--sambo


Re: Zombie processes

2010-01-03 Thread guy keren

sammy ominsky wrote:

On 03/01/2010, at 18:22, Raz wrote:


look for open descriptors with lsof.


Thanks!  I've pretty much got it pegged as a problem with playrecording.php, 
but I haven't found the reason yet.  Going to assign it to one of my staff 
coders to investigate.  The sysadmins were sadly clueless :)


sysadmins who are not programmers have a very small chance of analyzing 
such a problem - because this is a software (bug) problem, not a system 
administration problem. don't blame them for not being able to do 
something that is completely not within their profession.


application programmers often do not understand this kind of bug, 
because they are not systems programmers - they understand the 
application, but not the small intricacies of the unix programming model.


you need a systems programmer to analyze such bugs.
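
To make the intricacy concrete, here is a small sketch of how a zombie
comes about and how it goes away. A child that has exited stays in the
process table in state Z until its parent collects the exit status with
wait() or waitpid(). The thread does not show how playrecording.php
spawns its children, so this is a generic illustration and not a
diagnosis of that script. It assumes Linux with /proc mounted, and
state_of and demo are names invented for the example.

```python
import os
import time


def state_of(pid):
    """Return the one-letter process state from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        # The state is the first field after the parenthesised command name.
        return f.read().rpartition(")")[2].split()[0]


def demo():
    """Fork a child, watch it become a zombie, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit at once without running any cleanup

    # Parent: the child has exited, but nobody has collected its status yet.
    deadline = time.time() + 5
    while state_of(pid) != "Z" and time.time() < deadline:
        time.sleep(0.01)
    state_before_wait = state_of(pid)

    os.waitpid(pid, 0)  # reap: the kernel can now drop the process entry
    still_listed = os.path.exists(f"/proc/{pid}")
    return state_before_wait, still_listed
```

A parent that never reaches the waitpid() step, for example because it
forks helpers in a loop and forgets about them, accumulates one zombie
per finished child. On Linux, a parent that does not care about exit
statuses can instead set SIGCHLD to SIG_IGN so that children are reaped
automatically.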

--guy
