Re: [MTT devel] fix zombie commit

2013-02-26 Thread Jeff Squyres (jsquyres)
On Feb 26, 2013, at 2:11 AM, Mike Dubman wrote:
> On Mon, Feb 25, 2013 at 6:24 PM, Jeff Squyres (jsquyres) wrote:
> > Looking at the code, you're checking for zombie status before MTT kills the proc. Am I reading that right?
> I don't think the order matters, if process is not Zombie yet

Re: [MTT devel] fix zombie commit

2013-02-26 Thread Mike Dubman
On Mon, Feb 25, 2013 at 6:24 PM, Jeff Squyres (jsquyres) wrote:
> > Looking at the code, you're checking for zombie status before MTT kills the proc. Am I reading that right?
> I don't think the order matters, if process is not Zombie yet and about to be killed by MTT later - it is a good flow.
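
For readers following the thread, the check under discussion (look at /proc/pid/status, but only if it exists, so that platforms such as Solaris and OS X are not broken) can be sketched roughly as below. This is an illustrative Python sketch, not MTT's own code and not the Grep() routine mentioned in the thread; the names `is_zombie` and `proc_root` are hypothetical, and the sketch assumes the Linux text format in which the status file carries a line like `State:  Z (zombie)`.

```python
import os
import re


def is_zombie(pid, proc_root="/proc"):
    """Return True if pid is a zombie, False if it is not, None if unknown.

    None covers platforms without a Linux-style /proc/<pid>/status file
    and processes that have already disappeared.
    """
    status_path = os.path.join(proc_root, str(pid), "status")
    try:
        # errors="replace" avoids decode failures if the file is not
        # plain text (an assumption made for non-Linux /proc layouts).
        with open(status_path, "r", errors="replace") as fh:
            for line in fh:
                match = re.match(r"State:\s+(\S)", line)
                if match:
                    return match.group(1) == "Z"
    except OSError:
        # No such file, no permission, or no /proc at all.
        return None
    # File was readable but had no recognizable State line.
    return None
```

Because the function returns None rather than raising when the status file is missing, a caller can run the check either before or after sending the kill signal and simply skip zombie handling when the answer is unknown.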

Re: [MTT devel] fix zombie commit

2013-02-25 Thread Jeff Squyres (jsquyres)
On Feb 24, 2013, at 6:59 AM, Mike Dubman wrote:
> What protection do you mean? Check that /proc/pid/status exists? It is done in Grep()
Ah, excellent -- I hadn't noticed that.
> We observe that process which was launched by mtt and hangs (mtt detect timeout and starts do_command procedure

Re: [MTT devel] fix zombie commit

2013-02-24 Thread Mike Dubman
Hi Jeff,
What protection do you mean? Check that /proc/pid/status exists? It is done in Grep()
We observe that process which was launched by mtt and hangs (mtt detect timeout and starts do_command procedure), later enters into "defunct" state. The mtt sends email that process hangs and when

[MTT devel] fix zombie commit

2013-02-24 Thread Jeff Squyres (jsquyres)
Mike --
Please protect this code better; MTT is also run on Solaris and OS X. Also, can you describe more fully the case where zombies are being left behind by MTT?
On Feb 24, 2013, at 1:44 AM, wrote:
> Author: miked (Mike Dubman)
> Date: 2013-02-24 01:44:31 EST (Sun, 24 Feb 2013)
> New Revi