On Feb 24, 2013, at 6:59 AM, Mike Dubman <mi...@dev.mellanox.co.il> wrote:
> What protection do you mean? Check that /proc/pid/status exists? It is done
> in Grep()

Ah, excellent -- I hadn't noticed that.

> We observe that process which was launched by mtt and hangs (mtt detect
> timeout and starts do_command procedure), later enters into "defunct" state.

Looking at the code, you're checking for zombie status before MTT kills the proc. Am I reading that right?

If so, then it could well be that the process has exited but not yet been reaped (because _kill_proc() hasn't been invoked yet).

If this is the case, is the real cause of the problem that the OUTread and ERRread aren't being closed when the child process exits, and therefore we keep looping looking for new output from them?

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
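
To make the question above concrete, here is a rough sketch of the two conditions being discussed: a child that has exited but not yet been reaped shows up as "Z" in /proc/pid/status, and a pipe whose writer has gone away returns an empty read (EOF), which a read loop has to treat as "closed" rather than "no output yet". This is Python on Linux, not MTT's Perl code; the names proc_state, drain_until_eof and demo are made up for illustration, and I am assuming the child is the only writer on its stdout/stderr pipes.

```python
import os
import select
import subprocess
import sys


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/status, or None if the pid is gone."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("State:"):
                    return line.split()[1]  # "Z" means zombie (defunct)
    except (FileNotFoundError, ProcessLookupError):
        return None
    return None


def drain_until_eof(fds, timeout=5.0):
    """Read from each fd until it reports EOF; return the bytes collected per fd."""
    data = {fd: b"" for fd in fds}
    open_fds = set(fds)
    while open_fds:
        ready, _, _ = select.select(list(open_fds), [], [], timeout)
        if not ready:
            break  # nothing arrived within the timeout; give up
        for fd in ready:
            chunk = os.read(fd, 4096)
            if chunk:
                data[fd] += chunk
            else:
                # An empty read means every writer closed the pipe: stop polling it.
                open_fds.discard(fd)
    return data


def demo():
    """Spawn a short-lived child, drain its pipes, and sample its state around reaping."""
    child = subprocess.Popen(
        [sys.executable, "-c", "print('done')"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out_fd, err_fd = child.stdout.fileno(), child.stderr.fileno()
    data = drain_until_eof([out_fd, err_fd])

    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, child.pid, os.WEXITED | os.WNOWAIT)
    before_reap = proc_state(child.pid)

    child.wait()  # reap the child; its /proc entry disappears
    after_reap = proc_state(child.pid)

    child.stdout.close()
    child.stderr.close()
    return data[out_fd], before_reap, after_reap
```

If the read loop never drops an fd on an empty read, it would keep selecting on a pipe that can never produce more data, which is the kind of looping I'm asking about. Whether that is what actually happens in do_command is something I haven't verified.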