Meh. Ok. Should George run with some verbose level to get more info?

> On Jun 4, 2016, at 6:43 AM, Ralph Castain <[email protected]> wrote:
>
> Neither of those threads has anything to do with catching the SIGCHLD -
> threads 4-5 are listening for OOB and PMIx connection requests. It looks more
> like mpirun thought it had picked everything up and has begun shutting down,
> but I can’t really tell for certain.
>
>> On Jun 4, 2016, at 6:29 AM, Jeff Squyres (jsquyres) <[email protected]>
>> wrote:
>>
>> On Jun 3, 2016, at 11:07 PM, George Bosilca <[email protected]> wrote:
>>>
>>> After finalize. As I said in my original email I see all the output the
>>> application is generating, and all processes (which are local as this
>>> happens on my laptop) are in zombie mode (Z+). This basically means whoever
>>> was supposed to get the SIGCHLD didn't do its job of cleaning them up.
>>
>> Ah -- so perhaps threads 1,2,3 are red herrings: the real problem here is
>> that the parent didn't catch the child exits (which presumably should have
>> been caught in threads 4 or 5).
>>
>> Ralph: is there any state from threads 4 or 5 that would be helpful to
>> examine to see if they somehow missed catching children exits?
>>
>> --
>> Jeff Squyres
>> [email protected]
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> _______________________________________________
>> devel mailing list
>> [email protected]
>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2016/06/19070.php
>
> _______________________________________________
> devel mailing list
> [email protected]
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/06/19071.php

--
Jeff Squyres
[email protected]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
