I noticed that the new release of orte is not as good as it used to be at cleaning up the mess left by crashed/aborted MPI processes. Recently we have been seeing a lot of zombie or livelocked processes left running on the cluster nodes and disturbing subsequent experiments. I haven't really had time to investigate the issue; maybe Ralph can open a ticket if he is able to reproduce this.

Aurelien
--
* Dr. Aurélien Bouteiller
* Sr. Research Associate at Innovative Computing Laboratory
* University of Tennessee
* 1122 Volunteer Boulevard, suite 350
* Knoxville, TN 37996
* 865 974 6321




