Ah crud - I see what's going on. A message comes in on one interface and needs to be transferred to another one for relay. That transfer mechanism looks broken, so we issue another show_help, which then gets caught in the same loop again.
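To make the loop concrete, here is a toy sketch of the failure pattern described above. It is not the actual Open MPI code: the `Relay` class, its methods, and the `max_depth` safety stop are all hypothetical names made up for illustration. It assumes the error report (`show_help`) travels over the same relay path that just failed, and shows one common way to break such a cycle, a re-entrancy guard that falls back to local logging.

```python
class Relay:
    """Toy model (hypothetical) of a relay whose error reports reuse the relay."""

    def __init__(self, guarded, max_depth=50):
        self.guarded = guarded        # enable the re-entrancy guard
        self.max_depth = max_depth    # demo-only stop so the unguarded run ends
        self.in_show_help = False     # True while a show_help is being relayed
        self.show_help_calls = 0
        self.local_log = []

    def transfer(self, msg, src_iface, dst_iface):
        # Stand-in for the broken mechanism: moving a message between two
        # different interfaces always fails in this model.
        return src_iface == dst_iface

    def send(self, msg, src_iface, dst_iface):
        if not self.transfer(msg, src_iface, dst_iface):
            self.show_help("relay failed: " + msg, src_iface, dst_iface)

    def show_help(self, text, src_iface, dst_iface):
        self.show_help_calls += 1
        if self.show_help_calls >= self.max_depth:
            return  # demo-only stop; a real loop would keep going
        if self.guarded and self.in_show_help:
            # Already reporting an error: log locally instead of relaying again.
            self.local_log.append(text)
            return
        self.in_show_help = True
        try:
            # The help message goes through the same failing path.
            self.send(text, src_iface, dst_iface)
        finally:
            self.in_show_help = False


unguarded = Relay(guarded=False)
unguarded.send("hello", "eth0", "eth0:1")
print(unguarded.show_help_calls)   # hits the demo-only stop

guarded = Relay(guarded=True)
guarded.send("hello", "eth0", "eth0:1")
print(guarded.show_help_calls, guarded.local_log)
```

With matching interfaces the transfer succeeds and `show_help` is never called, which fits the observation that only systems with mismatched interfaces are affected.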
I'll work on it - may take a day or two to really fix. Only impacts systems with mismatched interfaces, which is why we aren't generally seeing it.

On Jun 3, 2014, at 9:31 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:

> Ralph,
>
> the application still hangs, i attached new logs.
>
> on slurm0, if i /sbin/ifconfig eth0:1 down
> then the application does not hang any more
>
> Cheers,
>
> Gilles
>
>
> On Wed, Jun 4, 2014 at 12:43 PM, Ralph Castain <r...@open-mpi.org> wrote:
> I appear to have this fixed now - please give the current trunk (r31949 or
> above) a spin to see if I got it for you too.
>
>
>
> <abort.oob.2.log.gz>_______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/06/14969.php