Ah crud - I see what's going on. A message comes in on one interface and needs 
to be transferred to another one for relay. It looks like that mechanism is 
broken, which causes us to issue another show_help, which then gets caught in 
the same loop again.

I'll work on it - it may take a day or two to really fix. It only impacts 
systems with mismatched interfaces, which is why we aren't generally seeing it.


On Jun 3, 2014, at 9:31 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> 
wrote:

> Ralph,
> 
> The application still hangs; I attached new logs.
> 
> On slurm0, if I run /sbin/ifconfig eth0:1 down,
> the application no longer hangs.
> 
> Cheers,
> 
> Gilles
> 
> 
> On Wed, Jun 4, 2014 at 12:43 PM, Ralph Castain <r...@open-mpi.org> wrote:
> I appear to have this fixed now - please give the current trunk (r31949 or 
> above) a spin to see if I got it for you too.
> 
> 
> 
> <abort.oob.2.log.gz>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/06/14969.php
