Ralph,

At first glance, these errors look unrelated to PMIx.
I noticed a bunch of bind() failures.
Based on your command line, I guess you are not running your job via a
batch manager,
and I would guess that stale unix sockets are not always cleaned up
(or this is an old bug and you did not manually clean your nodes after it
was fixed).
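
To illustrate the stale-socket guess, here is a minimal sketch in Python
(not Open MPI code; the path and helper names are made up for the example).
It only shows the general behaviour I have in mind: closing a unix socket
does not remove its file, and a later bind() on the same path fails with
EADDRINUSE until the file is unlinked. Whether this is what actually
happens on your nodes is an assumption.

```python
import errno
import os
import socket
import tempfile


def bind_unix(path):
    """Bind a new unix stream socket to path (hypothetical helper)."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.bind(path)
    except OSError:
        s.close()
        raise
    return s


def demo():
    """Return (file_left_behind, errno_of_second_bind)."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "stale.sock")  # made-up socket name

    # First "job": bind, then exit without unlinking the socket file.
    first = bind_unix(path)
    first.close()
    leftover = os.path.exists(path)

    # Second "job": bind() on the same path fails while the file exists.
    try:
        bind_unix(path).close()
        second_errno = 0
    except OSError as e:
        second_errno = e.errno

    # Manual cleanup: once the stale file is removed, bind() works again.
    os.unlink(path)
    third = bind_unix(path)
    third.close()
    os.unlink(path)
    os.rmdir(tmpdir)
    return leftover, second_errno


if __name__ == "__main__":
    leftover, err = demo()
    print("socket file left behind:", leftover)
    print("second bind errno:", errno.errorcode.get(err, err))
```

If the bind() failures in the MTT output are of this kind, removing the
leftover socket files on the nodes before the next run should make them
go away.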

the neighbor_allgather_self failure is discussed at
https://github.com/open-mpi/ompi/pull/790

I will have a look at the op-related failure on Monday
(it looks like an MPI conformance issue unrelated to PMIx).

Cheers,

Gilles

On Saturday, September 12, 2015, Ralph Castain <r...@open-mpi.org> wrote:

> Hi folks
>
> I’ve closed all the holes I can find in the PMIx integration, and things
> look pretty good overall. There are a handful of failures still being seen
> - most of them involving what appear to be unrelated code. I’m not entirely
> sure I understand the source of the errors, and could really use some help
> to determine (a) if these are in any way related to PMIx, and if so (b) how.
>
> The errors from my MTT run are here:
> http://mtt.open-mpi.org/index.php?do_redir=2256
>
> Any help diagnosing these problems would be greatly appreciated
> Ralph
>
>
