Ralph,

at first glance, these errors look unrelated to PMIx. I noticed a bunch of bind() failures. Based on your command line, I guess you are not running your job via a batch manager, and I would guess that not all Unix sockets are always cleaned up (or this is an old bug, and you did not manually clean your nodes when it was fixed).
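
To illustrate the stale-socket hypothesis, here is a minimal Python sketch (not Open MPI code). It assumes a POSIX system where binding an AF_UNIX socket to a path that already exists fails with EADDRINUSE. The helper name `bind_unix_socket`, the `remove_stale` flag, and the `demo.sock` path are hypothetical and only serve the example.

```python
import errno
import os
import socket
import tempfile


def bind_unix_socket(path, remove_stale=False):
    """Bind a Unix domain socket at `path` (hypothetical helper).

    If remove_stale is True, a leftover socket file is unlinked first.
    """
    if remove_stale and os.path.exists(path):
        os.unlink(path)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError:
        sock.close()
        raise
    return sock


def demo():
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo.sock")

    # A first "job" binds the socket and exits without unlinking the file.
    first = bind_unix_socket(path)
    first.close()  # closing the socket does not remove the file
    stale_exists = os.path.exists(path)

    # A second "job" reusing the same path hits the bind() failure.
    try:
        bind_unix_socket(path).close()
        second_errno = None
    except OSError as exc:
        second_errno = exc.errno

    # Cleaning up the stale file first lets bind() succeed again.
    third = bind_unix_socket(path, remove_stale=True)
    third.close()
    os.unlink(path)
    os.rmdir(workdir)
    return stale_exists, second_errno


if __name__ == "__main__":
    stale, err = demo()
    print("stale socket file left behind:", stale)
    print("second bind errno:", errno.errorcode.get(err, err))
```

If the bind() failures in the MTT run follow this pattern, removing the leftover socket files on the nodes by hand should make them go away.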

The neighbor_allgather_self failure is discussed at https://github.com/open-mpi/ompi/pull/790

I will have a look at the op-related failure on Monday (it looks like an MPI conformance issue unrelated to PMIx).

Cheers,

Gilles

On Saturday, September 12, 2015, Ralph Castain <r...@open-mpi.org> wrote:

> Hi folks
>
> I’ve closed all the holes I can find in the PMIx integration, and things
> look pretty good overall. There are a handful of failures still being seen
> - most of them involving what appear to be unrelated code. I’m not entirely
> sure I understand the source of the errors, and could really use some help
> to determine (a) if these are in any way related to PMIx, and if so (b) how.
>
> The errors from my MTT run are here:
> http://mtt.open-mpi.org/index.php?do_redir=2256
>
> Any help diagnosing these problems would be greatly appreciated
> Ralph