On 02/27/2017 05:19 PM, Howard Pritchard wrote:
> Hi Orion
>
> Does the problem occur if you only use font2 and 3?  Do you have MXM installed
> on the font1 node?

No, running across font2/3 is fine.  No idea what MXM is.

> The 2.x series is using PMIX and it could be that is impacting the PML sanity
> check.
>
> Howard
>
>
> Orion Poplawski <or...@cora.nwra.com> wrote on
> Mon, 27 Feb 2017 at 14:50:
>
>     We have a couple nodes with different IB adapters in them:
>
>     font1/var/log/lspci:03:00.0 InfiniBand [0c06]: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] [15b3:6274] (rev 20)
>     font2/var/log/lspci:03:00.0 InfiniBand [0c06]: QLogic Corp. IBA7220 InfiniBand HCA [1077:7220] (rev 02)
>     font3/var/log/lspci:03:00.0 InfiniBand [0c06]: QLogic Corp. IBA7220 InfiniBand HCA [1077:7220] (rev 02)
>
>     With 1.10.3 we saw the following errors with mpirun:
>
>     [font2.cora.nwra.com:13982] [[23220,1],10] selected pml cm, but peer [[23220,1],0] on font1 selected pml ob1
>
>     which crashed MPI_Init.
>
>     We worked around this by passing "--mca pml ob1".  I notice now with openmpi
>     2.0.2 without that option I no longer see errors, but the mpi program will
>     hang shortly after startup.  Re-adding the option makes it work, so I'm
>     assuming the underlying problem is still the same, but openmpi appears to have
>     stopped alerting me to the issue.
>
>     Thoughts?
>
>     --
>     Orion Poplawski
>     Technical Manager                          720-772-5637
>     NWRA, Boulder/CoRA Office             FAX: 303-415-9702
>     3380 Mitchell Lane                       or...@nwra.com
>     Boulder, CO 80301                   http://www.nwra.com

--
Orion Poplawski
Technical Manager                          720-772-5637
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       or...@nwra.com
Boulder, CO 80301                   http://www.nwra.com
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users