On 02/27/2017 05:19 PM, Howard Pritchard wrote:
> Hi Orion
> 
> Does the problem occur if you only use font2 and 3?  Do you have MXM installed
> on the font1 node?

No, running across font2/3 is fine.  No idea what MXM is.

> The 2.x series uses PMIx, and that could be affecting the PML sanity
> check.
> 
> Howard
> 
> 
> Orion Poplawski <or...@cora.nwra.com> wrote on
> Mon, Feb 27, 2017 at 14:50:
> 
>     We have a couple nodes with different IB adapters in them:
> 
>     font1/var/log/lspci:03:00.0 InfiniBand [0c06]: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] [15b3:6274] (rev 20)
>     font2/var/log/lspci:03:00.0 InfiniBand [0c06]: QLogic Corp. IBA7220 InfiniBand HCA [1077:7220] (rev 02)
>     font3/var/log/lspci:03:00.0 InfiniBand [0c06]: QLogic Corp. IBA7220 InfiniBand HCA [1077:7220] (rev 02)
> 
>     With 1.10.3 we saw the following errors with mpirun:
> 
>     [font2.cora.nwra.com:13982] [[23220,1],10] selected pml cm, but peer
>     [[23220,1],0] on font1 selected pml ob1
> 
>     which crashed MPI_Init.
> 
>     We worked around this by passing "--mca pml ob1".  I notice now that with
>     Open MPI 2.0.2, without that option, I no longer see errors, but the MPI
>     program hangs shortly after startup.  Re-adding the option makes it work,
>     so I'm assuming the underlying problem is still the same, but Open MPI
>     appears to have stopped alerting me to the issue.
> 
>     Thoughts?
> 
>     --
>     Orion Poplawski
>     Technical Manager                          720-772-5637
>     NWRA, Boulder/CoRA Office             FAX: 303-415-9702
>     3380 Mitchell Lane                       or...@nwra.com
>     Boulder, CO 80301                   http://www.nwra.com
>     _______________________________________________
>     users mailing list
>     users@lists.open-mpi.org
>     https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> 
> 
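
For reference, here is a minimal sketch (not part of the original exchange) of
how the 1.10.3 mismatch message quoted above could be detected in captured
mpirun output, and how the "--mca pml ob1" workaround could be assembled from a
script.  The names find_pml_mismatch and mpirun_with_pml and the program path
./my_mpi_program are hypothetical, chosen only for illustration.  The pattern
is written against the single message format shown in this thread, so it is an
assumption that other versions word it the same way.  With 2.0.2 no such
message appeared, so the parser would not help there.

```python
import re

# Matches the message format quoted in this thread, for example:
# "[[23220,1],10] selected pml cm, but peer [[23220,1],0] on font1 selected pml ob1"
# Assumption: the wording is taken from that one sample only.
PML_MISMATCH = re.compile(
    r"\[\[\d+,\d+\],(?P<rank>\d+)\]\s+selected pml (?P<local>\w+),"
    r"\s+but peer\s+\[\[\d+,\d+\],(?P<peer_rank>\d+)\]"
    r"\s+on\s+(?P<peer_host>\S+)\s+selected pml (?P<peer>\w+)"
)


def find_pml_mismatch(log_text):
    """Return (rank, pml, peer_rank, peer_host, peer_pml), or None if absent."""
    # Collapse whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    m = PML_MISMATCH.search(flat)
    if m is None:
        return None
    return (
        int(m["rank"]),
        m["local"],
        int(m["peer_rank"]),
        m["peer_host"],
        m["peer"],
    )


def mpirun_with_pml(program_args, pml="ob1"):
    """Build an mpirun argument list that requests the same PML on every rank."""
    return ["mpirun", "--mca", "pml", pml, *program_args]


if __name__ == "__main__":
    sample = (
        "[font2.cora.nwra.com:13982] [[23220,1],10] selected pml cm, but peer\n"
        "[[23220,1],0] on font1 selected pml ob1"
    )
    print(find_pml_mismatch(sample))
    # ./my_mpi_program is a placeholder for the real executable.
    print(mpirun_with_pml(["./my_mpi_program"]))
```

The argument list from mpirun_with_pml can be passed to subprocess.run to
launch the job with the workaround applied.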


-- 
Orion Poplawski
Technical Manager                          720-772-5637
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       or...@nwra.com
Boulder, CO 80301                   http://www.nwra.com