Additionally, if you run

  ompi_info | grep psm

do you see the PSM MTL listed?
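If you want to script that check on each node, here is a minimal sketch. The helper name has_psm_mtl is made up for illustration, and the pattern assumes that ompi_info prints component lines that look roughly like "MCA mtl: psm (...)"; adjust it if your output differs.

```shell
# Hypothetical helper: reads ompi_info-style text on stdin and exits 0
# if a psm component is listed for the mtl framework.
# Assumption: component lines look like "MCA mtl: psm (...)".
has_psm_mtl() {
    grep -Eq 'MCA[[:space:]]+mtl:[[:space:]]+psm([[:space:]]|$)'
}

# Usage, to be run on each node:
#   ompi_info | has_psm_mtl && echo "psm MTL found" || echo "psm MTL missing"
```

If the component is missing on either node, that build of Open MPI was probably compiled without PSM support.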

To force the CM PML (the PML that uses the MTLs), you can run:

  mpirun --mca pml cm ...

With the cm PML forced, no BTLs will be selected, because only the ob1 PML uses the BTLs.
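Putting the two suggestions together, something like the following might help narrow it down. This is only a sketch: run_psm_test and DRY_RUN are names I made up, the hostfile path and ./mpitest are taken from your commands, and the pml_base_verbose / mtl_base_verbose settings are the usual per-framework verbosity parameters, added on the assumption that the extra output will show why the cm PML fails to open.

```shell
# Hypothetical wrapper: force the cm PML with the psm MTL and raise the
# selection verbosity of both frameworks.
# Set DRY_RUN=1 to print the command line instead of running it.
run_psm_test() {
    set -- mpirun -np 2 --hostfile "$HOME/hostfile" \
        --mca pml cm --mca mtl psm \
        --mca pml_base_verbose 100 --mca mtl_base_verbose 100 \
        ./mpitest
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}
```

Because the PML is forced to cm here, the btl_openib_if_exclude setting should have no effect and can be left out.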


> On Mar 17, 2016, at 8:07 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
> 
> can you try to add
> --mca mtl psm
> to your mpirun command line ?
> 
> you might also have to blacklist the openib btl
> 
> Cheers,
> 
> Gilles
> 
> On Thursday, March 17, 2016, dpchoudh . <dpcho...@gmail.com> wrote:
> Hello all
> I have a simple test setup, consisting of two Dell workstation nodes with 
> similar hardware profile.
> 
> Both nodes have identical:
> 1. Qlogic 4x DDR infiniband
> 2. Chelsio C310 iWARP ethernet.
> 
> Both of these cards are connected back to back, without a switch.
> 
> With this setup, I can run OpenMPI over TCP and openib BTL. However, if I try 
> to use the PSM MTL (excluding the Chelsio NIC, of course, since it does not 
> support PSM), I get an error from one of the nodes (details below), which 
> makes me think that a required library or package is not installed, but I 
> can't figure out what it might be.
> 
> Note that the test program is a simple 'hello world' program.
> 
> The following work:
>   mpirun -np 2 --hostfile ~/hostfile -mca btl tcp,self ./mpitest
>   mpirun -np 2 --hostfile ~/hostfile -mca btl self,openib -mca btl_openib_if_exclude cxgb3_0 ./mpitest
> 
> (I had to exclude the Chelsio card because of this issue:
> https://www.open-mpi.org/community/lists/users/2016/03/28661.php  )
> 
> Here is what does NOT work:
>   mpirun -np 2 --hostfile ~/hostfile -mca mtl psm -mca btl_openib_if_exclude cxgb3_0 ./mpitest
> 
> The error (from both nodes) is: 
>  mca: base: components_open: component pml / cm open function failed
> 
> However, I still see the "Hello, world" output indicating that the program 
> ran to completion.
> 
> Here is also another command that does NOT work:
> 
>   mpirun -np 2 --hostfile ~/hostfile -mca pml cm -mca btl_openib_if_exclude cxgb3_0 ./mpitest
> 
> The error is: (from the root node)
> PML cm cannot be selected
> 
> However, this time, I see no output from the program, indicating it did not 
> run.
> 
> The following command also fails in a similar way:
>   mpirun -np 2 --hostfile ~/hostfile -mca pml cm -mca mtl psm -mca btl_openib_if_exclude cxgb3_0 ./mpitest
> 
> I have verified that infinipath-psm is installed on both nodes. Both nodes 
> run identical CentOS 7 and the libraries were installed from the CentOS 
> repositories (i.e. were not compiled from source)
> 
> Both nodes run OMPI 1.10.2, compiled from the source RPM.
> 
> What am I doing wrong?
> 
> Thanks
> Durga
> 
> 
> 
> 
> Life is complex. It has real and imaginary parts.
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2016/03/28725.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
