Re: [OMPI users] OpenMPI + InfiniBand

2016-12-26 Thread Sergei Hrushev
Hi Gilles!

> this looks like a very different issue, orted cannot be remotely started.
> ...
> a better option (as long as you do not plan to relocate Open MPI install
> dir) is to configure with
>
> --enable-mpirun-prefix-by-default

Yes, that was the problem with orted. I checked PATH and L
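For anyone hitting the same orted startup failure, here is a minimal sketch of checking whether an executable is visible through the PATH of the current shell. The helper name `in_path` is hypothetical (it is not part of Open MPI) and it relies only on the POSIX `command -v` builtin.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): succeeds if the named
# executable can be found through the current PATH.
in_path() {
    command -v "$1" >/dev/null 2>&1
}

# Example: report whether orted is visible in this shell's PATH.
if in_path orted; then
    echo "orted found: $(command -v orted)"
else
    echo "orted not found in PATH"
fi
```

Saved as, say, check_orted.sh (a placeholder name), it can be fed to a compute node with `ssh node2 sh -s < check_orted.sh` to see what a non-interactive remote shell finds, which may differ from an interactive login.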

Re: [OMPI users] OpenMPI + InfiniBand

2016-12-22 Thread Sergei Hrushev
Hi All! As there has been no progress on the "UDCM + IPoIB" problem since my previous post, we installed IPoIB on the cluster, and the "No OpenFabrics connection..." error no longer appears. But now OpenMPI reports another problem. In the app's ERROR OUTPUT stream: [node2:14142] [[37935,0],0]

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-02 Thread Sergei Hrushev
Hi Nathan!

> UDCM does not require IPoIB. It should be working for you. Can you build
> Open MPI with --enable-debug and run with -mca btl_base_verbose 100 and
> create a gist with the output.

Ok, done: https://gist.github.com/hsa-online/30bb27a90bb7b225b233cc2af11b3942

Best regards, Sergei.
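A small sketch of the run Nathan asked for, under the assumption that the test binary is called ./mpi_test and is started with 2 processes (both are placeholders). The helper `verbose_mpirun_cmd` is hypothetical; it only prints the command line so that it can be reviewed before being run on the cluster.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not execute) an mpirun command line
# with the verbose BTL setting requested in the message above.
verbose_mpirun_cmd() {
    np="$1"
    shift
    printf 'mpirun -np %s -mca btl_base_verbose 100' "$np"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Placeholder binary name and process count.
verbose_mpirun_cmd 2 ./mpi_test
```

When running the printed command for real, appending `2>&1 | tee btl_verbose.log` (log name is arbitrary) keeps a copy of the output that can be pasted into a gist.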

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
> I actually just filed a GitHub issue to ask this exact question:
>
> https://github.com/open-mpi/ompi/issues/2326

Good idea, thanks!

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
> I haven't worked with InfiniBand for years, but I do believe that yes: you
> need IPoIB enabled on your IB devices to get the RDMA CM support to work.

Yes, I also saw that RDMA CM requires IP, but in my case OpenMPI reports that UD CM can't be used either. Does it also require IPoIB? Is it pos

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
Hi John!

I'm experimenting now with a head node and a single compute node; the rest of the cluster is switched off.

> can you run :
>
> ibhosts

# ibhosts
Ca : 0x7cfe900300bddec0 ports 1 "MT25408 ConnectX Mellanox Technologies"
Ca : 0xe41d2d030050caf0 ports 1 "MT25408 ConnectX Mellanox T

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-31 Thread Sergei Hrushev
Hi Jeff!

> What does "ompi_info | grep openib" show?

$ ompi_info | grep openib
MCA btl: openib (MCA v2.0.0, API v2.0.0, Component v1.10.2)

> Additionally, Mellanox provides alternate support through their MXM
> libraries, if you want to try that.

Yes, I know. But we already h
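The same check can be scripted. Below is a minimal sketch, assuming the component line has the form shown in the ompi_info output above; the helper name `has_openib_btl` is hypothetical. Note that a match only shows the openib BTL was built, not that it is actually selected at run time.

```shell
#!/bin/sh
# Hypothetical helper: reads ompi_info-style output on stdin and succeeds
# only if an openib BTL component line is present.
has_openib_btl() {
    grep -q 'MCA btl: openib'
}

# Sample line copied from the ompi_info output quoted in this message.
sample='MCA btl: openib (MCA v2.0.0, API v2.0.0, Component v1.10.2)'
if printf '%s\n' "$sample" | has_openib_btl; then
    echo "openib BTL is built in"
fi
```

On a node with Open MPI installed, the real check would be `ompi_info | has_openib_btl`.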

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-30 Thread Sergei Hrushev
Hi Gilles!

> is there any reason why you configure with --with-verbs-libdir=/usr/lib ?
> as far as i understand, --with-verbs should be enough, and /usr/lib
> nor /usr/local/lib should ever be used in the configure command line
> (and btw, are you running on a 32 bits system ? should the 64 bits

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-30 Thread Sergei Hrushev
> Sorry - shoot down my idea. Over to someone else (me hides head in shame)

No problem, thanks for trying!

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-28 Thread Sergei Hrushev
> Sergei, what does the command "ibv_devinfo" return please?
>
> I had a recent case like this, but on Qlogic hardware.
> Sorry if I am mixing things up.

The output of ibv_devinfo from the cluster's 1st node is:

$ ibv_devinfo -d mlx4_0
hca_id: mlx4_0
transport: Inf

[OMPI users] OpenMPI + InfiniBand

2016-10-28 Thread Sergei Hrushev
Hello, All!

We have a problem with OpenMPI version 1.10.2 on a cluster with newly installed Mellanox InfiniBand adapters. OpenMPI was re-configured and re-compiled using:

--with-verbs --with-verbs-libdir=/usr/lib

Our test MPI task returns correct results, but it seems OpenMPI continues to use