Re: [OMPI users] OpenMPI + InfiniBand

2016-12-26 Thread gilles
Sergei, thanks for confirming you are now able to use Open MPI. FWIW, orted is remotely started by the selected plm component: it can be ssh if you run without a batch manager, the tm interface under PBS/Torque, srun under Slurm, etc. That should explain why exporting PATH and LD_LIBRARY_PATH is

Re: [OMPI users] OpenMPI + InfiniBand

2016-12-26 Thread Sergei Hrushev
Hi Gilles! > this looks like a very different issue, orted cannot be remotely started. > ... > > a better option (as long as you do not plan to relocate Open MPI install > dir) is to configure with > > --enable-mpirun-prefix-by-default > Yes, that was the problem with orted. I checked PATH and
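
The configure option quoted above can be sketched as follows. This is a minimal illustration, not the poster's actual build line: the install prefix /opt/openmpi is a hypothetical path, and the helper name configure_cmd is introduced only for this example. The helper just prints the command so it can be reviewed before being run from the Open MPI source tree.

```shell
#!/bin/sh
# Sketch: print a configure command line for a build whose mpirun passes its
# own install prefix to remote nodes by default.
# The prefix used below (/opt/openmpi) is a hypothetical path.

configure_cmd() {
    prefix=$1
    printf './configure --prefix=%s --enable-mpirun-prefix-by-default\n' "$prefix"
}

# Print the command for review; run it from the Open MPI source tree.
configure_cmd /opt/openmpi
```

With such a build, the remote orted should be located through the install prefix rather than through PATH and LD_LIBRARY_PATH exported on each node, which is why it only helps as long as the install directory is the same everywhere and is not relocated.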

Re: [OMPI users] OpenMPI + InfiniBand

2016-12-23 Thread r...@open-mpi.org
Also check to ensure you are using the same version of OMPI on all nodes - this message usually means that a different version was used on at least one node. > On Dec 23, 2016, at 1:58 AM, gil...@rist.or.jp wrote: > > Serguei, > > > this looks like a very different issue, orted cannot be

Re: [OMPI users] OpenMPI + InfiniBand

2016-12-23 Thread gilles
Serguei, this looks like a very different issue: orted cannot be remotely started. That typically occurs if orted cannot find some dependencies (the Open MPI libs and/or the compiler runtime). For example, from a node, ssh orted should not fail because of unresolved dependencies. A simple
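
One way to look for unresolved orted dependencies is sketched below. It assumes ssh access to the remote node and that ldd is available there; the host name node2 and the helper names missing_libs and check_orted are hypothetical. The check goes through a non-interactive ssh session on purpose, since that is the environment an ssh-based launch would see.

```shell
#!/bin/sh
# Sketch: list shared libraries that orted cannot resolve on a remote node.
# Host name and helper names are hypothetical.

# Keep only the unresolved entries ("=> not found") from ldd output on stdin.
missing_libs() {
    grep 'not found' || true
}

# Run ldd against the orted found in the remote, non-interactive PATH.
check_orted() {
    host=$1
    ssh "$host" 'ldd "$(command -v orted)"' | missing_libs
}

# Usage (hypothetical host):
#   check_orted node2
```

Empty output suggests that every library was resolved. An error from ldd itself would suggest that orted is not on the remote PATH at all.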

Re: [OMPI users] OpenMPI + InfiniBand

2016-12-22 Thread Sergei Hrushev
Hi All! As there have been no positive changes with the "UDCM + IPoIB" problem since my previous post, we installed IPoIB on the cluster and the "No OpenFabrics connection..." error no longer appears. But now OpenMPI reports another problem: In app ERROR OUTPUT stream: [node2:14142]

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-02 Thread Sergei Hrushev
Hi Nathan! > UDCM does not require IPoIB. It should be working for you. Can you build > Open MPI with --enable-debug and run with -mca btl_base_verbose 100 and > create a gist with the output. > > Ok, done: https://gist.github.com/hsa-online/30bb27a90bb7b225b233cc2af11b3942 Best regards, Sergei.

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Nathan Hjelm
UDCM does not require IPoIB. It should be working for you. Can you build Open MPI with --enable-debug and run with -mca btl_base_verbose 100 and create a gist with the output. -Nathan On Nov 01, 2016, at 07:50 AM, Sergei Hrushev wrote: I haven't worked with InfiniBand

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
> > I actually just filed a Github issue to ask this exact question: > > https://github.com/open-mpi/ompi/issues/2326 > > Good idea, thanks! ___ users mailing list users@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Jeff Squyres (jsquyres)
I actually just filed a Github issue to ask this exact question: https://github.com/open-mpi/ompi/issues/2326 > On Nov 1, 2016, at 9:49 AM, Sergei Hrushev wrote: > > > I haven't worked with InfiniBand for years, but I do believe that yes: you > need IPoIB enabled on

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
> > > I haven't worked with InfiniBand for years, but I do believe that yes: you > need IPoIB enabled on your IB devices to get the RDMA CM support to work. > > Yes, I also saw that RDMA CM requires IP, but in my case OpenMPI reports that UD CM can't be used either. Does it also require IPoIB? Is it

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Jeff Squyres (jsquyres)
On Nov 1, 2016, at 2:40 AM, Sergei Hrushev wrote: > > Yes, I tried to get this info already. > And I saw in the log that rdmacm wants an IP address on the port. > So my question in the topic-start message was: > Is it enough for OpenMPI to have RDMA only, or should IPoIB also be > installed?

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
Hi John! I'm now experimenting with a head node and a single compute node; the rest of the cluster is switched off. > can you run : > > ibhosts > # ibhosts Ca : 0x7cfe900300bddec0 ports 1 "MT25408 ConnectX Mellanox Technologies" Ca : 0xe41d2d030050caf0 ports 1 "MT25408 ConnectX Mellanox

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread John Hearns via users
Sergei, can you run: ibhosts ibstat ibdiagnet Lord help me for being so naive, but do you have a subnet manager running? On 1 November 2016 at 06:40, Sergei Hrushev wrote: > Hi Jeff ! > > What does "ompi_info | grep openib" show? >> >> > $ ompi_info | grep openib >
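
The diagnostics requested above can be collected in one pass, as in the sketch below. The helper name run_if_present is hypothetical, and sminfo is an addition not named in the message: it is assumed here to be available from the infiniband-diags package as a direct way to query the subnet manager. Some of these tools may need root privileges.

```shell
#!/bin/sh
# Sketch: run the fabric diagnostics named above, skipping any tool that is
# not installed. sminfo is an addition (assumed to come from infiniband-diags).

run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $1 =="
        "$@"
    else
        echo "== $1: not installed, skipped =="
    fi
}

for tool in ibhosts ibstat ibdiagnet sminfo; do
    run_if_present "$tool"
done
```

If no subnet manager is running, ports typically do not reach the Active state in the ibstat output, so that field is worth checking first.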

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
Hi Jeff ! What does "ompi_info | grep openib" show? > > $ ompi_info | grep openib MCA btl: openib (MCA v2.0.0, API v2.0.0, Component v1.10.2) Additionally, Mellanox provides alternate support through their MXM > libraries, if you want to try that. > Yes, I know. But we already

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-31 Thread Jeff Squyres (jsquyres)
What does "ompi_info | grep openib" show? Additionally, Mellanox provides alternate support through their MXM libraries, if you want to try that. If that shows that you have the openib BTL plugin loaded, try running with "mpirun --mca btl_base_verbose 100 ..." That will provide additional
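
The two checks suggested above can be chained, as in the following sketch. The helper name has_openib and the application name ./my_mpi_app are hypothetical; the ompi_info output format it matches is the one quoted elsewhere in this thread ("MCA btl: openib (...)").

```shell
#!/bin/sh
# Sketch: confirm the openib BTL component is present before rerunning the
# job with BTL verbosity turned up. Helper and application names are
# hypothetical.

# Reads `ompi_info` output on stdin; succeeds if an openib BTL line exists.
has_openib() {
    grep -q 'btl: openib'
}

# Usage (./my_mpi_app is a hypothetical application):
#   if ompi_info | has_openib; then
#       mpirun --mca btl_base_verbose 100 -np 2 ./my_mpi_app
#   else
#       echo "openib BTL not built into this Open MPI install" >&2
#   fi
```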

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-30 Thread Sergei Hrushev
Hi Gilles! > is there any reason why you configure with --with-verbs-libdir=/usr/lib ? > as far as i understand, --with-verbs should be enough, and neither /usr/lib > nor /usr/local/lib should ever be used in the configure command line > (and btw, are you running on a 32-bit system ? should the 64-bit

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-30 Thread Sergei Hrushev
> > Sorry - shoot down my idea. Over to someone else (me hides head in shame) > > No problem, thanks for your try!

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-28 Thread Gilles Gouaillardet
Sergei, is there any reason why you configure with --with-verbs-libdir=/usr/lib ? As far as I understand, --with-verbs should be enough, and neither /usr/lib nor /usr/local/lib should ever be used in the configure command line (and btw, are you running on a 32-bit system ? should the 64-bit libs be in

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-28 Thread John Hearns via users
Sorry - shoot down my idea. Over to someone else (me hides head in shame) On 28 October 2016 at 11:28, Sergei Hrushev wrote: > Sergei, what does the command "ibv_devinfo" return please? >> >> I had a recent case like this, but on Qlogic hardware. >> Sorry if I am mixing

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-28 Thread Sergei Hrushev
> > Sergei, what does the command "ibv_devinfo" return please? > > I had a recent case like this, but on Qlogic hardware. > Sorry if I am mixing things up. > > The output of ibv_devinfo from the cluster's first node is: $ ibv_devinfo -d mlx4_0 hca_id: mlx4_0 transport:

Re: [OMPI users] OpenMPI + InfiniBand

2016-10-28 Thread John Hearns via users
Sergei, what does the command "ibv_devinfo" return please? I had a recent case like this, but on Qlogic hardware. Sorry if I am mixing things up. On 28 October 2016 at 10:48, Sergei Hrushev wrote: > Hello, All ! > > We have a problem with OpenMPI version 1.10.2 on a

[OMPI users] OpenMPI + InfiniBand

2016-10-28 Thread Sergei Hrushev
Hello, All ! We have a problem with OpenMPI version 1.10.2 on a cluster with newly installed Mellanox InfiniBand adapters. OpenMPI was re-configured and re-compiled using: --with-verbs --with-verbs-libdir=/usr/lib And our test MPI task returns proper results but it seems OpenMPI continues to use

Re: [OMPI users] openmpi+infiniband

2013-07-31 Thread christian schmitt
Sorry for this. This was a trial-and-error problem. It was a mismatch of OFED versions and kernel updates. I have now installed a fresh CentOS 6.4 (with the default kernel, NO KERNEL UPDATE), then installed the official Mellanox OFED driver and compiled Open MPI (without options). And now it works fine.

Re: [OMPI users] openmpi+infiniband

2013-07-30 Thread christian schmitt
Hello, thank you for this. When I start the MPI test with the option "--mca btl openib,sm,self" I can start it on one node. But I can't start it on two nodes. The error then is: schmitt$ /amd/software/openmpi-1.6.5/cltest/bin/mpirun -n 2 -H cluster1,cluster2

Re: [OMPI users] openmpi+infiniband

2013-07-30 Thread Reuti
On 30.07.2013 at 15:01, christian schmitt wrote: > I'm trying to get openmpi(1.6.5) running with/over infiniband. > My system is a centOS 6.3. I have installed the Mellanox OFED driver > (2.0) and everything seems working. ibhosts shows all hosts and the switch. > A "hca_self_test.ofed" shows: >

Re: [OMPI users] openmpi+infiniband

2013-07-30 Thread Gus Correa
Hi Christian, if I understand you right, you want to use Open MPI with InfiniBand, not Ethernet, right? If that is the case, try '-mca btl openib,sm,self' in your mpiexec command line. I don't think IPoIB is required for Open MPI. See these FAQs (the FAQ is the best Open MPI documentation):
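
The BTL selection suggested above can be assembled as in this sketch. The helper name btl_args and the application name ./my_mpi_app are hypothetical; the host names cluster1 and cluster2 are the ones used elsewhere in this thread. The component names (openib, sm, self) are those given in the message for the 1.6.5 release under discussion.

```shell
#!/bin/sh
# Sketch: build the "--mca btl" argument from a list of BTL component names.
# Helper and application names are hypothetical.

btl_args() {
    # Join the arguments with commas ("$*" uses the first character of IFS).
    old_ifs=$IFS
    IFS=,
    printf '%s %s\n' '--mca btl' "$*"
    IFS=$old_ifs
}

# Usage (left unquoted on purpose so the result splits into three arguments):
#   mpirun $(btl_args openib sm self) -n 2 -H cluster1,cluster2 ./my_mpi_app
```

Listing the BTLs explicitly means a job that cannot use InfiniBand should fail instead of silently falling back to TCP over Ethernet, which makes the misconfiguration visible.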

[OMPI users] openmpi+infiniband

2013-07-30 Thread christian schmitt
Hello, I'm trying to get Open MPI (1.6.5) running over InfiniBand. My system is CentOS 6.3. I have installed the Mellanox OFED driver (2.0) and everything seems to be working. ibhosts shows all hosts and the switch. A "hca_self_test.ofed" shows: Performing Adapter Device Self Test

Re: [OMPI users] [openMPI-infiniband] openMPI in IB network when openSM with LASH is running

2007-11-29 Thread Jeff Squyres
On Nov 29, 2007, at 12:08 AM, Keshetti Mahesh wrote: There is work starting literally right about now to allow Open MPI to use the RDMA CM and/or the IBCM for creating OpenFabrics connections (IB or iWARP). When is this expected to be completed? It is not planned to be released until the

Re: [OMPI users] [openMPI-infiniband] openMPI in IB network when openSM with LASH is running

2007-11-29 Thread Keshetti Mahesh
> There is work starting literally right about now to allow Open MPI to > use the RDMA CM and/or the IBCM for creating OpenFabrics connections > (IB or iWARP). When is this expected to be completed? -Mahesh

Re: [OMPI users] [openMPI-infiniband] openMPI in IB network when openSM with LASH is running

2007-11-28 Thread Jeff Squyres
There is work starting literally right about now to allow Open MPI to use the RDMA CM and/or the IBCM for creating OpenFabrics connections (IB or iWARP). On Nov 28, 2007, at 4:37 AM, Keshetti Mahesh wrote: Has anyone in the list ever tested openMPI in infiniband network in which openSM is

[OMPI users] [openMPI-infiniband] openMPI in IB network when openSM with LASH is running

2007-11-28 Thread Keshetti Mahesh
Has anyone on the list ever tested Open MPI in an InfiniBand network in which OpenSM is running with the LASH routing algorithm enabled? I haven't tested the above case, but I can foresee a problem because the LASH routing algorithm in OpenSM uses virtual lanes (VLs), which are directly mapped to service