Hi Gilles!
> this looks like a very different issue, orted cannot be remotely started.
> ...
>
> a better option (as long as you do not plan to relocate Open MPI install
> dir) is to configure with
>
> --enable-mpirun-prefix-by-default
>
Yes, that was a problem with orted.
I checked PATH and L
Hi All !
As there have been no positive changes with the "UDSM + IPoIB" problem since my
previous post,
we installed IPoIB on the cluster, and the "No OpenFabrics connection..." error
no longer appears.
But now OpenMPI reports another problem:
In app ERROR OUTPUT stream:
[node2:14142] [[37935,0],0]
Hi Nathan!
> UDCM does not require IPoIB. It should be working for you. Can you build
> Open MPI with --enable-debug, run with -mca btl_base_verbose 100, and
> create a gist with the output?
>
>
Ok, done:
https://gist.github.com/hsa-online/30bb27a90bb7b225b233cc2af11b3942
Best regards,
Sergei.
>
> I actually just filed a Github issue to ask this exact question:
>
> https://github.com/open-mpi/ompi/issues/2326
>
>
Good idea, thanks!
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>
> I haven't worked with InfiniBand for years, but I do believe that yes: you
> need IPoIB enabled on your IB devices to get the RDMA CM support to work.
>
>
Yes, I also saw that RDMA CM requires IP, but in my case OpenMPI reports
that UD CM can't be used either.
Does it also require IPoIB?
Is it pos
Hi John !
I'm experimenting now with a head node and a single compute node; the
rest of the cluster is switched off.
> can you run:
>
> ibhosts
>
# ibhosts
Ca : 0x7cfe900300bddec0 ports 1 "MT25408 ConnectX Mellanox
Technologies"
Ca : 0xe41d2d030050caf0 ports 1 "MT25408 ConnectX Mellanox
T
Hi Jeff !
> What does "ompi_info | grep openib" show?
>
>
$ ompi_info | grep openib
MCA btl: openib (MCA v2.0.0, API v2.0.0, Component v1.10.2)
> Additionally, Mellanox provides alternate support through their MXM
> libraries, if you want to try that.
>
Yes, I know.
But we already h
Hi Gilles!
> is there any reason why you configure with --with-verbs-libdir=/usr/lib ?
> as far as i understand, --with-verbs should be enough, and /usr/lib
> nor /usr/local/lib should ever be used in the configure command line
> (and btw, are you running on a 32 bits system ? should the 64 bits
>
> Sorry - shoot down my idea. Over to someone else (me hides head in shame)
>
>
No problem, thanks for trying!
>
> Sergei, what does the command "ibv_devinfo" return please?
>
> I had a recent case like this, but on Qlogic hardware.
> Sorry if I am mixing things up.
>
>
The output of ibv_devinfo from the cluster's first node is:
$ ibv_devinfo -d mlx4_0
hca_id: mlx4_0
transport: Inf
Hello, All !
We have a problem with OpenMPI version 1.10.2 on a cluster with newly
installed Mellanox InfiniBand adapters.
OpenMPI was re-configured and re-compiled using: --with-verbs
--with-verbs-libdir=/usr/lib
Our test MPI task returns correct results, but it seems OpenMPI continues
to use