Ubuntu or RHEL will be the same, but you definitely need Mellanox OFED. You 
need the verbs libraries and the tools (like mlnx_qos) to enable the RoCE 
configuration on the NICs, and those only get distributed with OFED.
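As a rough sketch, the kind of per-port setup mlnx_qos drives looks like this (eth0 is a placeholder for your Mellanox port name; the trust and PFC values assume the DSCP-based, priority-3 scheme described further down the thread — check your fabric's policy before applying):

```shell
# Trust DSCP markings on the port (eth0 is a placeholder interface name)
mlnx_qos -i eth0 --trust dscp
# Enable priority flow control on priority 3 only
mlnx_qos -i eth0 --pfc 0,0,0,1,0,0,0,0
# Show the resulting QoS state to verify
mlnx_qos -i eth0
```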

UCX gets distributed as part of OFED, but you can also install it yourself.

We have enabled UCX on every OpenMPI since 1.10. We currently run a mix of 
3.1.4 and 4.1.1 on our HPC.
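For what it's worth, building Open MPI against UCX and selecting it at run time is typically just the following (the install paths and application name are placeholders, not something from this thread):

```shell
# Point configure at the UCX install (prefix/path are assumptions; adjust to your system)
./configure --prefix=/opt/openmpi --with-ucx=/usr
make -j && make install
# Ask Open MPI to use the UCX PML explicitly at run time
mpirun --mca pml ucx -np 2 ./my_mpi_app
```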

Sean
________________________________
From: Harutyun Umrshatyan <harutyun...@grovf.com>
Sent: Sunday, 4 September 2022 16:22
To: Sean Crosby <scro...@unimelb.edu.au>
Cc: users@lists.open-mpi.org <users@lists.open-mpi.org>
Subject: Re: [EXT] [OMPI users] MPI with RoCE

External email: Please exercise caution

________________________________
Dear Sean,

You gave me a lot of info!
Now I am going to set up RHEL7 with Mellanox OFED to test it. Previously I had 
set it up on Ubuntu without Mellanox OFED. Do you think that might cause issues?
Also, please let me know which Open MPI version you used. Do I understand 
correctly that UCX is installed and configured separately, and then Open MPI is 
configured to use it?

Thanks again for your help!
Harutyun

On Sun, Sep 4, 2022, 09:20 Sean Crosby 
<scro...@unimelb.edu.au<mailto:scro...@unimelb.edu.au>> wrote:
Hi Harutyun,

We use RoCE v2 using OpenMPI on our cluster, and it works great. We used to use 
the openib BTL, but have moved completely across to UCX.

You have to configure RoCE on your switches and NICs (we use a mixture of 
Mellanox CX-4, CX-5 and CX-6 NICs, with Mellanox switches running Cumulus). We 
use DSCP and priority 3 for RoCE traffic tagging, and all our nodes run 
Mellanox OFED on RHEL7.

Once RoCE is configured and tested (using things like ib_send_bw -d mlx5_bond_0 
-x 7 -R -T 106  -D 10), getting UCX to use RoCE is quite easy, and compiling 
OpenMPI to use UCX is also very easy.
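As a sanity check before involving MPI, ucx_info can confirm that UCX sees the RoCE device (the device name mlx5_bond_0 matches the ib_send_bw example above; yours may differ):

```shell
# List the transports/devices UCX has detected; look for your mlx5 device
ucx_info -d | grep -A 2 mlx5
# Optionally restrict UCX to the RoCE port while testing (port 1 assumed)
export UCX_NET_DEVICES=mlx5_bond_0:1
```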

Sean
________________________________
From: users 
<users-boun...@lists.open-mpi.org<mailto:users-boun...@lists.open-mpi.org>> on 
behalf of Harutyun Umrshatyan via users 
<users@lists.open-mpi.org<mailto:users@lists.open-mpi.org>>
Sent: Sunday, 4 September 2022 04:28
To: users@lists.open-mpi.org<mailto:users@lists.open-mpi.org> 
<users@lists.open-mpi.org<mailto:users@lists.open-mpi.org>>
Cc: Harutyun Umrshatyan <harutyun...@grovf.com<mailto:harutyun...@grovf.com>>
Subject: [EXT] [OMPI users] MPI with RoCE

________________________________
Hi everyone

Could someone please share any experience using MPI with RoCE ?
I am trying to set up infiniband adapters (Mellanox cards for example) and run 
MPI applications with RoCE (Instead of TCP).
As I understand, there might be some environment requirements or restrictions 
like kernel version, installed drivers, etc.
I have tried a lot of versions of mpi libs and could not succeed. Would highly 
appreciate any hint or experience shared.

Best regards,
Harutyun Umrshatyan
