Re: [OMPI users] ucx problems

2022-08-31 Thread Jeff Squyres (jsquyres) via users
> Yeah, that appears to have been the issue - IB is entirely dead (it's a new machine, so maybe no subnet manager …

Re: [OMPI users] ucx problems

2022-08-25 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
Yeah, that appears to have been the issue - IB is entirely dead (it's a new machine, so maybe no subnet manager, or maybe a bad cable). I'll track that down, and follow up here if there's still an issue once the low-level IB problem is fixed. However, given that ucx says it supports shared memory …

Re: [OMPI users] ucx problems

2022-08-24 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
Here is more information with higher verbosity: > mpirun -np 2 --mca pml ucx --mca osc ucx --bind-to core --map-by core > --rank-by core --mca pml_ucx_verbose 100 --mca osc_ucx_verbose 100 --mca > bml_base_verbose 100 mpi_executable [tin2:1137672] mca: base: components_register: registering framework …
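As an aside for anyone reproducing this, the verbosity knobs follow Open MPI's MCA naming pattern `<framework>_<component>_verbose` (hence `pml_ucx_verbose` and `osc_ucx_verbose`). A minimal sketch that rebuilds the launch line from the message above and only echoes it, so the flag spelling can be checked without an MPI install (`mpi_executable` is the poster's placeholder binary name):

```shell
# Sketch only: assemble the diagnostic mpirun command as a string.
# The verbose flags follow the MCA pattern <framework>_<component>_verbose;
# "mpi_executable" is a stand-in for the real binary.
CMD="mpirun -np 2 --mca pml ucx --mca osc ucx \
--bind-to core --map-by core --rank-by core \
--mca pml_ucx_verbose 100 --mca osc_ucx_verbose 100 \
--mca bml_base_verbose 100 mpi_executable"
echo "$CMD"
```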

[OMPI users] ucx problems

2022-08-24 Thread Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) via users
Hi all - I'm trying to get openmpi with ucx working on a new Rocky Linux 8 + OpenHPC machine. I'm used to running with mpirun --mca pml ucx --mca osc ucx --mca btl ^vader,tcp,openib --bind-to core --map-by core --rank-by core However, now it complains that it can't start the pml, with the message …
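For readers unfamiliar with the syntax in the command above: a leading `^` in an MCA selection list means "every component except the listed ones", so this `btl` setting disables vader, tcp, and openib while `pml` and `osc` are pinned to ucx. A hedged sketch that echoes the launch line rather than running it (no MPI install needed; `./a.out` is a hypothetical stand-in for the application binary):

```shell
# Sketch only: "^" in an MCA selection list excludes the named components,
# so btl here is "anything but vader, tcp, openib". "./a.out" is a
# placeholder application binary.
LAUNCH="mpirun --mca pml ucx --mca osc ucx --mca btl ^vader,tcp,openib \
--bind-to core --map-by core --rank-by core ./a.out"
echo "$LAUNCH"
```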