Thanks Gilles but no go...
/usr/lib64/openmpi/bin/mpirun -c 1 --mca pml ^ucx /home/franco/spawn_example 47
I'm the parent on fsc07
Starting 47 children
Process 1 ([[48649,2],32]) is on host: fsc08
Process 2 ([[48649,1],0]) is on host: unknown!
BTLs attempted: vader tcp self
Your MPI job
Franco,
I am surprised UCX gets selected, since there is no InfiniBand network.
There used to be a bug that led UCX to be selected on shm/tcp systems, but
it has been fixed. You might want to try one of the latest versions of
Open MPI (4.0.6 or 4.1.1).
Meanwhile, try:
mpirun --mca pml ^ucx
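For anyone who wants the same setting without editing every command line, here is a minimal sketch using Open MPI's OMPI_MCA_<param> environment variable convention. It assumes a POSIX-style shell. The caret is quoted because some shells treat it specially. Whether the variable reaches every node depends on how the job is launched, so treat this as a starting point rather than a confirmed fix for the spawn failure.

```shell
# Exclude the UCX PML for every mpirun started from this shell.
# Intended as the environment-variable form of "--mca pml ^ucx".
export OMPI_MCA_pml='^ucx'

# Print the value back to confirm what child processes will inherit.
printf 'OMPI_MCA_pml=%s\n' "$OMPI_MCA_pml"
```

With the variable exported, the mpirun command can be run without the --mca option; an explicit --mca pml on the command line should still take precedence.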
Hi,
I have two example programs that I found on the internet (attached) that
illustrate a problem we are having launching multi-node jobs with
Open MPI 4.0.5 and MPI_Comm_spawn.
CentOS Linux release 8.4.2105
openmpi-4.0.5-3.el8.x86_64
Slurm 20.11.8
10 Gbit Ethernet network, no IB or other networks
I