Any chance of rpms for CentOS 8 for newer versions?
From: users on behalf of Gilles Gouaillardet
via users
Sent: 26 August 2021 14:47
To: Broi, Franco via users
Cc: Gilles Gouaillardet
Subject: Re: [OMPI users] OpenMPI-4.0.5 and MPI_spawn
Indeed ...
I am not 100% sure the two errors are unrelated, but anyway,
that example passes with Open MPI 4.0.1 and 4.0.6 and crashes with the
versions in between.
It also passes with the 4.1 and master branches.
Bottom line: upgrade Open MPI to the latest version and you should be fine.
Thanks Gilles but no go...
/usr/lib64/openmpi/bin/mpirun -c 1 --mca pml ^ucx /home/franco/spawn_example 47
I'm the parent on fsc07
Starting 47 children
Process 1 ([[48649,2],32]) is on host: fsc08
Process 2 ([[48649,1],0]) is on host: unknown!
BTLs attempted: vader tcp self
Your MPI job
Franco,
I am surprised UCX gets selected since there is no InfiniBand network.
There used to be a bug that led UCX to be selected on shm/tcp systems, but
it has been fixed. You might want to try the latest versions of
Open MPI (4.0.6 or 4.1.1).
Meanwhile, try
mpirun --mca pml ^ucx
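
For reference, the same exclusion can probably also be applied through the
environment, since Open MPI reads MCA parameters from variables named
OMPI_MCA_<param>. The following is only a minimal sketch, not something taken
from the thread: the program path ./spawn_example and the fallback message are
assumptions for illustration, and the launch is skipped when mpirun or the
program is not available.

```shell
#!/bin/sh
# Exclude the UCX PML via the environment instead of the command line.
# Open MPI picks up MCA parameters from OMPI_MCA_<param> variables.
# The single quotes stop shells that treat ^ specially from interpreting it.
export OMPI_MCA_pml='^ucx'

# Hypothetical path: replace with the location of your own MPI program.
prog=./spawn_example

if command -v mpirun >/dev/null 2>&1 && [ -x "$prog" ]; then
    # Same launch as in the thread, without the explicit --mca option.
    mpirun -c 1 "$prog" 47
else
    echo "skipping launch; OMPI_MCA_pml=$OMPI_MCA_pml"
fi
```

Setting the variable once in a job script avoids having to repeat the option
on every mpirun line.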
Hi,
I have 2 example progs that I found on the internet (attached) that illustrate
a problem we are having launching multi-node jobs with OpenMPI-4.0.5 and
MPI_spawn.
CentOS Linux release 8.4.2105
openmpi-4.0.5-3.el8.x86_64
Slurm 20.11.8
10 Gbit Ethernet network, no IB or other networks
I