Re: [OMPI users] OpenMPI-4.0.5 and MPI_spawn

2021-08-26 Thread Broi, Franco via users
Any chance of RPMs for CentOS 8 for newer versions?

From: users on behalf of Gilles Gouaillardet via users
Sent: 26 August 2021 14:47
To: Broi, Franco via users
Cc: Gilles Gouaillardet
Subject: Re: [OMPI users] OpenMPI-4.0.5 and MPI_spawn

Indeed ... I am

Re: [OMPI users] OpenMPI-4.0.5 and MPI_spawn

2021-08-26 Thread Gilles Gouaillardet via users
Indeed ... I am not 100% sure the two errors are unrelated, but anyway, that example passes with Open MPI 4.0.1 and 4.0.6 and crashes with the versions in between. It also passes with the 4.1 and master branches. Bottom line: upgrade Open MPI to the latest version and you should be fine.

Re: [OMPI users] OpenMPI-4.0.5 and MPI_spawn

2021-08-25 Thread Broi, Franco via users
Thanks Gilles but no go...

/usr/lib64/openmpi/bin/mpirun -c 1 --mca pml ^ucx /home/franco/spawn_example 47

I'm the parent on fsc07
Starting 47 children

  Process 1 ([[48649,2],32]) is on host: fsc08
  Process 2 ([[48649,1],0]) is on host: unknown!
  BTLs attempted: vader tcp self

Your MPI job

Re: [OMPI users] OpenMPI-4.0.5 and MPI_spawn

2021-08-25 Thread Gilles Gouaillardet via users
Franco, I am surprised UCX gets selected since there is no InfiniBand network. There used to be a bug that led UCX to be selected on shm/tcp systems, but it has been fixed. You might want to try the latest versions of Open MPI (4.0.6 or 4.1.1). Meanwhile, try mpirun --mca pml ^ucx

[OMPI users] OpenMPI-4.0.5 and MPI_spawn

2021-08-25 Thread Broi, Franco via users
Hi, I have 2 example progs that I found on the internet (attached) that illustrate a problem we are having launching multi-node jobs with OpenMPI-4.0.5 and MPI_spawn.

CentOS Linux release 8.4.2105
openmpi-4.0.5-3.el8.x86_64
Slurm 20.11.8
10Gbit ethernet network, no IB or other networks

I