Thanks for the hint on "mpirun ldd". I will try it. The problem is that I am 
running on the cloud and it is trickier to get into a node at run time, or save 
information to be retrieved later.

Sorry for my ignorance on mca stuff, but what exactly would be the suggested 
mpirun command line options for coll/tuned?

Cheers,

Ernesto.

From: users <users-boun...@lists.open-mpi.org> On Behalf Of Gilles Gouaillardet 
via users
Sent: Monday, March 14, 2022 2:22 AM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
Subject: Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning value 15

Ernesto,

you can
mpirun ldd <your binary>

and double check it uses the library you expect.


you might want to try adapting your trick to use Open MPI 4.1.2 with your 
binary built with Open MPI 4.0.3 and see how it goes.
i'd try disabling coll/tuned first though.


Keep in mind PETSc might call MPI_Allreduce under the hood with matching but 
different signatures.


Cheers,

Gilles

On Mon, Mar 14, 2022 at 4:09 PM Ernesto Prudencio via users 
<users@lists.open-mpi.org> wrote:
Thanks, Gilles.

In the case of the application I am working on, all ranks call MPI with the 
same signature / types of variables.

I do not think there is a code error anywhere. I think this is "just" a 
configuration error on my part.

Regarding the idea of changing just one item at a time: that would be the next 
step, but first I would like to check my suspicion that the presence of both 
"/opt/openmpi_4.0.3" and "/appl-third-parties/openmpi-4.1.2" at run time could 
be an issue:

  *   It is an issue in situation 2, when I explicitly point the runtime MPI to 
be 4.1.2 (also used in compilation)
  *   It is not an issue in situation 3, when I explicitly point the runtime 
MPI to be 4.0.3 compiled with INTEL (even though I compiled the application and 
openmpi 4.1.2 with GNU, and I link the application with openmpi 4.1.2)
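
One way to check this suspicion from inside the application is to list which Open MPI directories appear in the search path each rank actually sees. The following is only a minimal C++ sketch: the helper names splitSearchPath and distinctEntriesContaining are hypothetical, and matching on the substring "openmpi" is an assumption based on the two install paths quoted above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Split a colon-separated search path (PATH, LD_LIBRARY_PATH) into entries.
// Empty entries (from "::" or a leading/trailing ':') are kept on purpose.
std::vector<std::string> splitSearchPath(const std::string& value) {
    std::vector<std::string> entries;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type colon = value.find(':', start);
        if (colon == std::string::npos) {
            entries.push_back(value.substr(start));
            break;
        }
        entries.push_back(value.substr(start, colon - start));
        start = colon + 1;
    }
    return entries;
}

// Return, in order of appearance and without duplicates, the entries that
// contain `needle` (assumption: "openmpi" identifies an Open MPI install).
std::vector<std::string> distinctEntriesContaining(const std::string& value,
                                                   const std::string& needle) {
    std::vector<std::string> matches;
    for (const std::string& entry : splitSearchPath(value)) {
        if (entry.find(needle) == std::string::npos) continue;
        if (std::find(matches.begin(), matches.end(), entry) == matches.end())
            matches.push_back(entry);
    }
    return matches;
}
```

Each rank could pass the value of LD_LIBRARY_PATH (after checking that std::getenv did not return a null pointer) and print the result; more than one distinct entry would mean both installations are visible to the dynamic loader. This only inspects the variable; "mpirun ldd" remains the way to see which library is actually resolved.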

Best,

Ernesto.

From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
Sent: Monday, March 14, 2022 1:37 AM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Ernesto Prudencio <epruden...@slb.com>
Subject: Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning value 15

Ernesto,

the coll/tuned module (that should handle collective subroutines by default) 
has a known issue when matching but non-identical signatures are used:
for example, one rank uses one vector of n bytes, and another rank uses n 
bytes.
Is there a chance your application might use this pattern?

You can try disabling this component with
mpirun --mca coll ^tuned ...


I noted that between the successful a) case and the unsuccessful b) case, you 
changed 3 parameters:
 - compiler vendor
 - Open MPI version
 - PETSc version
so at this stage, it is not obvious which should be blamed for the failure.


In order to get a better picture, I would first try
 - Intel compilers
 - Open MPI 4.1.2
 - PETSc 3.10.4

=> a failure would suggest a regression in Open MPI

And then
 - Intel compilers
 - Open MPI 4.0.3
 - PETSc 3.16.5

=> a failure would suggest either a regression in PETSc, or PETSc doing 
something different but legit that exposes a bug in Open MPI.

If you have time, you can also try
 - Intel compilers
 - MPICH (or a derivative such as Intel MPI)
 - PETSc 3.16.5

=> a success would strongly point to Open MPI


Cheers,

Gilles

On Mon, Mar 14, 2022 at 2:56 PM Ernesto Prudencio via users 
<users@lists.open-mpi.org> wrote:
Forgot to mention that in all 3 situations, mpirun is called as follows (35 
nodes, 4 MPI ranks per node):

mpirun -x LD_LIBRARY_PATH=:<PATH1>:<PATH2>:... -hostfile /tmp/hostfile.txt -np 140 -npernode 4 --mca btl_tcp_if_include eth0 <APPLICATION_PATH> <APPLICATION OPTIONS>

So I have a third question: 3) Should I add some extra option to the mpirun 
command line in order to make situation 2 successful?

Thanks,

Ernesto.



Schlumberger-Private


From: users <users-boun...@lists.open-mpi.org> On Behalf Of Ernesto Prudencio via users
Sent: Monday, March 14, 2022 12:39 AM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Ernesto Prudencio <epruden...@slb.com>
Subject: Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning value 15

Thank you for the quick answer, George. I wanted to investigate the problem 
further before replying.

Below I show 3 situations of my C++ (and Fortran) application, which runs on 
top of PETSc, OpenMPI, and MKL. All 3 situations use MKL 2019.0.5 compiled with 
INTEL.

At the end, I have 2 questions.

Note: all codes are compiled on one set of nodes, and the execution 
happens on _another_ set of nodes.

+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Situation 1) It has been successful for months now:

a) Use INTEL compilers for OpenMPI 4.0.3, PETSc 3.10.4 , and application. The 
configuration options for OpenMPI are:

'--with-flux-pmi=no' '--enable-orterun-prefix-by-default' 
'--prefix=/mnt/disks/intel-2018-3-222-blade-runtime-env-2018-1-07-08-2018-132838/openmpi_4.0.3_intel2019.5_gcc7.3.1'
 'FC=ifort' 'CC=gcc'

b) At run time, each MPI rank prints this info:

PATH = 
/opt/openmpi_4.0.3/bin:/opt/openmpi_4.0.3/bin:/opt/openmpi_4.0.3/bin:/opt/rh/devtoolset-7/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

LD_LIBRARY_PATH  = 
/opt/openmpi_4.0.3/lib::/opt/rh/devtoolset-7/root/usr/lib64:/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7:/opt/petsc/lib:/opt/2019.5/compilers_and_libraries/linux/mkl/lib/intel64:/opt/openmpi_4.0.3/lib:/lib64:/lib:/usr/lib64:/usr/lib

MPI version (compile time)   = 4.0.3
MPI_Get_library_version()    = Open MPI v4.0.3, package: Open MPI root@<STRING1> Distribution, ident: 4.0.3, repo rev: v4.0.3, Mar 03, 2020
PETSc version (compile time) = 3.10.4

c) A test of 20 minutes with 14 nodes, 4 MPI ranks per node, runs ok.

d) A test of 2 hours with 35 nodes, 4 MPI ranks per node, runs ok.

+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Situation 2) This situation is the one failing during execution.

a) Use GNU compilers for OpenMPI 4.1.2, PETSc 3.16.5 , and application. The 
configuration options for OpenMPI are:

'--with-flux-pmi=no' '--prefix=/appl-third-parties/openmpi-4.1.2' 
'--enable-orterun-prefix-by-default'

b) At run time, each MPI rank prints this info:

PATH  = 
/appl-third-parties/openmpi-4.1.2/bin:/appl-third-parties/openmpi-4.1.2/bin:/appl-third-parties/openmpi-4.1.2/bin:/opt/rh/devtoolset-7/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

LD_LIBRARY_PATH = 
/appl-third-parties/openmpi-4.1.2/lib::/opt/rh/devtoolset-7/root/usr/lib64:/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7:/appl-third-parties/petsc-3.16.5/lib:/opt/2019.5/compilers_and_libraries/linux/mkl/lib/intel64:/appl-third-parties/openmpi-4.1.2/lib:/lib64:/lib:/usr/lib64:/usr/lib

MPI version (compile time)   = 4.1.2
MPI_Get_library_version()    = Open MPI v4.1.2, package: Open MPI root@<STRING2> Distribution, ident: 4.1.2, repo rev: v4.1.2, Nov 24, 2021
PETSc version (compile time) = 3.16.5
PetscGetVersion()            = Petsc Release Version 3.16.5, Mar 04, 2022
PetscGetVersionNumber()      = 3.16.5

c)  Same as (1.c)

d) Test with 35 nodes fails:
d.1) The very first MPI call is an MPI_Allreduce() with MPI_MAX op: it returns 
the right values only to rank 0, while all other ranks get value 0. The routine 
returns MPI_SUCCESS, though.
d.2) The second MPI call is an MPI_Allreduce() with MPI_SUM op: again, it 
returns the right values only to rank 0, while all other ranks get wrong values 
(mostly 0). The routine also returns MPI_SUCCESS, though.
d.3) The third MPI call is an MPI_Allreduce() with MPI_MIN op: it returns 15 = 
MPI_ERR_TRUNCATE. This is the error reported in my first e-mail of March 9.

+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Situation 3) Runs ok!!!

a) Same as (2.a), that is, I continue to compile everything with GNU.

b) At run time, I only change the path of MPI to point to the "old" 
/opt/openmpi_4.0.3 compiled with INTEL. Each MPI rank prints this info:

PATH = 
/opt/openmpi_4.0.3/bin:/opt/openmpi_4.0.3/bin:/opt/openmpi_4.0.3/bin:/opt/rh/devtoolset-7/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

LD_LIBRARY_PATH = 
/opt/openmpi_4.0.3/lib::/opt/rh/devtoolset-7/root/usr/lib64:/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7:/appl-third-parties/petsc-3.16.5/lib:/opt/2019.5/compilers_and_libraries/linux/mkl/lib/intel64:/opt/openmpi_4.0.3/lib:/lib64:/lib:/lib64:/lib:/usr/lib64:/usr/lib

MPI version (compile time)   = 4.1.2
MPI_Get_library_version()    = Open MPI v4.0.3, package: Open MPI root@<STRING1> Distribution, ident: 4.0.3, repo rev: v4.0.3, Mar 03, 2020
PETSc version (compile time) = 3.16.5 (my observation here: this PETSc was compiled using OpenMPI 4.1.2)
PetscGetVersion()            = Petsc Release Version 3.16.5, Mar 04, 2022
PetscGetVersionNumber()      = 3.16.5

c) Same as (1.c)

d) Same as (1.d)

+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Note: at run time, the nodes have both OpenMPI installations available (4.0.3 
compiled with INTEL, and 4.1.2 compiled with GNU). That is why I can apply the 
"trick" of situation 3 above.
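
The mismatch visible in situation 3 (compile time 4.1.2, run time 4.0.3) can also be detected by the application itself. Below is a minimal C++ sketch, assuming the string returned by MPI_Get_library_version() keeps the "Open MPI vX.Y.Z, ..." shape printed above; the helper names extractOpenMpiVersion and runtimeMatchesCompileTime are hypothetical, and the compile-time version is passed in as a plain string however the application obtains it.

```cpp
#include <string>

// Extract "X.Y.Z" from text shaped like "Open MPI vX.Y.Z, package: ...".
// Returns an empty string when the text does not start with that prefix
// (for example, when another MPI implementation is loaded).
std::string extractOpenMpiVersion(const std::string& libraryVersion) {
    const std::string prefix = "Open MPI v";
    if (libraryVersion.compare(0, prefix.size(), prefix) != 0) return "";
    // The version is the run of digits and dots right after the prefix.
    const std::string::size_type end =
        libraryVersion.find_first_not_of("0123456789.", prefix.size());
    if (end == std::string::npos) return libraryVersion.substr(prefix.size());
    return libraryVersion.substr(prefix.size(), end - prefix.size());
}

// True only when the run-time library reports exactly the version the
// application was compiled against.
bool runtimeMatchesCompileTime(const std::string& compileTimeVersion,
                               const std::string& libraryVersion) {
    const std::string runtime = extractOpenMpiVersion(libraryVersion);
    return !runtime.empty() && runtime == compileTimeVersion;
}
```

With the strings printed in situation 3, runtimeMatchesCompileTime("4.1.2", ...) would be false, so a rank could log a warning right after MPI initialization. Whether such a mismatch is harmful is a separate question; the check only makes it visible.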

Question 1) Am I missing some configuration option on OpenMPI? I have been 
using the same OpenMPI configuration options as in the stable situation 1.

Question 2) In the failing situation 2, does OpenMPI expect to use some /opt 
path, even though there is no PATH variable mentioning the "old" 
/opt/openmpi_4.0.3? I mean, could the problem be that I am providing the "new" 
OpenMPI 4.1.2 in a path (/appl-third-parties/...) that is NOT /opt?

Thank you,

Ernesto.

From: George Bosilca <bosi...@icl.utk.edu>
Sent: Wednesday, March 9, 2022 1:46 PM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Ernesto Prudencio <epruden...@slb.com>
Subject: [Ext] Re: [OMPI users] Call to MPI_Allreduce() returning value 15

There are two ways the MPI_Allreduce returns MPI_ERR_TRUNCATE:
1. it is propagated from one of the underlying point-to-point communications, 
which means that at least one of the participants has an input buffer with a 
larger size. I know you said the size is fixed, but it only matters if all 
processes are in the same blocking MPI_Allreduce.
2. The code is not SPMD, and one of your processes calls a different 
MPI_Allreduce on the same communicator.

There is no simple way to get more information about this issue. If you have a 
version of OMPI compiled in debug mode, you can increase the verbosity of the 
collective framework to see if you get more interesting information.

George.


On Wed, Mar 9, 2022 at 2:23 PM Ernesto Prudencio via users 
<users@lists.open-mpi.org> wrote:
Hello all,

The very simple code below returns mpiRC = 15.

const std::array< double, 2 > rangeMin { minX, minY };
std::array< double, 2 > rangeTempRecv { 0.0, 0.0 };
int mpiRC = MPI_Allreduce( rangeMin.data(), rangeTempRecv.data(), 
rangeMin.size(), MPI_DOUBLE, MPI_MIN, PETSC_COMM_WORLD );

Some information before my questions:

  1.  The environment in which I am running this code has hundreds of compute 
nodes, each node with 4 MPI ranks.
  2.  It is running in the cloud, so it is tricky to get extra information "on 
the fly".
  3.  I am using OpenMPI 4.1.2 + PETSc 3.16.5 + GNU compilers.
  4.  The error happens consistently at the same point in the execution, at 
ranks 1 and 2 only (out of hundreds of MPI ranks).
  5.  By the time the execution gets to the code above, the execution has 
already called PetscInitialize() and many MPI routines successfully.
  6.  Before the call to MPI_Allreduce() above, the code calls MPI_Barrier(). 
So, all nodes call MPI_Allreduce().
  7.  At https://www.open-mpi.org/doc/current/man3/OpenMPI.3.php it is written 
"MPI_ERR_TRUNCATE          15      Message truncated on receive."
  8.  At https://www.open-mpi.org/doc/v4.1/man3/MPI_Allreduce.3.php, it is 
written "The reduction functions ( MPI_Op ) do not return an error value. As a 
result, if the functions detect an error, all they can do is either call 
MPI_Abort or silently skip the problem. Thus, if you change the error handler 
from MPI_ERRORS_ARE_FATAL to something else, for example, MPI_ERRORS_RETURN , 
then no error may be indicated."

Questions:

  1.  Any ideas for what could be the cause for the return code 15? The code is 
pretty simple and the buffers have fixed size = 2.
  2.  In view of item (8), does it mean that the return code 15 in item (7) 
might not be informative?
  3.  Once I get a return code != MPI_SUCCESS, is there any routine I can call, 
in the application code, to get extra information on MPI?
  4.  Once the application aborts (I throw an exception once a return code is 
!= MPI_SUCCESS), is there some command line I can run on all nodes in order to 
get extra info?

Thank you in advance,

Ernesto.

