in order to exclude the coll/tuned component:

mpirun --mca coll ^tuned ...
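
If setting mpirun options is awkward in your cloud launch scripts, the same
exclusion can also be expressed through the environment, since Open MPI picks
up any OMPI_MCA_<parameter> variable (a minimal sketch, assuming a bash-style
launcher; the rest of your command line stays unchanged):

export OMPI_MCA_coll=^tuned
mpirun -hostfile /tmp/hostfile.txt -np 140 -npernode 4 --mca btl_tcp_if_include eth0 <APPLICATION_PATH> <APPLICATION OPTIONS>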


Cheers,

Gilles

On Mon, Mar 14, 2022 at 5:37 PM Ernesto Prudencio via users <
users@lists.open-mpi.org> wrote:

> Thanks for the hint on “mpirun ldd”. I will try it. The problem is that I
> am running in the cloud, and it is trickier to get into a node at run time
> or to save information to be retrieved later.
>
>
>
> Sorry for my ignorance on MCA stuff, but what exactly would be the
> suggested mpirun command-line options for coll/tuned?
>
>
>
> Cheers,
>
>
>
> Ernesto.
>
>
>
> *From:* users <users-boun...@lists.open-mpi.org> *On Behalf Of *Gilles
> Gouaillardet via users
> *Sent:* Monday, March 14, 2022 2:22 AM
> *To:* Open MPI Users <users@lists.open-mpi.org>
> *Cc:* Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> *Subject:* Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning
> value 15
>
>
>
> Ernesto,
>
>
>
> you can
>
> mpirun ldd <your binary>
>
>
>
> and double check it uses the library you expect.
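>
> For instance, something along these lines (<your binary> is a placeholder as
> above; the grep just trims the output to the MPI library):
>
> mpirun -np 4 sh -c 'ldd <your binary> | grep libmpi'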
>
>
>
>
>
> you might want to try adapting your trick to use Open MPI 4.1.2 with your
> binary built with Open MPI 4.0.3 and see how it goes.
>
> I'd try disabling coll/tuned first, though.
>
>
>
>
>
> Keep in mind PETSc might call MPI_Allreduce under the hood with matching
> but different signatures.
>
>
>
>
>
> Cheers,
>
>
>
> Gilles
>
>
>
> On Mon, Mar 14, 2022 at 4:09 PM Ernesto Prudencio via users <
> users@lists.open-mpi.org> wrote:
>
> Thanks, Gilles.
>
>
>
> In the case of the application I am working on, all ranks call MPI with
> the same signature / types of variables.
>
>
>
> I do not think there is a code error anywhere. I think this is “just” a
> configuration error on my part.
>
>
>
> Regarding the idea of changing just one item at a time: that would be the
> next step, but first I would like to check my suspicion that the presence
> of both “/opt/openmpi_4.0.3” and “/appl-third-parties/openmpi-4.1.2” at run
> time could be an issue:
>
>    - It is an issue in situation 2, when I explicitly point the runtime
>    MPI to 4.1.2 (also used in compilation)
>    - It is not an issue in situation 3, when I explicitly point the
>    runtime MPI to 4.0.3 compiled with INTEL (even though I compiled the
>    application and OpenMPI 4.1.2 with GNU, and I link the application with
>    OpenMPI 4.1.2)
>
>
>
> Best,
>
>
>
> Ernesto.
>
>
>
> *From:* Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> *Sent:* Monday, March 14, 2022 1:37 AM
> *To:* Open MPI Users <users@lists.open-mpi.org>
> *Cc:* Ernesto Prudencio <epruden...@slb.com>
> *Subject:* Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning
> value 15
>
>
>
> Ernesto,
>
>
>
> the coll/tuned module (that should handle collective subroutines by
> default) has a known issue when matching but non identical signatures are
> used:
>
> for example, one rank uses one vector of n bytes, and another rank uses n
> individual bytes.
>
> Is there a chance your application might use this pattern?
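>
> A minimal sketch of that pattern (a toy example, not taken from your
> application: both ranks contribute 4 doubles, but rank 0 describes them as a
> single derived datatype while the others use count 4 of MPI_DOUBLE):
>
> #include <mpi.h>
> #include <array>
>
> int main(int argc, char** argv) {
>   MPI_Init(&argc, &argv);
>   int rank = 0;
>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>
>   std::array< double, 4 > in { 1.0, 2.0, 3.0, 4.0 };
>   std::array< double, 4 > out { };
>
>   if (rank == 0) {
>     // one element of a "4 doubles" datatype: same bytes, different signature
>     MPI_Datatype vec4;
>     MPI_Type_contiguous(4, MPI_DOUBLE, &vec4);
>     MPI_Type_commit(&vec4);
>     MPI_Allreduce(in.data(), out.data(), 1, vec4, MPI_SUM, MPI_COMM_WORLD);
>     MPI_Type_free(&vec4);
>   } else {
>     // 4 elements of MPI_DOUBLE
>     MPI_Allreduce(in.data(), out.data(), 4, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
>   }
>
>   MPI_Finalize();
>   return 0;
> }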
>
>
>
> You can try disabling this component with
>
> mpirun --mca coll ^tuned ...
>
>
>
>
>
> I noted between the successful a) case and the unsuccessful b) case, you
> changed 3 parameters:
>
>  - compiler vendor
>
>  - Open MPI version
>
>  - PETSc version
>
> so at this stage, it is not obvious which should be blamed for the failure.
>
>
>
>
>
> In order to get a better picture, I would first try
>
>  - Intel compilers
>
>  - Open MPI 4.1.2
>
>  - PETSc 3.10.4
>
>
>
> => a failure would suggest a regression in Open MPI
>
>
>
> And then
>
>  - Intel compilers
>
>  - Open MPI 4.0.3
>
>  - PETSc 3.16.5
>
>
>
> => a failure would either suggest a regression in PETSc, or PETSc doing
> something different but legit that evidences a bug in Open MPI.
>
>
>
> If you have time, you can also try
>
>  - Intel compilers
>
>  - MPICH (or a derivative such as Intel MPI)
>
>  - PETSc 3.16.5
>
>
>
> => a success would strongly point to Open MPI
>
>
>
>
>
> Cheers,
>
>
>
> Gilles
>
>
>
> On Mon, Mar 14, 2022 at 2:56 PM Ernesto Prudencio via users <
> users@lists.open-mpi.org> wrote:
>
> Forgot to mention that in all 3 situations, mpirun is called as follows
> (35 nodes, 4 MPI ranks per node):
>
>
>
> mpirun -x LD_LIBRARY_PATH=:<PATH1>:<PATH2>:… -hostfile /tmp/hostfile.txt
> -np 140 -npernode 4 --mca btl_tcp_if_include eth0 <APPLICATION_PATH>
> <APPLICATION OPTIONS>
>
>
>
> So I have a question 3) Should I add some extra option to the mpirun
> command line in order to make situation 2 successful?
>
>
>
> Thanks,
>
>
>
> Ernesto.
>
>
>
>
>
>
> *From:* users <users-boun...@lists.open-mpi.org> *On Behalf Of *Ernesto
> Prudencio via users
> *Sent:* Monday, March 14, 2022 12:39 AM
> *To:* Open MPI Users <users@lists.open-mpi.org>
> *Cc:* Ernesto Prudencio <epruden...@slb.com>
> *Subject:* Re: [OMPI users] [Ext] Re: Call to MPI_Allreduce() returning
> value 15
>
>
>
> Thank you for the quick answer, George. I wanted to investigate the
> problem further before replying.
>
>
>
> Below I show 3 situations of my C++ (and Fortran) application, which runs
> on top of PETSc, OpenMPI, and MKL. All 3 situations use MKL 2019.0.5
> compiled with INTEL.
>
>
>
> At the end, I have 2 questions.
>
>
>
> Note: all codes are compiled on a certain set of nodes, and the execution
> happens on _*another*_ set of nodes.
>
>
>
> +- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - -
>
>
>
> Situation 1) It has been successful for months now:
>
>
>
> a) Use INTEL compilers for OpenMPI 4.0.3, PETSc 3.10.4, and the
> application. The configuration options for OpenMPI are:
>
>
>
> '--with-flux-pmi=no' '--enable-orterun-prefix-by-default'
> '--prefix=/mnt/disks/intel-2018-3-222-blade-runtime-env-2018-1-07-08-2018-132838/openmpi_4.0.3_intel2019.5_gcc7.3.1'
> 'FC=ifort' 'CC=gcc'
>
>
>
> b) At run time, each MPI rank prints this info:
>
>
>
> PATH =
> /opt/openmpi_4.0.3/bin:/opt/openmpi_4.0.3/bin:/opt/openmpi_4.0.3/bin:/opt/rh/devtoolset-7/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
>
>
>
> LD_LIBRARY_PATH  =
> /opt/openmpi_4.0.3/lib::/opt/rh/devtoolset-7/root/usr/lib64:/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7:/opt/petsc/lib:/opt/2019.5/compilers_and_libraries/linux/mkl/lib/intel64:/opt/openmpi_4.0.3/lib:/lib64:/lib:/usr/lib64:/usr/lib
>
>
>
> MPI version (compile time)   = 4.0.3
>
> MPI_Get_library_version()    = Open MPI v4.0.3, package: Open MPI 
> root@<STRING1>
> Distribution, ident: 4.0.3, repo rev: v4.0.3, Mar 03, 2020
>
> PETSc version (compile time) = 3.10.4
>
>
>
> c) A test of 20 minutes with 14 nodes, 4 MPI ranks per node, runs ok.
>
>
>
> d) A test of 2 hours with 35 nodes, 4 MPI ranks per node, runs ok.
>
>
>
> +- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - -
>
>
>
> Situation 2) This situation is the one failing during execution.
>
>
>
> a) Use GNU compilers for OpenMPI 4.1.2, PETSc 3.16.5, and the application.
> The configuration options for OpenMPI are:
>
>
>
> '--with-flux-pmi=no' '--prefix=/appl-third-parties/openmpi-4.1.2'
> '--enable-orterun-prefix-by-default'
>
>
>
> b) At run time, each MPI rank prints this info:
>
>
>
> PATH  = /appl-third-parties/openmpi-4.1.2/bin
> :/appl-third-parties/openmpi-4.1.2/bin:/appl-third-parties/openmpi-4.1.2/bin:/opt/rh/devtoolset-7/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
>
>
>
> LD_LIBRARY_PATH = /appl-third-parties/openmpi-4.1.2/lib::/opt/rh/devtoolset-7/root/usr/lib64:/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7:/appl-third-parties/petsc-3.16.5/lib:/opt/2019.5/compilers_and_libraries/linux/mkl/lib/intel64:/appl-third-parties/openmpi-4.1.2/lib:/lib64:/lib:/usr/lib64:/usr/lib
>
>
>
> MPI version (compile time)    = 4.1.2
>
> MPI_Get_library_version()     = Open MPI v4.1.2, package: Open MPI 
> root@<STRING2>
> Distribution, ident: 4.1.2, repo rev: v4.1.2, Nov 24, 2021
>
> PETSc version (compile time)  = 3.16.5
>
> PetscGetVersion()                     = Petsc Release Version 3.16.5, Mar
> 04, 2022
>
> PetscGetVersionNumber()       = 3.16.5
>
>
>
> c)  Same as (1.c)
>
>
>
> d) Test with 35 nodes fails:
>
> d.1) The very first MPI call is a MPI_Allreduce() with MPI_MAX op: it
> returns the right values only to rank 0, while all other ranks get value
> 0. The routine returns MPI_SUCCESS, though.
>
> d.2) The second MPI call is a MPI_Allreduce() with MPI_SUM op: again, it
> returns the right values only to rank 0, while all other ranks get wrong
> values (mostly 0). The routine also returns MPI_SUCCESS, though.
>
> d.3) The third MPI call is a MPI_Allreduce() with MPI_MIN op: it returns 15
> = MPI_ERR_TRUNCATE. This is the error reported in my first e-mail of
> March 9.
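>
> A small stand-alone consistency check along those lines, with made-up
> per-rank values, in case it helps narrow things down (every rank compares
> its MPI_Allreduce() result against the one obtained by rank 0):
>
> #include <mpi.h>
> #include <cmath>
> #include <cstdio>
>
> int main(int argc, char** argv) {
>   MPI_Init(&argc, &argv);
>   int rank = 0;
>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>
>   double local = static_cast< double >(rank);   // made-up per-rank value
>   double reduced = 0.0;
>   MPI_Allreduce(&local, &reduced, 1, MPI_DOUBLE, MPI_MAX, MPI_COMM_WORLD);
>
>   double onRoot = reduced;
>   MPI_Bcast(&onRoot, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
>   if (std::fabs(onRoot - reduced) > 0.0) {
>     std::printf("rank %d disagrees: %g here vs %g on rank 0\n",
>                 rank, reduced, onRoot);
>   }
>
>   MPI_Finalize();
>   return 0;
> }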
>
>
>
> +- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - -
>
>
>
> Situation 3) Runs ok!!!
>
>
>
> a) Same as (2.a), that is, I continue to compile everything with GNU.
>
>
>
> b) At run time, I only change the path of MPI to point to the "old"
> /opt/openmpi_4.0.3 compiled with INTEL. Each MPI rank prints this info:
>
>
>
> PATH = /opt/openmpi_4.0.3/bin
> :/opt/openmpi_4.0.3/bin:/opt/openmpi_4.0.3/bin:/opt/rh/devtoolset-7/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
>
>
>
> LD_LIBRARY_PATH = /opt/openmpi_4.0.3/lib::/opt/rh/devtoolset-7/root/usr/lib64:/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7:/appl-third-parties/petsc-3.16.5/lib:/opt/2019.5/compilers_and_libraries/linux/mkl/lib/intel64:/opt/openmpi_4.0.3/lib:/lib64:/lib:/lib64:/lib:/usr/lib64:/usr/lib
>
>
>
> MPI version (compile time)     = 4.1.2
>
> MPI_Get_library_version()      = Open MPI v4.0.3, package: Open MPI 
> root@<STRING1>
> Distribution, ident: 4.0.3, repo rev: v4.0.3, Mar 03, 2020
>
> PETSc version (compile time)  = 3.16.5 (my observation here: this PETSc
> was compiled using OpenMPI 4.1.2)
>
> PetscGetVersion()                     = Petsc Release Version 3.16.5, Mar
> 04, 2022
>
> PetscGetVersionNumber()      = 3.16.5
>
>
>
> c) Same as (1.c)
>
>
>
> d) Same as (1.d)
>
>
>
> +- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - -
>
>
>
> Note: at run time, the nodes have both OpenMPI versions available (4.0.3
> compiled with INTEL, and 4.1.2 compiled with GNU). That is why I can apply
> the “trick” of situation 3 above.
>
>
>
> Question 1) Am I missing some configuration option on OpenMPI? I have been
> using the same OpenMPI configuration options as in the stable situation 1.
>
>
>
> Question 2) In the failing situation 2, does OpenMPI expect to use some
> /opt path, even though there is no PATH variable mentioning the “old”
> /opt/openmpi_4.0.3? I mean, could the problem be that I am providing the
> “new” OpenMPI 4.1.2 in a path (/appl-third-parties/…) that is NOT /opt?
>
>
>
> Thank you,
>
>
>
> Ernesto.
>
>
>
> *From:* George Bosilca <bosi...@icl.utk.edu>
> *Sent:* Wednesday, March 9, 2022 1:46 PM
> *To:* Open MPI Users <users@lists.open-mpi.org>
> *Cc:* Ernesto Prudencio <epruden...@slb.com>
> *Subject:* [Ext] Re: [OMPI users] Call to MPI_Allreduce() returning value
> 15
>
>
>
> There are two ways the MPI_Allreduce returns MPI_ERR_TRUNCATE:
>
> 1. it is propagated from one of the underlying point-to-point
> communications, which means that at least one of the participants has an
> input buffer with a larger size. I know you said the size is fixed, but it
> only matters if all processes are in the same blocking MPI_Allreduce.
>
> 2. The code is not SPMD, and one of your processes calls a different
> MPI_Allreduce on the same communicator.
>
>
>
> There is no simple way to get more information about this issue. If you
> have a version of OMPI compiled in debug mode, you can increase the
> verbosity of the collective framework to see if you get more interesting
> information.
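>
> For example, something along these lines (the exact parameter name can be
> checked with ompi_info --all, so treat this as a sketch):
>
> mpirun --mca coll_base_verbose 100 ... <your binary>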
>
>
>
> George.
>
>
>
>
>
> On Wed, Mar 9, 2022 at 2:23 PM Ernesto Prudencio via users <
> users@lists.open-mpi.org> wrote:
>
> Hello all,
>
>
>
> The very simple code below returns mpiRC = 15.
>
>
>
> const std::array< double, 2 > rangeMin { minX, minY };
>
> std::array< double, 2 > rangeTempRecv { 0.0, 0.0 };
>
> int mpiRC = MPI_Allreduce( rangeMin.data(), rangeTempRecv.data(),
> rangeMin.size(), MPI_DOUBLE, MPI_MIN, PETSC_COMM_WORLD );
>
>
>
> Some information before my questions:
>
>    1. The environment I am running this code in has hundreds of compute
>    nodes, each node with 4 MPI ranks.
>    2. It is running in the cloud, so it is tricky to get extra
>    information “on the fly”.
>    3. I am using OpenMPI 4.1.2 + PETSc 3.16.5 + GNU compilers.
>    4. The error happens consistently at the same point in the execution,
>    at ranks 1 and 2 only (out of hundreds of MPI ranks).
>    5. By the time the execution gets to the code above, the execution has
>    already called PetscInitialize() and many MPI routines successfully.
>    6. Before the call to MPI_Allreduce() above, the code calls
>    MPI_Barrier(). So, all nodes call MPI_Allreduce().
>    7. At https://www.open-mpi.org/doc/current/man3/OpenMPI.3.php it is
>    written “MPI_ERR_TRUNCATE          15      Message truncated on
>    receive.”
>    8. At https://www.open-mpi.org/doc/v4.1/man3/MPI_Allreduce.3.php, it is
>    written “The reduction functions ( *MPI_Op* ) do not return an error
>    value. As a result, if the functions detect an error, all they can do is
>    either call *MPI_Abort* or silently skip the problem. Thus, if you change
>    the error handler from *MPI_ERRORS_ARE_FATAL* to something else, for
>    example, *MPI_ERRORS_RETURN* , then no error may be indicated.”
>
>
>
> Questions:
>
>    1. Any ideas for what could be the cause for the return code 15? The
>    code is pretty simple and the buffers have fixed size = 2.
>    2. In view of item (8), does it mean that the return code 15 in item
>    (7) might not be informative?
>    3. Once I get a return code != MPI_SUCCESS, is there any routine I can
>    call, in the application code, to get extra information on MPI? (A small
>    sketch follows this list.)
>    4. Once the application aborts (I throw an exception once a return
>    code is != MPI_SUCCESS), is there some command line I can run on all
>    nodes in order to get extra info?
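>
> For question 3, a minimal sketch of what I have in mind, placed right after
> the MPI_Allreduce() above (it assumes <iostream> and reuses the mpiRC from
> that snippet):
>
> char msg[MPI_MAX_ERROR_STRING];
> int msgLen = 0;
> int errClass = 0;
> if (mpiRC != MPI_SUCCESS) {
>   MPI_Error_class(mpiRC, &errClass);       // e.g. 15 -> MPI_ERR_TRUNCATE
>   MPI_Error_string(mpiRC, msg, &msgLen);   // human-readable description
>   std::cerr << "MPI_Allreduce failed: class " << errClass
>             << " (" << msg << ")" << std::endl;
> }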
>
>
>
> Thank you in advance,
>
>
>
> Ernesto.
>
>
>
>
>
