To exclude the coll/tuned component, pass this MCA option to mpirun:
mpirun --mca coll ^tuned ...
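
For what it's worth, the same MCA parameter can also be set through the environment, which may be easier when the mpirun line is generated by a launcher you do not control. A minimal sketch, assuming a POSIX shell; the application name in the comment is only a placeholder:

```shell
# Environment equivalent of "mpirun --mca coll ^tuned ...".
# Open MPI reads any MCA parameter from a variable named OMPI_MCA_<param>.
# The value is quoted so the shell never treats "^" specially.
export OMPI_MCA_coll='^tuned'

# Then launch as usual (./my_app is a hypothetical binary name):
#   mpirun -np 4 ./my_app
```

The variable has to be visible to the MPI processes on every node, so it is safest to set it in the same script that invokes mpirun.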
Cheers,
Gilles
On Mon, Mar 14, 2022 at 5:37 PM Ernesto Prudencio via users <users@lists.open-mpi.org> wrote:
Thanks for the hint on "mpirun ldd". I will try it. The problem is that I am
running on the cloud and it is trickier to get into a node at run time, or save
information to be retrieved later.
Sorry for my ignorance on mca stuff, but what would exactly be the suggested
mpirun command line
Ernesto,
you can
mpirun ldd
and double check it uses the library you expect.
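
If logging into a node is not practical, the ldd output can be trimmed down and left in the job's standard output. A rough sketch, assuming the Open MPI 4.x library names (libmpi, libopen-rte, libopen-pal); the function name and the binary name are placeholders, not anything from this thread:

```shell
# Keep only the MPI-related lines of `ldd` output read from stdin.
# The pattern assumes Open MPI 4.x library names.
filter_mpi_libs() {
    grep -E 'libmpi|libopen-(pal|rte)'
}

# Hypothetical usage (./my_app stands for the real binary):
#   mpirun -np 1 ldd ./my_app | filter_mpi_libs
```

The paths printed on the right-hand side of each line show which installation the binary actually resolves at run time.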
you might want to try adapting your trick to use Open MPI 4.1.2 with your
binary built with Open MPI 4.0.3 and see how it goes.
I'd try disabling coll/tuned first, though.
Keep in mind PETSc might call MPI_Allreduce
Thanks, Gilles.
In the case of the application I am working on, all ranks call MPI with the
same signature / types of variables.
I do not think there is a code error anywhere. I think this is "just" a
configuration error from my part.
Regarding the idea of changing just one item at a time:
Ernesto,
The coll/tuned module (which should handle collective subroutines by
default) has a known issue when matching but non-identical signatures are
used:
for example, one rank uses one vector of n bytes, and another rank uses n
bytes.
Is there a chance your application might use this pattern?