Hi Howard and Michael,
thanks for your feedback. I did not want to write a too long mail with
non-pertinent information, so I just show how the two different builds
give different results. I'm using a small test case based on my large
code, the same one used to show the memory leak with MPI_Alltoallv
t expect 4007
but it fails too.
Patrick
On 25/01/2021 at 19:34, Ralph Castain via users wrote:
> I think you mean add "--mca mtl ofi" to the mpirun cmd line
>
>
>> On Jan 25, 2021, at 10:18 AM, Heinz, Michael William via users
>> wrote:
>>
>> What
Hi,
I'm trying to deploy OpenMPI 4.0.5 on the university's supercomputer:
* Debian GNU/Linux 9 (stretch)
* Intel Corporation Omni-Path HFI Silicon 100 Series [discrete] (rev 11)
and for several days I have been facing a bug (wrong results using
MPI_Alltoallw) on this server when using Omni-Path.
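For reference, the correct result of an all-to-all exchange can be pinned down without MPI at all. Below is a minimal pure-Python sketch (no MPI, names illustrative only) of the data movement MPI_Alltoallw is supposed to perform, as a baseline for what "wrong results" means:

```python
# Pure-Python sketch (no MPI) of the data movement MPI_Alltoallw is
# supposed to perform. Function and variable names are illustrative.

def alltoallw(send_blocks):
    """send_blocks[i][j] is the block rank i sends to rank j.
    Returns recv_blocks with recv_blocks[i][j] = send_blocks[j][i],
    i.e. what rank i received from rank j."""
    n = len(send_blocks)
    return [[send_blocks[j][i] for j in range(n)] for i in range(n)]

# 3 "ranks", each tagging the block it sends to every peer
send = [[f"r{i}->r{j}" for j in range(3)] for i in range(3)]
recv = alltoallw(send)
assert recv[2][0] == "r0->r2"  # rank 2 holds the block rank 0 sent it
```

Any transport (PSM2, OFI, openib) must produce this same transposition of blocks; a build that does not is what the test case below exposes.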
e if it's supposed to stop at some point
>
> I'm running RHEL 7, gcc 10.1, Open MPI 4.0.5rc2, --with-ofi,
> --without-{psm,ucx,verbs}
>
> On Tue, Jan 26, 2021 at 3:44 PM Patrick Begou via users
> wrote:
> >
> > Hi Michael
> >
at reproduces
> the problem? I can’t think of another way I can give you more help
> without being able to see what’s going on. It’s always possible
> there’s a bug in the PSM2 MTL but it would be surprising at this point.
>
> Sent from my iPad
>
>> On Jan 26, 2021, at 1:1
Hi all,
I ran many tests today. I saw that an older 4.0.2 version of OpenMPI
packaged with Nix was running using openib, so I added the --with-verbs
option to build this module.
What I can see now is that:
mpirun -hostfile $OAR_NODEFILE --mca mtl psm --mca btl_openib_allow_ib true
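When a run silently falls back to the wrong transport, it can help to first confirm which components a given build actually contains. A hedged one-liner (exact output format may vary across Open MPI versions):

```shell
# List the transport-related components compiled into this install:
# a --with-verbs build should show the openib btl, a PSM2 build the
# psm2 mtl, and an OFI build the ofi mtl.
ompi_info | grep -E "MCA (mtl|btl|pml)"
```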
-
help
>
> if I had to guess (totally pulling junk from the air), there's probably
> something incompatible with PSM and OPA when running specifically on Debian
> (likely due to library versioning). I don't know how common that setup is,
> so it's not clear how fleshed out and tested it is.
Hi,
I am facing a performance problem with OpenMPI on my cluster. In some
situations my parallel code is really slow (the same binary running on a
different mesh).
To investigate, the Fortran code is built with the profiling option
(mpifort -p -O3) and launched on 91 cores.
One mon.out file
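A single profile file is the expected behaviour of profiled builds under MPI: every rank writes the same file name into the working directory, so the ranks overwrite each other. On glibc systems, gprof-style instrumentation (-pg) honors the GMON_OUT_PREFIX environment variable, giving one file per process. A hedged sketch (binary name and paths are illustrative):

```shell
# With GMON_OUT_PREFIX set, each glibc-profiled process writes
# <prefix>.<pid> instead of clobbering a single gmon.out.
export GMON_OUT_PREFIX=prof.out
mpirun -x GMON_OUT_PREFIX -hostfile $OAR_NODEFILE ./my_code
gprof ./my_code prof.out.*   # inspect the per-rank profiles
```

The `-x` option asks mpirun to export the variable to the remote ranks as well.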
On 28/02/2022 at 17:56, Patrick Begou via users wrote:
Hi,
I am facing a performance problem with OpenMPI on my cluster. In some
situations my parallel code is really slow (the same binary running on a
different mesh).
To investigate, the Fortran code is built with the profiling option
(mpifort
btl?
If the former, is it built with multi-threading support?
If the latter, I suggest you give UCX (built with multi-threading
support) a try and see how it goes.
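If going the UCX route, here is a hedged sketch of the build steps that suggestion implies (versions and install prefixes are illustrative, not prescribed by the thread):

```shell
# Build UCX with multi-threading support, then an Open MPI that uses it.
cd ucx-1.12.0 && ./configure --prefix=$HOME/ucx --enable-mt && make -j install
cd ../openmpi-4.0.5 && ./configure --prefix=$HOME/ompi --with-ucx=$HOME/ucx && make -j install
mpirun --mca pml ucx ./my_code   # force the UCX pml for the run
```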
Cheers,
Gilles
On Thu, Mar 24, 2022 at 5:43 PM Patrick Begou via users
wrote:
On 28/02/2022 at 17:56, Patrick Begou via
On 16/06/2022 at 14:30, Jeff Squyres (jsquyres) wrote:
What exactly is the error that is occurring?
--
Jeff Squyres
jsquy...@cisco.com
From: users
Hi all,
we are facing a serious problem with OpenMPI (4.0.2) that we have
deployed on a cluster. We do not manage this large cluster, and the names
of the nodes do not conform to the Internet hostname standards: they
contain a "_" (underscore) character.
So OpenMPI complains about this and
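The restriction being tripped here comes from the RFC 952/1123 hostname rules: labels may contain only letters, digits, and hyphens, so an underscore makes a name invalid. A small illustrative checker (not Open MPI's actual validation code):

```python
import re

# RFC 952/1123 hostname label: letters, digits and hyphens only
# (no underscore), and a label may not start or end with a hyphen.
LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")

def valid_hostname(name: str) -> bool:
    return 0 < len(name) <= 253 and all(
        LABEL.match(label) for label in name.split("."))

assert valid_hostname("node-01.cluster.local")
assert not valid_hostname("node_01")  # underscore rejected
```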
that is occurring?
--
Jeff Squyres
jsquy...@cisco.com
From: users on behalf of Patrick Begou via
users
Sent: Thursday, June 16, 2022 3:21 AM
To: Open MPI Users
Cc: Patrick Begou
Subject: [OMPI users] OpenMPI and names of the nodes in a cluster
Hi all,
we