On 2 Nov 2018, at 12:09 pm, Ben Menadue <ben.mena...@nci.org.au> wrote:

Hi Gilles,
> On 2 Nov 2018, at 11:03 am, Gilles Gouaillardet wrote:
> I noted the stack traces refer to opal_cuda_memcpy(). Is this issue specific to
> CUDA environments?
No, this is just on normal CPU-only nodes. But memcpy always goes through
opal_cuda_memcpy when CUDA support is enabled, so it shows up in the backtrace
even when no GPUs are involved.
One of our users is reporting an issue using MPI_Allgatherv with a large
derived datatype — it segfaults inside OpenMPI. Using a debug build of OpenMPI
3.1.2 produces a ton of messages like this before the segfault:
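For context, the failing call looks roughly like the sketch below. The counts are made up and a plain contiguous type stands in for the user's real derived datatype; the actual sizes that trigger the crash are much larger.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int size;
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Hypothetical derived type: 1 MiB of doubles per element. */
    MPI_Datatype big;
    MPI_Type_contiguous(131072, MPI_DOUBLE, &big);
    MPI_Type_commit(&big);

    int nelem = 64;                               /* made-up per-rank count */
    int *counts = malloc(size * sizeof(int));
    int *displs = malloc(size * sizeof(int));
    for (int i = 0; i < size; i++) {
        counts[i] = nelem;
        displs[i] = i * nelem;                    /* displacements in units of 'big' */
    }

    /* Contents don't matter for reproducing a crash; only the sizes do. */
    double *sendbuf = malloc((size_t)nelem * 131072 * sizeof(double));
    double *recvbuf = malloc((size_t)size * nelem * 131072 * sizeof(double));

    MPI_Allgatherv(sendbuf, nelem, big, recvbuf, counts, displs, big,
                   MPI_COMM_WORLD);

    free(sendbuf); free(recvbuf); free(counts); free(displs);
    MPI_Type_free(&big);
    MPI_Finalize();
    return 0;
}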
What’s the replacement it should use? I’m pretty sure oob/ud is being picked by
default on our IB cluster. Or is oob/tcp good enough?
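If pinning the selection in the meantime is acceptable, the component can be forced on the command line or via the environment. A sketch (assumes oob/tcp is built, which it is by default):

$ mpirun --mca oob tcp -np 16 ./a.out

or, equivalently, export OMPI_MCA_oob=tcp before launching.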
> On 20 Jun 2018, at 5:20 am, Jeff Squyres (jsquyres) via devel wrote:
> We talked about this on the webex today, but
> (since the problem is different in
> the various releases) in the next few days that points to the problems.
> Comm_spawn is okay, FWIW
>> On May 21, 2018, at 8:00 PM, Ben Menadue <ben.mena...@nci.org.au> wrote:
That said, I’m not sure why get_tracker is reporting 32 procs when there are only
16 running here (i.e. 1 original + 15 spawned).
Or should I post this over in the PMIx list instead?
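For reference, the spawn pattern being described (1 original process plus 15 spawned) boils down to something like the sketch below; the child binary name and the rest of the test program are hypothetical.

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* One parent spawns 15 children, giving 16 processes in total. */
    MPI_Comm children;
    MPI_Comm_spawn("./child", MPI_ARGV_NULL, 15, MPI_INFO_NULL,
                   0, MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);

    MPI_Comm_disconnect(&children);
    MPI_Finalize();
    return 0;
}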
> On 17 May 2018, at 9:59 am, Ben Menadue <ben.mena...@nci.org.au> wrote:
I’m having trouble using map-by socket on remote nodes.
Running on the same node as mpirun works fine (except for that spurious message):
$ mpirun -H localhost:16 -map-by ppr:2:socket:PE=4 -display-map /bin/true
[raijin7:22248] SETTING BINDING TO CORE
Data for JOB [11140,1] offset 0
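That layout asks for two processes per socket, with each process given four processing elements (PE=4) to bind to. The failing remote case is presumably the same invocation pointed at another node, along the lines of (hostname is a placeholder):

$ mpirun -H <remotehost>:16 -map-by ppr:2:socket:PE=4 -display-map /bin/true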
I’m seeing an extraneous “DONE” message being printed with OpenMPI 3.0.0 when
mapping by core:
[bjm900@raijin7 pt2pt]$ mpirun -np 2 ./osu_bw > /dev/null
[bjm900@raijin7 pt2pt]$ mpirun -map-by core -np 2 ./osu_bw > /dev/null
This patch gets rid of the offending line. You’re welcome to pull down the patch
and locally apply it if it would help.
> On Aug 24, 2016, at 5:29 PM, r...@open-mpi.org wrote:
> Hmmm...bet I know why. Let me poke a bit.
>> On Aug 24, 2016, at 5:18 PM, Ben Menadue <ben.mena...@nci.org.au> wrote:
Adding --map-by core:oversubscribe makes this work, but then doesn't have binding.
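For reference, one way to make the presence or absence of binding visible is to add --report-bindings to that kind of invocation (a sketch; the process count and binary are made up):

$ mpirun --map-by core:oversubscribe --report-bindings -np 32 ./a.out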
From: devel [mailto:devel-boun...@lists.open-mpi.org] On Behalf Of Ben Menadue
Sent: Thursday, 25 August 2016 9:36 AM
To: 'Open MPI Developers' <email@example.com>
could pull the patch in advance if it is holding you up.
>> On Aug 23, 2016, at 11:46 PM, Ben Menadue <ben.mena...@nci.org.au> wrote:
>> One of our users has noticed that binding is disabled in 2.0.0 when
>> --oversubscribe is passed.
One of our users has noticed that binding is disabled in 2.0.0 when
--oversubscribe is passed, which is hurting their performance, likely
through migrations between sockets. It looks to be because of 294793c.
They need to use --oversubscribe as for some reason the developers
Looks like there's a #include missing from
oshmem/shmem/fortran/shmem_put_nb_f.c. It's causing MCA_SPML_CALL to show up
as an undefined symbol, even though it's a macro (among other things). The
#include is in shmem_get_nb_f.c but not ..._put_...
Patch against master (0e433ea):
$ git diff
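The change amounts to adding the header that provides MCA_SPML_CALL. A sketch, assuming the macro comes from oshmem/mca/spml/spml.h as it does for the _get_ variant:

/* in oshmem/shmem/fortran/shmem_put_nb_f.c, alongside the existing includes */
#include "oshmem/mca/spml/spml.h"   /* assumed home of MCA_SPML_CALL, mirroring shmem_get_nb_f.c */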
From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Dave Turner
Sent: Friday, 4 March 2016 3:28 PM
To: Ben Menadue <ben.mena...@nci.org.au>
Cc: Open MPI Developers <de...@open-mpi.org>
Subject: Re: [OMPI devel] mpif.h on Intel bu
The issue is the way MPI_Sizeof is handled; it's implemented as a series of
interfaces that map the MPI_Sizeof call to the right function in the library. I
suspect this is needed because that function doesn't take a datatype argument
and instead infers it from the argument types.
I just finished building 1.8.6 and master on our cluster and noticed that
for both, XRC support wasn't being detected because configure didn't detect the
IBV_SRQT_XRC declaration:
checking whether IBV_SRQT_XRC is declared... (cached) no
checking if ConnectX XRC support
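A quick way to check whether the installed verbs headers actually declare it (assuming they are in the usual place):

$ grep -c IBV_SRQT_XRC /usr/include/infiniband/verbs.h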