Thanks Ben!

I opened https://github.com/open-mpi/ompi/issues/6016 in order to track
this issue, and wrote a simpler example that demonstrates it.
We should follow up there from now on.
FWIW, several bug fixes have not been backported into the v3 branches.
Note that using the ddt
These values look suspicious to me; they read like they should be unsigned:
nbElems -909934592
count -1819869184
Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov
> On Nov 1, 2018, at 10:34 PM, Ben Menadue wrote:
>
> Hi,
>
> I haven’t heard back from the user yet, but I just put this
Hi,

I haven’t heard back from the user yet, but I just put this example together which works on 1, 2, and 3 ranks but fails for 4. Unfortunately it needs a fair amount of memory, about 14.3 GB per process, so I was running it with -map-by ppr:1:node.

It doesn’t fail with the segfault as the user’s
Hi Gilles,
> On 2 Nov 2018, at 11:03 am, Gilles Gouaillardet wrote:
> I noted the stack trace refers to opal_cuda_memcpy(). Is this issue specific
> to CUDA environments?
No, this is just on normal CPU-only nodes. But memcpy always goes through
opal_cuda_memcpy when CUDA support is enabled,
Hi Ben,
I noted the stack trace refers to opal_cuda_memcpy(). Is this issue
specific to CUDA environments?
The coll/tuned default collective module is known not to work when tasks
use matching but different datatype signatures.
For example, one task sends one vector of N elements, and the other
task receives N individual elements.
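To illustrate what "matching but different signatures" means, here is a minimal point-to-point sketch (my own hypothetical example, not the user's reproducer): both sides describe N MPI_INTs, so the type signatures match and the transfer is legal per the MPI standard, but the sender uses a single derived datatype while the receiver uses a plain count. The reported problem is that coll/tuned's message segmentation can mishandle this situation in collectives.

```c
/* Hypothetical sketch: same type signature (N ints), different datatypes.
 * Rank 0 sends one element of a derived type covering N ints; rank 1
 * receives N individual MPI_INTs. Run with at least 2 ranks. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    enum { N = 4 };
    int rank, buf[N] = { 0, 1, 2, 3 };

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* One element of a contiguous derived type spanning N ints. */
        MPI_Datatype vec;
        MPI_Type_contiguous(N, MPI_INT, &vec);
        MPI_Type_commit(&vec);
        MPI_Send(buf, 1, vec, 1, 0, MPI_COMM_WORLD);
        MPI_Type_free(&vec);
    } else if (rank == 1) {
        /* N plain MPI_INTs: matching signature, different datatype. */
        MPI_Recv(buf, N, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d %d %d %d\n",
               buf[0], buf[1], buf[2], buf[3]);
    }

    MPI_Finalize();
    return 0;
}
```

Switching to a different coll module (e.g. with --mca coll ^tuned) is a common way to check whether coll/tuned is implicated.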