I'm trying to replicate this using the same compiler (icc 2019) on my OSX
machine, over TCP and shared memory, with no luck so far. So the segfault
is either something specific to OmniPath or to the memcpy implementation
used on Skylake. I tried to use the trace you sent, more specifically the
mention of opal_datatype_copy_content_same_ddt, to understand where the
segfault happens, but unfortunately there are 3 calls to
opal_datatype_copy_content_same_ddt in the reduce_scatter algorithm. Can
you please build in debug mode and, if you can replicate the segfault,
send me the stack trace?


On Tue, Dec 4, 2018 at 5:07 AM Peter Kjellström <c...@nsc.liu.se> wrote:

> On Mon, 3 Dec 2018 19:41:25 +0000
> "Hammond, Simon David via users" <users@lists.open-mpi.org> wrote:
> > Hi Open MPI Users,
> >
> > Just wanted to report a bug we have seen with OpenMPI 3.1.3 and 4.0.0
> > when using the Intel 2019 Update 1 compilers on our
> > Skylake/OmniPath-1 cluster. The bug occurs when running the Github
> > master src_c variant of the Intel MPI Benchmarks.
> I've noticed this also when using intel mpi (2018 and 2019u1). I
> classified it as a bug in imb but didn't look too deep (new
> reduce_scatter code).
> /Peter K
> --
> Sent from my Android device with K-9 Mail. Please excuse my
> brevity.
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users