Re: [OMPI users] Not getting zero-copy with custom datatype

2024-04-23 Thread George Bosilca via users
Zero-copy does not work with non-contiguous datatypes (it would require
both processes to know the memory layout used by the peer). As long as the
memory layout described by the type can be seen as contiguous (even if
the type is declared otherwise), it should work just fine.

  George.
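
A minimal sketch of what "can be seen as contiguous" might mean for a
struct-style layout, assuming each block is given as a byte displacement and
a byte length (block length times element size, as passed to
MPI_Type_create_struct()). The names block_t, layout_is_contiguous and
sample_is_contiguous are hypothetical and are not part of MPI or Open MPI.
This only checks that the blocks, taken in declared order, follow each other
without gaps; Open MPI makes the actual decision inside its datatype engine,
which this sketch does not reproduce.

```c
#include <assert.h>
#include <stddef.h>

/* One block of a struct-style layout (hypothetical helper type). */
typedef struct {
    size_t disp;  /* byte displacement of the block from the buffer start */
    size_t len;   /* block length in bytes: blocklength * element size */
} block_t;

/* Returns 1 if every non-empty block starts exactly where the previous
 * non-empty block ended, when walked in declared order; 0 otherwise.
 * Gaps, overlaps, and out-of-order displacements all yield 0. */
int layout_is_contiguous(const block_t *blocks, size_t count)
{
    assert(blocks != NULL || count == 0);

    size_t next = 0;  /* byte offset where the next block must start */
    int seen = 0;     /* whether a non-empty block has been seen yet */

    for (size_t i = 0; i < count; i++) {
        if (blocks[i].len == 0)
            continue;                 /* empty blocks carry no data */
        if (seen && blocks[i].disp != next)
            return 0;                 /* gap, overlap, or reordering */
        next = blocks[i].disp + blocks[i].len;
        seen = 1;
    }
    return 1;
}

/* Example layout: a char followed by a double. On typical ABIs the
 * compiler pads after 'tag', so the two blocks are not adjacent. */
struct sample {
    char   tag;
    double value;
};

int sample_is_contiguous(void)
{
    const block_t blocks[] = {
        { offsetof(struct sample, tag),   sizeof(char)   },
        { offsetof(struct sample, value), sizeof(double) },
    };
    return layout_is_contiguous(blocks, 2);
}
```

When sending more than one element of such a type, the type's extent would
also have to equal its size, since trailing padding introduces a gap between
consecutive elements.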

On Tue, Apr 23, 2024 at 10:02 AM Pascal Boeschoten via users <
users@lists.open-mpi.org> wrote:

> Hello,
>
> I'm using a custom datatype created through MPI_Type_create_struct() to
> send data with a dynamic structure to another process on the same node over
> shared memory, and noticed it's much slower than expected.
>
> I ran a profile, and it looks like it's not using CMA zero-copy, falling
> back to using opal_generic_simple_pack()/opal_generic_simple_unpack().
> Simpler datatypes do seem to use zero-copy, using mca_btl_vader_get_cma(),
> so I don't think it's a configuration or system issue.
>
> I suspect it's because the struct datatype is not contiguous, i.e. the
> blocks of the struct have gaps between them.
> Is anyone able to confirm whether zero-copy with an MPI struct requires a
> contiguous data structure, and whether it has other requirements like the
> displacements being in ascending order, having homogeneous block
> datatypes/lengths, etc.?
>
> I'm using Open MPI 4.1.6.
>
> Thanks,
> Pascal Boeschoten
>


[OMPI users] Not getting zero-copy with custom datatype

2024-04-23 Thread Pascal Boeschoten via users
Hello,

I'm using a custom datatype created through MPI_Type_create_struct() to
send data with a dynamic structure to another process on the same node over
shared memory, and noticed it's much slower than expected.

I ran a profile, and it looks like it's not using CMA zero-copy, falling
back to using opal_generic_simple_pack()/opal_generic_simple_unpack().
Simpler datatypes do seem to use zero-copy, using mca_btl_vader_get_cma(),
so I don't think it's a configuration or system issue.

I suspect it's because the struct datatype is not contiguous, i.e. the
blocks of the struct have gaps between them.
Is anyone able to confirm whether zero-copy with an MPI struct requires a
contiguous data structure, and whether it has other requirements like the
displacements being in ascending order, having homogeneous block
datatypes/lengths, etc.?

I'm using Open MPI 4.1.6.

Thanks,
Pascal Boeschoten