Jonas,

In case I misunderstood your question and you want to print

v_glob on P0: 9x2
0 9
1 10
2 11
3 12
4 13
5 14
6 15
7 16
8 17

then you have to fix the print invocation

// note: print an additional column to show the displacement error we get:

if (!rank) print("v_glob", rank, n, m, v_glob);

and also resize rtype with MPI_Type_create_resized() so the second element
(the block from process 1) starts at v_glob[3][0] => lower bound = 0,
upper bound = extent = (3*sizeof(int))
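
To make the effect of the extent concrete, here is a minimal sketch that mimics, without calling MPI, where a gather with recvcount 1 would place each rank's block. It assumes the receive type is a vector of m blocks of nloc ints with stride n = nproc*nloc, and that the block from rank p starts at p times the extent of that type. The helper names simulate_gather and natural_extent are hypothetical and are not part of the attached MFE; extents are counted in ints rather than bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: natural extent (in ints) of a vector type with
// m blocks of nloc ints and stride n = nproc*nloc, i.e. the span from the
// first to the last element it covers.
std::size_t natural_extent(int nproc, int nloc, int m)
{
    return static_cast<std::size_t>(m - 1) * nproc * nloc + nloc;
}

// Hypothetical helper: simulates the placement done by a gather when the
// block from rank p starts at p * extent (extent counted in ints).
// The values mirror the example in this thread: v_loc(i, j) on rank p
// holds p*nloc + i + j*n.
std::vector<int> simulate_gather(int nproc, int nloc, int m,
                                 std::size_t extent, std::size_t buf_len)
{
    const int n = nproc * nloc;            // rows of the global matrix
    std::vector<int> v_glob(buf_len, 0);   // column-major receive buffer
    for (int p = 0; p < nproc; ++p) {
        for (int j = 0; j < m; ++j) {
            for (int i = 0; i < nloc; ++i) {
                const int value = p * nloc + i + j * n;
                const std::size_t pos = static_cast<std::size_t>(p) * extent
                                      + static_cast<std::size_t>(j) * n + i;
                assert(pos < buf_len);     // the buffer must cover the span
                v_glob[pos] = value;
            }
        }
    }
    return v_glob;
}
```

With nproc=3, nloc=3, m=2, natural_extent gives 12 ints, so rank p lands 12*p ints into the buffer instead of 3*p. That is the (m-1)*n*p = 9*p shift visible in the 9x4 printout. Passing an extent of nloc = 3 ints places the blocks back to back, which is what resizing rtype to 3*sizeof(int) bytes is meant to achieve in the real code.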

By the way, since this question is not Open MPI specific, sites such as
Stack Overflow are a better fit.


Cheers,

Gilles
On Thu, Dec 16, 2021 at 6:46 PM Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> Jonas,
>
> Assuming v_glob is what you expect, you will need to
> `MPI_Type_create_resized()` the received type so the block received
> from process 1 will be placed at the right position (v_glob[3][1] => upper
> bound = (4*3+1) * sizeof(int))
>
> Cheers,
>
> Gilles
>
> On Thu, Dec 16, 2021 at 6:33 PM Jonas Thies via users <
> users@lists.open-mpi.org> wrote:
>
>> Dear OpenMPI community,
>>
>> Here's a little puzzle for the Christmas holidays (although I would
>> really appreciate a quick solution!).
>>
>> I'm stuck with the following relatively basic problem: given a local nloc
>> x m matrix X_p in column-major ordering on each MPI process p, perform a
>> single MPI_Gather operation to construct the matrix
>> X_0
>> X_1
>> ...
>> X_{nproc-1}
>>
>> again, in col-major ordering. My approach is to use MPI_Type_vector to
>> define an stype and an rtype, where stype has stride nloc, and rtype has
>> stride nproc*nloc. The observation is that there is an unexpected
>> displacement of (m-1)*n*p (with n = nproc*nloc) in the result array for
>> the part arriving from process p.
>>
>> The MFE code is attached, and I use OpenMPI 4.0.5 with GCC 11.2 (although
>> other versions and even distributions seem to display the same behavior).
>> Example (nloc=3, nproc=3, m=2, with some additional columns printed for the
>> sake of demonstration):
>>
>>
>> > mpicxx -o matrix_gather matrix_gather.cpp
>> > mpirun -np 3 ./matrix_gather
>>
>> v_loc on P0: 3x2
>> 0 9
>> 1 10
>> 2 11
>>
>> v_loc on P1: 3x2
>> 3 12
>> 4 13
>> 5 14
>>
>> v_loc on P2: 3x2
>> 6 15
>> 7 16
>> 8 17
>>
>> v_glob on P0: 9x4
>> 0 9 0 0
>> 1 10 0 0
>> 2 11 0 0
>> 0 3 12 0
>> 0 4 13 0
>> 0 5 14 0
>> 0 0 6 15
>> 0 0 7 16
>> 0 0 8 17
>>
>> Any ideas?
>>
>> Thanks,
>>
>> Jonas
>>
>>
>> --
>> *J. Thies*
>> Assistant Professor
>>
>> TU Delft
>> Faculty Electrical Engineering, Mathematics and Computer Science
>> Institute of Applied Mathematics and High Performance Computing Center
>> Mekelweg 4
>> 2628 CD Delft
>>
>> T +31 15 27 XXXX
>> *j.th...@tudelft.nl <j.th...@tudelft.nl>*
>>
>
