Thank you for your help. It turned out the problem was that I forgot to assemble the "x" vector before doing the scatter. It seems to be working now.

Thanks,
Sreeram
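For anyone hitting the same issue: values set with VecSetValue()/VecSetValues() are not visible to a VecScatter until the vector has been assembled, so the order that matters is set values, assemble, then scatter. A minimal sketch of that call order (the helper name and the particular fill values are illustrative only, not taken from the repository mentioned below):

#include <petscvec.h>

/* Sketch: fill x, assemble it, then scatter into y.
   Assumes x, y, and vscat were created elsewhere, e.g. as in the example below. */
static PetscErrorCode FillAndScatter(Vec x, Vec y, VecScatter vscat)
{
  PetscInt rstart, rend, i;

  PetscFunctionBeginUser;
  PetscCall(VecGetOwnershipRange(x, &rstart, &rend));
  for (i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, (PetscScalar)i, INSERT_VALUES));
  /* Assembly must complete before x is used as the source of a scatter */
  PetscCall(VecAssemblyBegin(x));
  PetscCall(VecAssemblyEnd(x));
  PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
  PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
  PetscFunctionReturn(PETSC_SUCCESS);
}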
On Wed, Dec 6, 2023 at 11:12 AM Junchao Zhang <[email protected]> wrote:

Hi, Sreeram,
  I made an example with your approach. It worked fine, as you see from the output at the end.

#include <petscvec.h>
int main(int argc, char **argv)
{
  PetscInt    i, j, rstart, rend, n, N, *indices;
  PetscMPIInt size, rank;
  IS          ix;
  VecScatter  vscat;
  Vec         x, y;

  PetscFunctionBeginUser;
  PetscCall(PetscInitialize(&argc, &argv, (char *)0, NULL));
  PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));
  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));

  PetscCall(VecCreate(PETSC_COMM_WORLD, &x));
  PetscCall(VecSetFromOptions(x));
  PetscCall(PetscObjectSetName((PetscObject)x, "Vec X"));
  n = (rank < 4) ? 9 : 0;
  PetscCall(VecSetSizes(x, n, PETSC_DECIDE));

  PetscCall(VecGetOwnershipRange(x, &rstart, &rend));
  for (i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, (PetscScalar)i, INSERT_VALUES));
  PetscCall(VecAssemblyBegin(x));
  PetscCall(VecAssemblyEnd(x));
  PetscCall(VecGetSize(x, &N));

  PetscCall(VecCreate(PETSC_COMM_WORLD, &y));
  PetscCall(VecSetFromOptions(y));
  PetscCall(PetscObjectSetName((PetscObject)y, "Vec Y"));
  PetscCall(VecSetSizes(y, PETSC_DECIDE, N));

  PetscCall(VecGetOwnershipRange(y, &rstart, &rend));
  PetscCall(PetscMalloc1(rend - rstart, &indices));
  for (i = rstart, j = 0; i < rend; i++, j++) indices[j] = rank + size * j;

  PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, indices, PETSC_OWN_POINTER, &ix));
  PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat));

  PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
  PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));

  PetscCall(ISView(ix, PETSC_VIEWER_STDOUT_WORLD));
  PetscCall(VecView(x, PETSC_VIEWER_STDERR_WORLD));
  PetscCall(VecView(y, PETSC_VIEWER_STDERR_WORLD));

  PetscCall(VecScatterDestroy(&vscat));
  PetscCall(ISDestroy(&ix));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&y));
  PetscCall(PetscFinalize());
  return 0;
}

$ mpirun -n 12 ./ex100
IS Object: 12 MPI processes
  type: general
[0] Number of indices in set 3
[0] 0 0
[0] 1 12
[0] 2 24
[1] Number of indices in set 3
[1] 0 1
[1] 1 13
[1] 2 25
[2] Number of indices in set 3
[2] 0 2
[2] 1 14
[2] 2 26
[3] Number of indices in set 3
[3] 0 3
[3] 1 15
[3] 2 27
[4] Number of indices in set 3
[4] 0 4
[4] 1 16
[4] 2 28
[5] Number of indices in set 3
[5] 0 5
[5] 1 17
[5] 2 29
[6] Number of indices in set 3
[6] 0 6
[6] 1 18
[6] 2 30
[7] Number of indices in set 3
[7] 0 7
[7] 1 19
[7] 2 31
[8] Number of indices in set 3
[8] 0 8
[8] 1 20
[8] 2 32
[9] Number of indices in set 3
[9] 0 9
[9] 1 21
[9] 2 33
[10] Number of indices in set 3
[10] 0 10
[10] 1 22
[10] 2 34
[11] Number of indices in set 3
[11] 0 11
[11] 1 23
[11] 2 35
Vec Object: Vec X 12 MPI processes
  type: mpi
Process [0]
0.
1.
2.
3.
4.
5.
6.
7.
8.
Process [1]
9.
10.
11.
12.
13.
14.
15.
16.
17.
Process [2]
18.
19.
20.
21.
22.
23.
24.
25.
26.
Process [3]
27.
28.
29.
30.
31.
32.
33.
34.
35.
Process [4]
Process [5]
Process [6]
Process [7]
Process [8]
Process [9]
Process [10]
Process [11]
Vec Object: Vec Y 12 MPI processes
  type: mpi
Process [0]
0.
12.
24.
Process [1]
1.
13.
25.
Process [2]
2.
14.
26.
Process [3]
3.
15.
27.
Process [4]
4.
16.
28.
Process [5]
5.
17.
29.
Process [6]
6.
18.
30.
Process [7]
7.
19.
31.
Process [8]
8.
20.
32.
Process [9]
9.
21.
33.
Process [10]
10.
22.
34.
Process [11]
11.
23.
35.

--Junchao Zhang
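As an aside, since the indices each rank requests in this example form an arithmetic progression (rank, rank + size, rank + 2*size, ...), the same index set could also be expressed with ISCreateStride() and no explicit index array. A minimal sketch of that alternative, reusing the variables from the example above in place of its PetscMalloc1()/ISCreateGeneral() block:

/* Equivalent to indices[j] = rank + size * j above:
   a strided IS with first = rank and step = size. */
PetscCall(VecGetOwnershipRange(y, &rstart, &rend));
PetscCall(ISCreateStride(PETSC_COMM_WORLD, rend - rstart, rank, size, &ix));
PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat));

This only works because every rank's target indices happen to be evenly strided; for an irregular permutation, ISCreateGeneral() remains the general tool.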
On Tue, Dec 5, 2023 at 10:09 PM Sreeram R Venkat <[email protected]> wrote:

Yes, I have an example code at github.com/s769/petsc-test. One thing, though: when I described the example before, I simplified the actual use case in the code. Here are the extra details relevant to this code:

- We assume a 2D processor grid, given by the command-line args -proc_rows and -proc_cols.
- The total length of the vector is n_m*n_t, given by the command-line args -nm and -nt. n_m corresponds to a space index and n_t to a time index.
- In the "Start" phase, the vector is divided into n_m blocks, each of size n_t (indexed space->time). The blocks are partitioned over the first row of processors: each processor in the first row gets n_m_local blocks of size n_t, where the n_m_locals over the first row of processors sum to n_m. For example, if -nm is 4 and -proc_cols is 4, each processor in the first row gets one block of size n_t.
- In the "Finish" phase, the vector is divided into n_t blocks, each of size n_m (indexed time->space; this is the reason for the permutation of indices). The blocks are partitioned over all processors: each processor gets n_t_local blocks of size n_m, where the n_t_locals over all processors sum to n_t.

I think the basic idea is similar to the previous example, but these details make things a bit more complicated. Please let me know if anything is unclear, and I can try to explain more.

Thanks for your help,
Sreeram

On Tue, Dec 5, 2023 at 9:30 PM Junchao Zhang <[email protected]> wrote:

I think your approach is correct. Do you have an example code?

--Junchao Zhang

On Tue, Dec 5, 2023 at 5:15 PM Sreeram R Venkat <[email protected]> wrote:

Hi, I have a follow-up question on this.

Now I'm trying to do a scatter and permutation of the vector. Under the same setup as the original example, here are the new Start and Finish states I want to achieve:

Start                     Finish
Proc | Entries            Proc | Entries
 0   | 0,...,8             0   | 0, 12, 24
 1   | 9,...,17            1   | 1, 13, 25
 2   | 18,...,26           2   | 2, 14, 26
 3   | 27,...,35           3   | 3, 15, 27
 4   | None                4   | 4, 16, 28
 5   | None                5   | 5, 17, 29
 6   | None                6   | 6, 18, 30
 7   | None                7   | 7, 19, 31
 8   | None                8   | 8, 20, 32
 9   | None                9   | 9, 21, 33
10   | None               10   | 10, 22, 34
11   | None               11   | 11, 23, 35

So far, I've tried to use ISCreateGeneral(), with each process giving an idx array corresponding to the indices it wants (i.e., idx on P0 is [0, 12, 24], on P1 it is [1, 13, 25], and so on). Then I use this to create the VecScatter with VecScatterCreate(x, is, y, NULL, &scatter).

However, when I try to do the scatter, I get some illegal memory access errors.

Is there something wrong with how I define the index sets?

Thanks,
Sreeram
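For the more general reordering described above, where a space-major vector (n_m blocks of length n_t, global index i_m*n_t + i_t) is rearranged into a time-major vector (n_t blocks of length n_m, global index i_t*n_m + i_m), the source index of every locally owned target entry can be computed directly from its global index. A minimal sketch of building such a scatter follows; the helper name is illustrative and this is not the code in github.com/s769/petsc-test — it only assumes x and y have already been created with the desired source and target layouts.

/* Sketch: scatter a space-major vector x (global index i_m*n_t + i_t) into a
   time-major vector y (global index i_t*n_m + i_m). The layouts of x and y
   are taken as given; only the index mapping is constructed here. */
static PetscErrorCode CreateTransposeScatter(Vec x, Vec y, PetscInt n_m, PetscInt n_t, VecScatter *vscat)
{
  PetscInt rstart, rend, nlocal, g, j, *indices;
  IS       ix;

  PetscFunctionBeginUser;
  PetscCall(VecGetOwnershipRange(y, &rstart, &rend));
  nlocal = rend - rstart;
  PetscCall(PetscMalloc1(nlocal, &indices));
  for (g = rstart, j = 0; g < rend; g++, j++) {
    PetscInt i_t = g / n_m;         /* time index of this target entry  */
    PetscInt i_m = g % n_m;         /* space index of this target entry */
    indices[j]   = i_m * n_t + i_t; /* its location in the space-major source */
  }
  PetscCall(ISCreateGeneral(PetscObjectComm((PetscObject)x), nlocal, indices, PETSC_OWN_POINTER, &ix));
  /* Passing NULL for the y-side IS means "y's own ownership range, in order" */
  PetscCall(VecScatterCreate(x, ix, y, NULL, vscat));
  PetscCall(ISDestroy(&ix));
  PetscFunctionReturn(PETSC_SUCCESS);
}

With n_m = 3 and n_t = 12, this reproduces the Finish layout in the table above ([0, 12, 24] on P0, [1, 13, 25] on P1, and so on).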
On Thu, Oct 5, 2023 at 12:57 PM Sreeram R Venkat <[email protected]> wrote:

Thank you. This works for me.

Sreeram

On Wed, Oct 4, 2023 at 6:41 PM Junchao Zhang <[email protected]> wrote:

Hi, Sreeram,
  You can try this code. Since x and y are both MPI vectors, we just need to say we want to scatter x[0:N] to y[0:N]. With your example, the 12 index sets on the 12 processes would be [0..8], [9..17], [18..26], [27..35], [], ..., []. Actually, you can split it arbitrarily, say, with the 12 index sets [0..17], [18..35], [], ..., []; PETSc will figure out how to do the communication.

PetscInt   rstart, rend, N;
IS         ix;
VecScatter vscat;
Vec        y;
MPI_Comm   comm;
VecType    type;

PetscObjectGetComm((PetscObject)x, &comm);
VecGetType(x, &type);
VecGetSize(x, &N);
VecGetOwnershipRange(x, &rstart, &rend);

VecCreate(comm, &y);
VecSetSizes(y, PETSC_DECIDE, N);
VecSetType(y, type);

ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, rstart, 1, &ix);
VecScatterCreate(x, ix, y, ix, &vscat);

--Junchao Zhang

On Wed, Oct 4, 2023 at 6:03 PM Sreeram R Venkat <[email protected]> wrote:

Suppose I am running on 12 processors, and I have a vector "v" of size 36 partitioned over the first 4. v still uses PETSC_COMM_WORLD, so it has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now I would like to repartition it over all 12 processors, so that the layout becomes (3, 3, 3, ..., 3). I've been trying to use VecScatter to do this, but I'm not sure what index sets to use for the sender and receiver.

The result I am trying to achieve is this (assume the vector is v = <0, 1, 2, ..., 35>):

Start                     Finish
Proc | Entries            Proc | Entries
 0   | 0,...,8             0   | 0, 1, 2
 1   | 9,...,17            1   | 3, 4, 5
 2   | 18,...,26           2   | 6, 7, 8
 3   | 27,...,35           3   | 9, 10, 11
 4   | None                4   | 12, 13, 14
 5   | None                5   | 15, 16, 17
 6   | None                6   | 18, 19, 20
 7   | None                7   | 21, 22, 23
 8   | None                8   | 24, 25, 26
 9   | None                9   | 27, 28, 29
10   | None               10   | 30, 31, 32
11   | None               11   | 33, 34, 35

Appreciate any help you can provide on this.

Thanks,
Sreeram
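For completeness, the snippet in the reply above stops after VecScatterCreate(); a sketch of the remaining calls that actually move the data and release the temporaries, using INSERT_VALUES and SCATTER_FORWARD as in the complete example further up the thread:

/* Carry out the redistribution prepared by VecScatterCreate(x, ix, y, ix, &vscat) */
PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));

/* The index set and scatter context can be freed once the data has moved */
PetscCall(ISDestroy(&ix));
PetscCall(VecScatterDestroy(&vscat));

Afterwards y holds the same 36 values in the same global order, but laid out as (3, 3, ..., 3) across the 12 ranks, which is the Finish column in the table above.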
