Thank you for your help. It turned out the problem was that I forgot to assemble the "x" vector before doing the scatter. It seems to be working now.

Thanks,
Sreeram
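For anyone hitting the same issue: values set with VecSetValue()/VecSetValues() are not visible to a VecScatter until the vector has been assembled, so the order that matters is set values, assemble, then scatter. A minimal sketch of that call order (the helper name and the particular fill values are illustrative only, not taken from the repository mentioned below):

#include <petscvec.h>

/* Sketch: fill x, assemble it, then scatter into y.
   Assumes x, y, and vscat were created elsewhere, e.g. as in the example below. */
static PetscErrorCode FillAndScatter(Vec x, Vec y, VecScatter vscat)
{
  PetscInt rstart, rend, i;

  PetscFunctionBeginUser;
  PetscCall(VecGetOwnershipRange(x, &rstart, &rend));
  for (i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, (PetscScalar)i, INSERT_VALUES));
  /* Assembly must complete before x is used as the source of a scatter */
  PetscCall(VecAssemblyBegin(x));
  PetscCall(VecAssemblyEnd(x));
  PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
  PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
  PetscFunctionReturn(PETSC_SUCCESS);
}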
On Wed, Dec 6, 2023 at 11:12 AM Junchao Zhang <[email protected]> wrote:

Hi, Sreeram,
  I made an example with your approach. It worked fine, as you see from the output at the end.

#include <petscvec.h>
int main(int argc, char **argv)
{
  PetscInt    i, j, rstart, rend, n, N, *indices;
  PetscMPIInt size, rank;
  IS          ix;
  VecScatter  vscat;
  Vec         x, y;

  PetscFunctionBeginUser;
  PetscCall(PetscInitialize(&argc, &argv, (char *)0, NULL));
  PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));
  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));

  PetscCall(VecCreate(PETSC_COMM_WORLD, &x));
  PetscCall(VecSetFromOptions(x));
  PetscCall(PetscObjectSetName((PetscObject)x, "Vec X"));
  n = (rank < 4) ? 9 : 0;
  PetscCall(VecSetSizes(x, n, PETSC_DECIDE));

  PetscCall(VecGetOwnershipRange(x, &rstart, &rend));
  for (i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, (PetscScalar)i, INSERT_VALUES));
  PetscCall(VecAssemblyBegin(x));
  PetscCall(VecAssemblyEnd(x));
  PetscCall(VecGetSize(x, &N));

  PetscCall(VecCreate(PETSC_COMM_WORLD, &y));
  PetscCall(VecSetFromOptions(y));
  PetscCall(PetscObjectSetName((PetscObject)y, "Vec Y"));
  PetscCall(VecSetSizes(y, PETSC_DECIDE, N));

  PetscCall(VecGetOwnershipRange(y, &rstart, &rend));
  PetscCall(PetscMalloc1(rend - rstart, &indices));
  for (i = rstart, j = 0; i < rend; i++, j++) indices[j] = rank + size * j;

  PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, indices, PETSC_OWN_POINTER, &ix));
  PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat));

  PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
  PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));

  PetscCall(ISView(ix, PETSC_VIEWER_STDOUT_WORLD));
  PetscCall(VecView(x, PETSC_VIEWER_STDERR_WORLD));
  PetscCall(VecView(y, PETSC_VIEWER_STDERR_WORLD));

  PetscCall(VecScatterDestroy(&vscat));
  PetscCall(ISDestroy(&ix));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&y));
  PetscCall(PetscFinalize());
  return 0;
}

$ mpirun -n 12 ./ex100
IS Object: 12 MPI processes
  type: general
[0] Number of indices in set 3
[0] 0 0
[0] 1 12
[0] 2 24
[1] Number of indices in set 3
[1] 0 1
[1] 1 13
[1] 2 25
[2] Number of indices in set 3
[2] 0 2
[2] 1 14
[2] 2 26
[3] Number of indices in set 3
[3] 0 3
[3] 1 15
[3] 2 27
[4] Number of indices in set 3
[4] 0 4
[4] 1 16
[4] 2 28
[5] Number of indices in set 3
[5] 0 5
[5] 1 17
[5] 2 29
[6] Number of indices in set 3
[6] 0 6
[6] 1 18
[6] 2 30
[7] Number of indices in set 3
[7] 0 7
[7] 1 19
[7] 2 31
[8] Number of indices in set 3
[8] 0 8
[8] 1 20
[8] 2 32
[9] Number of indices in set 3
[9] 0 9
[9] 1 21
[9] 2 33
[10] Number of indices in set 3
[10] 0 10
[10] 1 22
[10] 2 34
[11] Number of indices in set 3
[11] 0 11
[11] 1 23
[11] 2 35
Vec Object: Vec X 12 MPI processes
  type: mpi
Process [0]
0.
1.
2.
3.
4.
5.
6.
7.
8.
Process [1]
9.
10.
11.
12.
13.
14.
15.
16.
17.
Process [2]
18.
19.
20.
21.
22.
23.
24.
25.
26.
Process [3]
27.
28.
29.
30.
31.
32.
33.
34.
35.
Process [4]
Process [5]
Process [6]
Process [7]
Process [8]
Process [9]
Process [10]
Process [11]
Vec Object: Vec Y 12 MPI processes
  type: mpi
Process [0]
0.
12.
24.
Process [1]
1.
13.
25.
Process [2]
2.
14.
26.
Process [3]
3.
15.
27.
Process [4]
4.
16.
28.
Process [5]
5.
17.
29.
Process [6]
6.
18.
30.
Process [7]
7.
19.
31.
Process [8]
8.
20.
32.
Process [9]
9.
21.
33.
Process [10]
10.
22.
34.
Process [11]
11.
23.
35.

--Junchao Zhang
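As an aside, since the indices each rank requests in this example form an arithmetic progression (rank, rank + size, rank + 2*size, ...), the same index set could also be expressed with ISCreateStride() and no explicit index array. A minimal sketch of that alternative, reusing the variables from the example above in place of its PetscMalloc1()/ISCreateGeneral() block:

/* Equivalent to indices[j] = rank + size * j above:
   a strided IS with first = rank and step = size. */
PetscCall(VecGetOwnershipRange(y, &rstart, &rend));
PetscCall(ISCreateStride(PETSC_COMM_WORLD, rend - rstart, rank, size, &ix));
PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat));

This only works because every rank's target indices happen to be evenly strided; for an irregular permutation, ISCreateGeneral() remains the general tool.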
On Tue, Dec 5, 2023 at 10:09 PM Sreeram R Venkat <[email protected]> wrote:

Yes, I have an example code at github.com/s769/petsc-test. One thing, though: when I described the example before, I simplified the actual use case in the code. Here are the extra details relevant to this code:

- We assume a 2D processor grid, given by the command-line args -proc_rows and -proc_cols.
- The total length of the vector is n_m*n_t, given by the command-line args -nm and -nt. n_m corresponds to a space index and n_t to a time index.
- In the "Start" phase, the vector is divided into n_m blocks, each of size n_t (indexed space->time). The blocks are partitioned over the first row of processors: each processor in the first row gets n_m_local blocks of size n_t, where the n_m_locals over the first row of processors sum to n_m. For example, if -nm is 4 and -proc_cols is 4, each processor in the first row gets one block of size n_t.
- In the "Finish" phase, the vector is divided into n_t blocks, each of size n_m (indexed time->space; this is the reason for the permutation of indices). The blocks are partitioned over all processors: each processor gets n_t_local blocks of size n_m, where the n_t_locals over all processors sum to n_t.

I think the basic idea is similar to the previous example, but these details make things a bit more complicated. Please let me know if anything is unclear, and I can try to explain more.

Thanks for your help,
Sreeram

On Tue, Dec 5, 2023 at 9:30 PM Junchao Zhang <[email protected]> wrote:

I think your approach is correct. Do you have an example code?

--Junchao Zhang

On Tue, Dec 5, 2023 at 5:15 PM Sreeram R Venkat <[email protected]> wrote:

Hi, I have a follow-up question on this.

Now I'm trying to do a scatter and permutation of the vector. Under the same setup as the original example, here are the new Start and Finish states I want to achieve:

Start                     Finish
Proc | Entries            Proc | Entries
 0   | 0,...,8             0   | 0, 12, 24
 1   | 9,...,17            1   | 1, 13, 25
 2   | 18,...,26           2   | 2, 14, 26
 3   | 27,...,35           3   | 3, 15, 27
 4   | None                4   | 4, 16, 28
 5   | None                5   | 5, 17, 29
 6   | None                6   | 6, 18, 30
 7   | None                7   | 7, 19, 31
 8   | None                8   | 8, 20, 32
 9   | None                9   | 9, 21, 33
10   | None               10   | 10, 22, 34
11   | None               11   | 11, 23, 35

So far, I've tried to use ISCreateGeneral(), with each process giving an idx array corresponding to the indices it wants (i.e., idx on P0 is [0, 12, 24], on P1 it is [1, 13, 25], and so on). Then I use this to create the VecScatter with VecScatterCreate(x, is, y, NULL, &scatter).

However, when I try to do the scatter, I get some illegal memory access errors.

Is there something wrong with how I define the index sets?

Thanks,
Sreeram
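For the more general reordering described above, where a space-major vector (n_m blocks of length n_t, global index i_m*n_t + i_t) is rearranged into a time-major vector (n_t blocks of length n_m, global index i_t*n_m + i_m), the source index of every locally owned target entry can be computed directly from its global index. A minimal sketch of building such a scatter follows; the helper name is illustrative and this is not the code in github.com/s769/petsc-test — it only assumes x and y have already been created with the desired source and target layouts.

/* Sketch: scatter a space-major vector x (global index i_m*n_t + i_t) into a
   time-major vector y (global index i_t*n_m + i_m). The layouts of x and y
   are taken as given; only the index mapping is constructed here. */
static PetscErrorCode CreateTransposeScatter(Vec x, Vec y, PetscInt n_m, PetscInt n_t, VecScatter *vscat)
{
  PetscInt rstart, rend, nlocal, g, j, *indices;
  IS       ix;

  PetscFunctionBeginUser;
  PetscCall(VecGetOwnershipRange(y, &rstart, &rend));
  nlocal = rend - rstart;
  PetscCall(PetscMalloc1(nlocal, &indices));
  for (g = rstart, j = 0; g < rend; g++, j++) {
    PetscInt i_t = g / n_m;         /* time index of this target entry  */
    PetscInt i_m = g % n_m;         /* space index of this target entry */
    indices[j]   = i_m * n_t + i_t; /* its location in the space-major source */
  }
  PetscCall(ISCreateGeneral(PetscObjectComm((PetscObject)x), nlocal, indices, PETSC_OWN_POINTER, &ix));
  /* Passing NULL for the y-side IS means "y's own ownership range, in order" */
  PetscCall(VecScatterCreate(x, ix, y, NULL, vscat));
  PetscCall(ISDestroy(&ix));
  PetscFunctionReturn(PETSC_SUCCESS);
}

With n_m = 3 and n_t = 12, this reproduces the Finish layout in the table above ([0, 12, 24] on P0, [1, 13, 25] on P1, and so on).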
On Thu, Oct 5, 2023 at 12:57 PM Sreeram R Venkat <[email protected]> wrote:

Thank you. This works for me.

Sreeram

On Wed, Oct 4, 2023 at 6:41 PM Junchao Zhang <[email protected]> wrote:

Hi, Sreeram,
  You can try this code. Since x and y are both MPI vectors, we just need to say we want to scatter x[0:N] to y[0:N]. With your example, the 12 index sets on the 12 processes would be [0..8], [9..17], [18..26], [27..35], [], ..., []. Actually, you can split it arbitrarily, say, with the 12 index sets [0..17], [18..35], [], ..., []; PETSc will figure out how to do the communication.

PetscInt   rstart, rend, N;
IS         ix;
VecScatter vscat;
Vec        y;
MPI_Comm   comm;
VecType    type;

PetscObjectGetComm((PetscObject)x, &comm);
VecGetType(x, &type);
VecGetSize(x, &N);
VecGetOwnershipRange(x, &rstart, &rend);

VecCreate(comm, &y);
VecSetSizes(y, PETSC_DECIDE, N);
VecSetType(y, type);

ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, rstart, 1, &ix);
VecScatterCreate(x, ix, y, ix, &vscat);

--Junchao Zhang

On Wed, Oct 4, 2023 at 6:03 PM Sreeram R Venkat <[email protected]> wrote:

Suppose I am running on 12 processors, and I have a vector "v" of size 36 partitioned over the first 4. v still uses PETSC_COMM_WORLD, so it has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now I would like to repartition it over all 12 processors, so that the layout becomes (3, 3, 3, ..., 3). I've been trying to use VecScatter to do this, but I'm not sure what index sets to use for the sender and receiver.

The result I am trying to achieve is this (assume the vector is v = <0, 1, 2, ..., 35>):

Start                     Finish
Proc | Entries            Proc | Entries
 0   | 0,...,8             0   | 0, 1, 2
 1   | 9,...,17            1   | 3, 4, 5
 2   | 18,...,26           2   | 6, 7, 8
 3   | 27,...,35           3   | 9, 10, 11
 4   | None                4   | 12, 13, 14
 5   | None                5   | 15, 16, 17
 6   | None                6   | 18, 19, 20
 7   | None                7   | 21, 22, 23
 8   | None                8   | 24, 25, 26
 9   | None                9   | 27, 28, 29
10   | None               10   | 30, 31, 32
11   | None               11   | 33, 34, 35

Appreciate any help you can provide on this.

Thanks,
Sreeram
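For completeness, the snippet in the reply above stops after VecScatterCreate(); a sketch of the remaining calls that actually move the data and release the temporaries, using INSERT_VALUES and SCATTER_FORWARD as in the complete example further up the thread:

/* Carry out the redistribution prepared by VecScatterCreate(x, ix, y, ix, &vscat) */
PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));

/* The index set and scatter context can be freed once the data has moved */
PetscCall(ISDestroy(&ix));
PetscCall(VecScatterDestroy(&vscat));

Afterwards y holds the same 36 values in the same global order, but laid out as (3, 3, ..., 3) across the 12 ranks, which is the Finish column in the table above.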
