Yes, I have example code at github.com/s769/petsc-test. One thing to note: when I described the example before, I simplified the actual use case to keep things simple. Here are the extra details relevant to this code:

- We assume a 2D processor grid, given by the command-line args -proc_rows and -proc_cols.
- The total length of the vector is n_m*n_t, given by the command-line args -nm and -nt. n_m corresponds to a space index and n_t to a time index.
- In the "Start" phase, the vector is divided into n_m blocks, each of size n_t (indexed space->time). The blocks are partitioned over the first row of processors: each processor in the first row gets n_m_local blocks of size n_t, where the n_m_local values across the first row sum to n_m. For example, if -nm is 4 and -proc_cols is 4, each processor in the first row gets one block of size n_t.
- In the "Finish" phase, the vector is divided into n_t blocks, each of size n_m (indexed time->space; this is the reason for the permutation of indices). The blocks are partitioned over all processors: each processor gets n_t_local blocks of size n_m, where the n_t_local values across all processors sum to n_t. A sketch of this index computation follows the list.
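To make the Finish-phase partitioning concrete, the index set for the scatter could be computed along these lines (a sketch only: x is the Start-phase vector, and t_start / n_t_local are placeholder names for each rank's share of the n_t time blocks, not necessarily the names used in the repository):

    /* Sketch: entry (m, t) sits at position m*n_t + t in the Start ordering
       and at t*n_m + m in the Finish ordering. Each rank lists, in Start
       order, the entries it should own under the Finish layout. */
    PetscInt  *idx, k = 0, nloc = n_t_local * n_m;
    IS         ix;
    Vec        y;
    VecScatter vscat;

    PetscMalloc1(nloc, &idx);
    for (PetscInt t = t_start; t < t_start + n_t_local; t++)
      for (PetscInt m = 0; m < n_m; m++) idx[k++] = m * n_t + t;

    VecCreateMPI(PETSC_COMM_WORLD, nloc, n_m * n_t, &y);   /* Finish layout */
    ISCreateGeneral(PETSC_COMM_WORLD, nloc, idx, PETSC_OWN_POINTER, &ix);
    VecScatterCreate(x, ix, y, NULL, &vscat);              /* NULL iy: y's own layout */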
I think the basic idea is similar to the previous example, but these details make things a bit more complicated. Please let me know if anything is unclear, and I can try to explain more.

Thanks for your help,
Sreeram

On Tue, Dec 5, 2023 at 9:30 PM Junchao Zhang <[email protected]> wrote:

> I think your approach is correct. Do you have an example code?
>
> --Junchao Zhang
>
>
> On Tue, Dec 5, 2023 at 5:15 PM Sreeram R Venkat <[email protected]> wrote:
>
>> Hi, I have a follow up question on this.
>>
>> Now, I'm trying to do a scatter and permutation of the vector. Under the
>> same setup as the original example, here are the new Start and Finish
>> states I want to achieve:
>>
>>    Start                  Finish
>>    Proc | Entries         Proc | Entries
>>     0   | 0,...,8          0   | 0, 12, 24
>>     1   | 9,...,17         1   | 1, 13, 25
>>     2   | 18,...,26        2   | 2, 14, 26
>>     3   | 27,...,35        3   | 3, 15, 27
>>     4   | None             4   | 4, 16, 28
>>     5   | None             5   | 5, 17, 29
>>     6   | None             6   | 6, 18, 30
>>     7   | None             7   | 7, 19, 31
>>     8   | None             8   | 8, 20, 32
>>     9   | None             9   | 9, 21, 33
>>    10   | None            10   | 10, 22, 34
>>    11   | None            11   | 11, 23, 35
>>
>> So far, I've tried to use ISCreateGeneral(), with each process giving an
>> idx array corresponding to the indices it wants (i.e. idx on P0 looks
>> like [0, 12, 24], on P1 like [1, 13, 25], and so on). Then I use this to
>> create the VecScatter with VecScatterCreate(x, is, y, NULL, &scatter).
>>
>> However, when I try to do the scatter, I get some illegal memory access
>> errors.
>>
>> Is there something wrong with how I define the index sets?
>>
>> Thanks,
>> Sreeram
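A note on the construction above (an editorial guess, not a diagnosis confirmed in the thread): per the VecScatterCreate man page, a NULL destination index set means y is filled in processor order, so each rank's source index set must have exactly as many entries as y's local size. That suggests y here should be created with the Finish layout (3 local entries per rank) rather than inheriting x's (9, 9, 9, 9, 0, ..., 0) layout. A minimal sketch under that assumption, with x as the source vector of size 36 on 12 ranks:

    /* Sketch: rank r of 12 receives entries r, r+12, r+24 of x.
       y is created with 3 local entries to match the 3 requested indices. */
    PetscMPIInt rank;
    PetscInt    idx[3];
    IS          is;
    Vec         y;
    VecScatter  scatter;

    MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
    for (PetscInt j = 0; j < 3; j++) idx[j] = rank + 12 * j;

    VecCreateMPI(PETSC_COMM_WORLD, 3, 36, &y);
    ISCreateGeneral(PETSC_COMM_WORLD, 3, idx, PETSC_COPY_VALUES, &is);
    VecScatterCreate(x, is, y, NULL, &scatter);
    VecScatterBegin(scatter, x, y, INSERT_VALUES, SCATTER_FORWARD);
    VecScatterEnd(scatter, x, y, INSERT_VALUES, SCATTER_FORWARD);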
>> On Thu, Oct 5, 2023 at 12:57 PM Sreeram R Venkat <[email protected]> wrote:
>>
>>> Thank you. This works for me.
>>>
>>> Sreeram
>>>
>>> On Wed, Oct 4, 2023 at 6:41 PM Junchao Zhang <[email protected]> wrote:
>>>
>>>> Hi, Sreeram,
>>>> You can try this code. Since x, y are both MPI vectors, we just need
>>>> to say we want to scatter x[0:N] to y[0:N]. The 12 index sets with your
>>>> example on the 12 processes would be [0..8], [9..17], [18..26],
>>>> [27..35], [], ..., []. Actually, you can do it arbitrarily, say, with
>>>> 12 index sets [0..17], [18..35], [], ..., []. PETSc will figure out
>>>> how to do the communication.
>>>>
>>>> PetscInt   rstart, rend, N;
>>>> IS         ix;
>>>> VecScatter vscat;
>>>> Vec        y;
>>>> MPI_Comm   comm;
>>>> VecType    type;
>>>>
>>>> PetscObjectGetComm((PetscObject)x, &comm);
>>>> VecGetType(x, &type);
>>>> VecGetSize(x, &N);
>>>> VecGetOwnershipRange(x, &rstart, &rend);
>>>>
>>>> VecCreate(comm, &y);
>>>> VecSetSizes(y, PETSC_DECIDE, N);
>>>> VecSetType(y, type);
>>>>
>>>> ISCreateStride(PetscObjectComm((PetscObject)x), rend - rstart, rstart, 1, &ix);
>>>> VecScatterCreate(x, ix, y, ix, &vscat);
>>>>
>>>> --Junchao Zhang
>>>>
>>>>
>>>> On Wed, Oct 4, 2023 at 6:03 PM Sreeram R Venkat <[email protected]> wrote:
>>>>
>>>>> Suppose I am running on 12 processors, and I have a vector "v" of
>>>>> size 36 partitioned over the first 4. v still uses PETSC_COMM_WORLD,
>>>>> so it has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to
>>>>> repartition it over all 12 processors, so that the layout becomes
>>>>> (3, 3, 3, ..., 3). I've been trying to use VecScatter to do this, but
>>>>> I'm not sure what IndexSets to use for the sender and receiver.
>>>>>
>>>>> The result I am trying to achieve is this:
>>>>>
>>>>> Assume the vector is v = <0, 1, 2, ..., 35>
>>>>>
>>>>>    Start                  Finish
>>>>>    Proc | Entries         Proc | Entries
>>>>>     0   | 0,...,8          0   | 0, 1, 2
>>>>>     1   | 9,...,17         1   | 3, 4, 5
>>>>>     2   | 18,...,26        2   | 6, 7, 8
>>>>>     3   | 27,...,35        3   | 9, 10, 11
>>>>>     4   | None             4   | 12, 13, 14
>>>>>     5   | None             5   | 15, 16, 17
>>>>>     6   | None             6   | 18, 19, 20
>>>>>     7   | None             7   | 21, 22, 23
>>>>>     8   | None             8   | 24, 25, 26
>>>>>     9   | None             9   | 27, 28, 29
>>>>>    10   | None            10   | 30, 31, 32
>>>>>    11   | None            11   | 33, 34, 35
>>>>>
>>>>> Appreciate any help you can provide on this.
>>>>>
>>>>> Thanks,
>>>>> Sreeram
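For completeness: the quoted snippet creates the scatter but does not run it. Below is a sketch of the whole recipe driven end to end. The construction of x with the (9, 9, 9, 9, 0, ..., 0) layout is an assumption made to match the example (it requires exactly 12 ranks) and is not code from the thread:

    #include <petscvec.h>

    int main(int argc, char **argv)
    {
      Vec         x, y;
      IS          ix;
      VecScatter  vscat;
      PetscMPIInt rank;
      PetscInt    rstart, rend, N = 36;

      PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
      PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));

      /* Start layout: ranks 0-3 own 9 entries each, the rest own none */
      PetscCall(VecCreateMPI(PETSC_COMM_WORLD, rank < 4 ? 9 : 0, N, &x));
      PetscCall(VecGetOwnershipRange(x, &rstart, &rend));
      for (PetscInt i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, (PetscScalar)i, INSERT_VALUES));
      PetscCall(VecAssemblyBegin(x));
      PetscCall(VecAssemblyEnd(x));

      /* Finish layout: PETSC_DECIDE gives 3 entries per rank on 12 ranks */
      PetscCall(VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, N, &y));

      /* Each rank lists the range it already owns in x; PETSc works out
         the communication needed to land those entries in y's layout */
      PetscCall(ISCreateStride(PETSC_COMM_WORLD, rend - rstart, rstart, 1, &ix));
      PetscCall(VecScatterCreate(x, ix, y, ix, &vscat));
      PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
      PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));

      PetscCall(VecView(y, PETSC_VIEWER_STDOUT_WORLD));

      PetscCall(ISDestroy(&ix));
      PetscCall(VecScatterDestroy(&vscat));
      PetscCall(VecDestroy(&x));
      PetscCall(VecDestroy(&y));
      PetscCall(PetscFinalize());
      return 0;
    }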
