The reason I want to use a shell matrix is to reduce the memory footprint. If I create a PETSc matrix and use MUMPS, I understand PETSc will create another copy of the matrix for MUMPS. Is there any way to avoid the extra copy for MUMPS?
On Tue, Mar 22, 2022 at 12:53 PM Sam Guo <[email protected]> wrote:

> Barry,
>    Thanks for the illustration. Is there an easy way to mimic the
> implementation using a shell matrix? I have been studying how the sMvctx
> is created and it seems pretty involved.
>
> Thanks,
> Sam
>
> On Mon, Mar 21, 2022 at 2:48 PM Barry Smith <[email protected]> wrote:
>
>> On Mar 21, 2022, at 4:36 PM, Sam Guo <[email protected]> wrote:
>>
>> Barry,
>>    Thanks. Could you elaborate? I am trying to implement the matrix-vector
>> multiplication for a symmetric matrix using a shell matrix.
>>
>>
>>   Consider, with three ranks,
>>
>>     (a)   ( A  B  D ) (x)
>>     (b) = ( B' C  E ) (y)
>>     (c)   ( D' E' F ) (w)
>>
>>   Only the blocks without the ' are stored on the rank; so, for example,
>> B is stored on rank 0.
>>
>>   Rank 0 computes A x and keeps it in a. Rank 1 computes C y and keeps it
>> in b. Rank 2 computes F w and keeps it in c.
>>
>>   Rank 0 computes B'x and D'x. It puts the nonzero entries of these
>> values, as well as the values of x, into slvec0.
>>
>>   Rank 1 computes E'y and puts the nonzero entries, as well as the values
>> of y, into slvec0.
>>
>>   Rank 2 puts the values of w needed by the other ranks into slvec0.
>>
>>   Rank 0 does B y_h + D w_h, where it gets the y_h and w_h values from
>> slvec1, and adds it to a.
>>
>>   Rank 1 takes the B'x from slvec1 and adds it to b; it then takes the
>> E w_h values, where the w_h are pulled from slvec1, and adds them to b.
>>
>>   Rank 2 takes the D'x and E'y from slvec1 and adds them to c.
>>
>> Thanks,
>> Sam
>>
>> On Mon, Mar 21, 2022 at 12:56 PM Barry Smith <[email protected]> wrote:
>>
>>>   The "trick" is that though "more" communication is needed to complete
>>> the product, the communication can still be done in a single VecScatter
>>> instead of two separate calls to VecScatter. We simply pack both pieces
>>> of information that need to be sent into a single vector.
>>>
>>>   /* copy x into the vec slvec0 */
>>>   VecGetArray(a->slvec0,&from);
>>>   VecGetArrayRead(xx,&x);
>>>   PetscArraycpy(from,x,bs*mbs);
>>>   VecRestoreArray(a->slvec0,&from);
>>>   VecRestoreArrayRead(xx,&x);
>>>
>>>   VecScatterBegin(a->sMvctx,a->slvec0,a->slvec1,ADD_VALUES,SCATTER_FORWARD);
>>>   VecScatterEnd(a->sMvctx,a->slvec0,a->slvec1,ADD_VALUES,SCATTER_FORWARD);
>>>
>>>   If you create two symmetric matrices, one with SBAIJ and one with BAIJ,
>>> and compare the time to do the product, you will find that SBAIJ is not
>>> significantly slower but does save memory.
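A minimal sketch of the MATSHELL wiring such a custom symmetric product would plug into; the names UserCtx, UserMult, and CreateShell are illustrative, not PETSc API. Here the callback simply delegates to an already assembled matrix held in the context; a memory-saving implementation would replace that single MatMult() call with its own storage of the diagonal and upper off-diagonal blocks plus the slvec0/slvec1-style packing and single VecScatter described above.

#include <petscmat.h>

/* Illustrative user context: here it only wraps an assembled matrix P.
   A real memory-saving shell would instead hold the locally stored
   diagonal block, the upper off-diagonal block, the work vectors, and
   the VecScatter used for the halo exchange. */
typedef struct {
  Mat P;
} UserCtx;

/* MatMult callback with the signature required for MATOP_MULT */
static PetscErrorCode UserMult(Mat S, Vec x, Vec y)
{
  UserCtx        *ctx;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatShellGetContext(S, &ctx);CHKERRQ(ierr);
  /* Placeholder: delegate to the stored matrix. Replace with the custom
     symmetric product (diagonal block, B'x, pack into one work vector,
     a single VecScatter, then B applied to the received halo values). */
  ierr = MatMult(ctx->P, x, y);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Create a shell matrix with the same layout as P and register the
   multiplication callback */
static PetscErrorCode CreateShell(Mat P, Mat *S)
{
  UserCtx        *ctx;
  PetscInt       m, n, M, N;
  MPI_Comm       comm;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscNew(&ctx);CHKERRQ(ierr);
  ctx->P = P;
  ierr = PetscObjectGetComm((PetscObject)P, &comm);CHKERRQ(ierr);
  ierr = MatGetLocalSize(P, &m, &n);CHKERRQ(ierr);
  ierr = MatGetSize(P, &M, &N);CHKERRQ(ierr);
  ierr = MatCreateShell(comm, m, n, M, N, ctx, S);CHKERRQ(ierr);
  ierr = MatShellSetOperation(*S, MATOP_MULT, (void (*)(void))UserMult);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The resulting shell matrix can then be handed to any PETSc interface that only needs MatMult().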
>>> On Mar 21, 2022, at 3:26 PM, Sam Guo <[email protected]> wrote:
>>>
>>> Using the following example from the MatCreateSBAIJ documentation:
>>>
>>>            0 1 2 3 4 5 6 7 8 9 10 11
>>>           --------------------------
>>>    row 3  |. . . d d d o o o o  o  o
>>>    row 4  |. . . d d d o o o o  o  o
>>>    row 5  |. . . d d d o o o o  o  o
>>>           --------------------------
>>>
>>> On a processor that owns rows 3, 4 and 5, the rows 0-2 info is still
>>> needed. Is it that the processor that owns rows 0-2 will apply B
>>> symmetrically and then send the result to the processor that owns rows 3-5?
>>>
>>> On Mon, Mar 21, 2022 at 12:14 PM Mark Adams <[email protected]> wrote:
>>>
>>>> PETSc stores parallel matrices as two serial matrices: one for the
>>>> diagonal (d or A) block and one for the rest (o or B).
>>>> I would guess that for symmetric matrices it has a symmetric matrix for
>>>> the diagonal and a full AIJ matrix for the (upper) off-diagonal.
>>>> So the multtranspose is applying B symmetrically. This lower
>>>> off-diagonal and the diagonal block can be done without communication.
>>>> Then the off-processor values are collected, and the upper off-diagonal
>>>> is applied.
>>>>
>>>> On Mon, Mar 21, 2022 at 2:35 PM Sam Guo <[email protected]> wrote:
>>>>
>>>>> I am most interested in how the lower triangular part is
>>>>> redistributed. It seems that SBAIJ saves memory but requires more
>>>>> communication than BAIJ.
>>>>>
>>>>> On Mon, Mar 21, 2022 at 11:27 AM Sam Guo <[email protected]> wrote:
>>>>>
>>>>>> Mark, thanks for the quick response. I am more interested in the
>>>>>> parallel implementation of MatMult for SBAIJ. I found the following:
>>>>>>
>>>>>> PetscErrorCode MatMult_MPISBAIJ(Mat A,Vec xx,Vec yy)
>>>>>> {
>>>>>>   Mat_MPISBAIJ      *a = (Mat_MPISBAIJ*)A->data;
>>>>>>   PetscErrorCode    ierr;
>>>>>>   PetscInt          mbs=a->mbs,bs=A->rmap->bs;
>>>>>>   PetscScalar       *from;
>>>>>>   const PetscScalar *x;
>>>>>>
>>>>>>   /* diagonal part */
>>>>>>   (*a->A->ops->mult)(a->A,xx,a->slvec1a);
>>>>>>   VecSet(a->slvec1b,0.0);
>>>>>>
>>>>>>   /* subdiagonal part */
>>>>>>   (*a->B->ops->multtranspose)(a->B,xx,a->slvec0b);
>>>>>>
>>>>>>   /* copy x into the vec slvec0 */
>>>>>>   VecGetArray(a->slvec0,&from);
>>>>>>   VecGetArrayRead(xx,&x);
>>>>>>   PetscArraycpy(from,x,bs*mbs);
>>>>>>   VecRestoreArray(a->slvec0,&from);
>>>>>>   VecRestoreArrayRead(xx,&x);
>>>>>>
>>>>>>   VecScatterBegin(a->sMvctx,a->slvec0,a->slvec1,ADD_VALUES,SCATTER_FORWARD);
>>>>>>   VecScatterEnd(a->sMvctx,a->slvec0,a->slvec1,ADD_VALUES,SCATTER_FORWARD);
>>>>>>
>>>>>>   /* superdiagonal part */
>>>>>>   (*a->B->ops->multadd)(a->B,a->slvec1b,a->slvec1a,yy);
>>>>>>   return(0);
>>>>>> }
>>>>>>
>>>>>> I am trying to understand the algorithm.
>>>>>>
>>>>>> Thanks,
>>>>>> Sam
>>>>>>
>>>>>> On Mon, Mar 21, 2022 at 11:14 AM Mark Adams <[email protected]> wrote:
>>>>>>
>>>>>>> This code looks fine to me, and the code is in
>>>>>>> src/mat/impls/sbaij/seq/sbaij2.c.
>>>>>>>
>>>>>>> On Mon, Mar 21, 2022 at 2:02 PM Sam Guo <[email protected]> wrote:
>>>>>>>
>>>>>>>> Dear PETSc dev team,
>>>>>>>>    The documentation for MatCreateSBAIJ has the following:
>>>>>>>> "It is recommended that one use the MatCreate(), MatSetType() and/or
>>>>>>>> MatSetFromOptions(), MatXXXXSetPreallocation() paradigm instead of this
>>>>>>>> routine directly. [MatXXXXSetPreallocation() is, for example,
>>>>>>>> MatSeqAIJSetPreallocation.]"
>>>>>>>> I currently call MatCreateSBAIJ directly as follows:
>>>>>>>>
>>>>>>>> MatCreateSBAIJ (with d_nnz and o_nnz)
>>>>>>>> MatSetValues (to add row by row)
>>>>>>>> MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
>>>>>>>> MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);
>>>>>>>> MatSetOption(A, MAT_SYMMETRIC, PETSC_TRUE);
>>>>>>>>
>>>>>>>> Two questions:
>>>>>>>> (1) I am wondering whether what I am doing is the most efficient.
>>>>>>>> (2) I am trying to find out how the matrix-vector multiplication is
>>>>>>>> implemented in PETSc for SBAIJ storage.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sam
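Regarding question (1), a minimal sketch of the recommended MatCreate()/MatSetType()/MatXXXXSetPreallocation() paradigm for this case, assuming the block size bs, the local and global row sizes, and the per-block-row counts d_nnz/o_nnz (upper triangular part only, as with MatCreateSBAIJ) are already available from the application; the helper name CreateSymmetricMatrix is illustrative.

#include <petscmat.h>

/* Illustrative helper: create an SBAIJ matrix with the recommended
   paradigm instead of calling MatCreateSBAIJ() directly. bs, mlocal
   (local rows), Mglobal (global rows), d_nnz and o_nnz (block nonzeros
   in the upper triangular part, per block row) are assumed given. */
PetscErrorCode CreateSymmetricMatrix(MPI_Comm comm, PetscInt bs,
                                     PetscInt mlocal, PetscInt Mglobal,
                                     const PetscInt d_nnz[],
                                     const PetscInt o_nnz[], Mat *A)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatCreate(comm, A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A, mlocal, mlocal, Mglobal, Mglobal);CHKERRQ(ierr);
  ierr = MatSetType(*A, MATSBAIJ);CHKERRQ(ierr);   /* resolves to SeqSBAIJ or MPISBAIJ */
  ierr = MatSetFromOptions(*A);CHKERRQ(ierr);
  /* only the preallocation call that matches the actual type takes effect */
  ierr = MatSeqSBAIJSetPreallocation(*A, bs, PETSC_DEFAULT, d_nnz);CHKERRQ(ierr);
  ierr = MatMPISBAIJSetPreallocation(*A, bs, PETSC_DEFAULT, d_nnz, PETSC_DEFAULT, o_nnz);CHKERRQ(ierr);
  /* then MatSetValues() on the upper triangular part of each row,
     followed by MatAssemblyBegin/End(), exactly as in the current code */
  PetscFunctionReturn(0);
}

MatCreateSBAIJ() essentially performs the same sequence internally, so the existing direct call is not expected to be any less efficient; the main benefit of this paradigm is that the matrix type can be changed from the options database without touching the assembly code.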
