> I am thinking something like MatSeqAIJGetArrayAndMemType

Isn't the "MemType" of the matrix an invariant on creation? e.g. a user
shouldn't care what memtype a pointer is for, just that if a device matrix
was created it returns device pointers, and if a host matrix was created it
returns host pointers.
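(To illustrate that point in plain C: a memtype tag fixed at creation means an accessor can simply report where its arrays live, and a caller who only ever builds one kind of matrix can ignore it. The types and names below are hypothetical stand-ins that mimic the *shape* of PETSc's PetscMemType / MatSeqAIJGetCSRAndMemType idea; they are not the PETSc API.)

```c
#include <assert.h>

/* Hypothetical stand-ins, not PETSc types. */
typedef enum { MEMTYPE_HOST, MEMTYPE_CUDA, MEMTYPE_HIP } MemType;

typedef struct {
  int     n, nnz;
  int    *rowptr, *colidx;   /* CSR index arrays */
  double *val;               /* CSR values */
  MemType mtype;             /* fixed at creation: where the arrays live */
} CsrMat;

/* Mimics the shape of a GetCSRAndMemType-style accessor: hand back the
 * raw CSR pointers plus a tag saying which memory space they are in.
 * mtype may be NULL if the caller already knows (the invariant above). */
static void csr_get_arrays(CsrMat *A, const int **i, const int **j,
                           double **a, MemType *mtype)
{
  *i = A->rowptr;
  *j = A->colidx;
  *a = A->val;
  if (mtype) *mtype = A->mtype;
}
```

With this shape, one accessor serves both host and device matrices: code that must dispatch (e.g. to a CUDA vs HIP backend) branches on the returned tag, and code that created the matrix itself can pass NULL for it.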
Now that I'm looking at those docs I see MatSeqAIJGetCSRAndMemType
<https://petsc.org/release/docs/manualpages/Mat/MatSeqAIJGetCSRAndMemType/>;
isn't this what I'm looking for? If I call MatCreateSeqAIJCUSPARSE, it will
cudaMalloc the CSR arrays, and then MatSeqAIJGetCSRAndMemType will return
me those raw device pointers?

On Thu, Jan 5, 2023 at 11:06 AM Junchao Zhang <[email protected]> wrote:

> On Thu, Jan 5, 2023 at 9:39 AM Mark Lohry <[email protected]> wrote:
>
>>> A workaround is to let petsc build the matrix and allocate the memory,
>>> then you call MatSeqAIJCUSPARSEGetArray() to get the array and fill it
>>> up.
>>
>> Junchao, looking at the code for this it seems to only return a pointer
>> to the value array, but not pointers to the column and row index arrays;
>> is that right?
>>
> Yes, that is correct.
> I am thinking something like MatSeqAIJGetArrayAndMemType(Mat A, const
> PetscInt **i, const PetscInt **j, PetscScalar **a, PetscMemType *mtype),
> which returns (a, i, j) on device and mtype = PETSC_MEMTYPE_{CUDA, HIP} if
> A is a device matrix, otherwise (a, i, j) on host and mtype =
> PETSC_MEMTYPE_HOST.
> We currently have similar things like
> VecGetArrayAndMemType(Vec,PetscScalar**,PetscMemType*), and I am adding
> MatDenseGetArrayAndMemType(Mat,PetscScalar**,PetscMemType*).
>
> It looks like you need (a, i, j) for assembly, but the above function
> only works for an assembled matrix.
>
>> On Thu, Jan 5, 2023 at 5:47 AM Jacob Faibussowitsch <[email protected]>
>> wrote:
>>
>>> We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP or NONE, but not both
>>>
>>> CUPM works with both enabled simultaneously; I don't think there are any
>>> direct restrictions for it. Vec at least was fully usable with both cuda
>>> and hip (though untested) last time I checked.
>>>
>>> Best regards,
>>>
>>> Jacob Faibussowitsch
>>> (Jacob Fai - booss - oh - vitch)
>>>
>>> On Jan 5, 2023, at 00:09, Junchao Zhang <[email protected]> wrote:
>>>
>>> On Wed, Jan 4, 2023 at 6:02 PM Matthew Knepley <[email protected]>
>>> wrote:
>>>
>>>> On Wed, Jan 4, 2023 at 6:49 PM Junchao Zhang <[email protected]>
>>>> wrote:
>>>>
>>>>> On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry <[email protected]> wrote:
>>>>>
>>>>>> Oh, is the device backend not known at compile time?
>>>>>>
>>>>> Currently it is known at compile time.
>>>>
>>>> Are you sure? I don't think it is known at compile time.
>>>>
>>> We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP or NONE, but not both
>>>
>>>> Thanks,
>>>>
>>>>    Matt
>>>>
>>>>>> Or multiple backends can be alive at once?
>>>>>>
>>>>> Some petsc developers (Jed and Barry) want to support this, but we are
>>>>> incapable now.
>>>>>
>>>>>> On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry <[email protected]> wrote:
>>>>>>>
>>>>>>>>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then
>>>>>>>>> we would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE
>>>>>>>>> on AMD GPUs, ...
>>>>>>>>
>>>>>>>> Wouldn't one function suffice? Assuming these are contiguous arrays
>>>>>>>> in CSR format, they're just raw device pointers in all cases.
>>>>>>>
>>>>>>> But we need to know what device it is (to dispatch to either
>>>>>>> petsc-CUDA or petsc-HIP backend).
>>>>>>>
>>>>>>>> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for
>>>>>>>>> GPUs.
>>>>>>>>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then
>>>>>>>>> we would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE
>>>>>>>>> on AMD GPUs, ...
>>>>>>>>>
>>>>>>>>> The real problem, I think, is dealing with multiple MPI ranks.
>>>>>>>>> Providing the split arrays for petsc MATMPIAIJ is not easy, and
>>>>>>>>> thus users are discouraged from doing so.
>>>>>>>>>
>>>>>>>>> A workaround is to let petsc build the matrix and allocate the
>>>>>>>>> memory, then you call MatSeqAIJCUSPARSEGetArray() to get the array
>>>>>>>>> and fill it up.
>>>>>>>>>
>>>>>>>>> We recently added routines to support matrix assembly on GPUs; see
>>>>>>>>> if MatSetValuesCOO
>>>>>>>>> <https://petsc.org/release/docs/manualpages/Mat/MatSetValuesCOO/>
>>>>>>>>> helps.
>>>>>>>>>
>>>>>>>>> --Junchao Zhang
>>>>>>>>>
>>>>>>>>> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I have a sparse matrix constructed in non-petsc code using a
>>>>>>>>>> standard CSR representation, where I compute the Jacobian to be
>>>>>>>>>> used in an implicit TS context. In the CPU world I call
>>>>>>>>>>
>>>>>>>>>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols,
>>>>>>>>>> rowidxptr, colidxptr, valptr, Jac);
>>>>>>>>>>
>>>>>>>>>> which as I understand it -- (1) never copies/allocates that
>>>>>>>>>> information, and the matrix Jac is just a non-owning view into the
>>>>>>>>>> already allocated CSR, and (2) I can write directly into the
>>>>>>>>>> original data structures and the Mat just "knows" about it,
>>>>>>>>>> although it still needs a call to MatAssemblyBegin/MatAssemblyEnd
>>>>>>>>>> after modifying the values. So far this works great with GAMG.
>>>>>>>>>>
>>>>>>>>>> I have the same CSR representation filled in GPU data allocated
>>>>>>>>>> with cudaMalloc and filled on-device.
>>>>>>>>>> Is there an equivalent Mat constructor for GPU arrays, or some
>>>>>>>>>> other way to avoid unnecessary copies?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Mark
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which
>>>> their experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> https://www.cse.buffalo.edu/~knepley/
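[For readers following along: the CSR triplet being passed around in this thread can be sketched in plain C with no PETSc dependency. This is purely illustrative; the only assumption is the standard CSR convention that the thread relies on, namely that rowidx has nrows+1 entries and rowidx[r]..rowidx[r+1] delimits row r's slice of colidx/val, which is the layout MatCreateSeqAIJWithArrays consumes.]

```c
#include <assert.h>

/* Sparse matrix-vector product y = A*x over raw CSR arrays, the same
 * three pointers (rowidx, colidx, val) discussed above. Whether those
 * pointers are host or device memory is invisible at this level, which
 * is exactly why the thread needs a memtype tag for dispatch. */
static void csr_spmv(int nrows, const int *rowidx, const int *colidx,
                     const double *val, const double *x, double *y)
{
  for (int r = 0; r < nrows; ++r) {
    double sum = 0.0;
    /* rowidx[r]..rowidx[r+1] delimit row r's nonzeros */
    for (int k = rowidx[r]; k < rowidx[r + 1]; ++k)
      sum += val[k] * x[colidx[k]];
    y[r] = sum;
  }
}
```

For example, the 2x2 matrix [[1, 2], [0, 3]] is rowidx = {0, 2, 3}, colidx = {0, 1, 1}, val = {1, 2, 3}: three nonzeros, two in row 0 and one in row 1.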
