> A workaround is to let petsc build the matrix and allocate the memory, then
> you call MatSeqAIJCUSPARSEGetArray() to get the array and fill it up.
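A rough outline of that workaround (untested; assumes a PETSc build with CUDA, a matrix `J` already created and preallocated as MATSEQAIJCUSPARSE, and a hypothetical user CUDA kernel `fill_values`):

```c
/* Untested outline: let PETSc allocate the MATSEQAIJCUSPARSE matrix,
   then write its nonzero values on-device. Only the value array is
   exposed -- the row/column index arrays stay inside the Mat. */
#include <petscmat.h>

PetscErrorCode FillOnDevice(Mat J)
{
  PetscScalar *d_vals; /* device pointer into J's CSR value array */

  PetscCall(MatSeqAIJCUSPARSEGetArray(J, &d_vals));
  /* fill_values<<<grid, block>>>(d_vals);  hypothetical CUDA kernel */
  PetscCall(MatSeqAIJCUSPARSERestoreArray(J, &d_vals));
  PetscCall(MatAssemblyBegin(J, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(J, MAT_FINAL_ASSEMBLY));
  return 0;
}
```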
Junchao, looking at the code for this it seems to only return a pointer to the
value array, but not pointers to the column and row index arrays, is that
right?

On Thu, Jan 5, 2023 at 5:47 AM Jacob Faibussowitsch <jacob....@gmail.com>
wrote:

>> We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP or NONE, but not both
>
> CUPM works with both enabled simultaneously, I don’t think there are any
> direct restrictions for it. Vec at least was fully usable with both cuda
> and hip (though untested) last time I checked.
>
> Best regards,
>
> Jacob Faibussowitsch
> (Jacob Fai - booss - oh - vitch)
>
> On Jan 5, 2023, at 00:09, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>
> On Wed, Jan 4, 2023 at 6:02 PM Matthew Knepley <knep...@gmail.com> wrote:
>
>> On Wed, Jan 4, 2023 at 6:49 PM Junchao Zhang <junchao.zh...@gmail.com>
>> wrote:
>>
>>> On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry <mlo...@gmail.com> wrote:
>>>
>>>> Oh, is the device backend not known at compile time?
>>>
>>> Currently it is known at compile time.
>>
>> Are you sure? I don't think it is known at compile time.
>
> We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP or NONE, but not both
>
>> Thanks,
>>
>>    Matt
>>
>>>> Or multiple backends can be alive at once?
>>>
>>> Some petsc developers (Jed and Barry) want to support this, but we are
>>> incapable now.
>>>
>>>> On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang <junchao.zh...@gmail.com>
>>>> wrote:
>>>>
>>>>> On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry <mlo...@gmail.com> wrote:
>>>>>
>>>>>>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
>>>>>>> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on
>>>>>>> AMD GPUs, ...
>>>>>>
>>>>>> Wouldn't one function suffice? Assuming these are contiguous arrays
>>>>>> in CSR format, they're just raw device pointers in all cases.
>>>>> But we need to know what device it is (to dispatch to either
>>>>> petsc-CUDA or petsc-HIP backend)
>>>>>
>>>>>> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang <junchao.zh...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for
>>>>>>> GPUs. Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but
>>>>>>> then we would need another for MATMPIAIJCUSPARSE, and then for
>>>>>>> HIPSPARSE on AMD GPUs, ...
>>>>>>>
>>>>>>> The real problem I think is to deal with multiple MPI ranks.
>>>>>>> Providing the split arrays for petsc MATMPIAIJ is not easy and thus
>>>>>>> is discouraged for users to do so.
>>>>>>>
>>>>>>> A workaround is to let petsc build the matrix and allocate the
>>>>>>> memory, then you call MatSeqAIJCUSPARSEGetArray() to get the array
>>>>>>> and fill it up.
>>>>>>>
>>>>>>> We recently added routines to support matrix assembly on GPUs, see if
>>>>>>> MatSetValuesCOO
>>>>>>> <https://petsc.org/release/docs/manualpages/Mat/MatSetValuesCOO/>
>>>>>>> helps
>>>>>>>
>>>>>>> --Junchao Zhang
>>>>>>>
>>>>>>> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry <mlo...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I have a sparse matrix constructed in non-petsc code using a
>>>>>>>> standard CSR representation where I compute the Jacobian to be used
>>>>>>>> in an implicit TS context. In the CPU world I call
>>>>>>>>
>>>>>>>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols,
>>>>>>>> rowidxptr, colidxptr, valptr, &Jac);
>>>>>>>>
>>>>>>>> which as I understand it -- (1) never copies/allocates that
>>>>>>>> information, and the matrix Jac is just a non-owning view into the
>>>>>>>> already allocated CSR, (2) I can write directly into the original
>>>>>>>> data structures and the Mat just "knows" about it, although it still
>>>>>>>> needs a call to MatAssemblyBegin/MatAssemblyEnd after modifying the
>>>>>>>> values. So far this works great with GAMG.
>>>>>>>> I have the same CSR representation filled in GPU data allocated
>>>>>>>> with cudaMalloc and filled on-device. Is there an equivalent Mat
>>>>>>>> constructor for GPU arrays, or some other way to avoid unnecessary
>>>>>>>> copies?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Mark
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> <http://www.cse.buffalo.edu/~knepley/>