On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry <mlo...@gmail.com> wrote:

>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
>> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
>> GPUs, ...
>
> Wouldn't one function suffice? Assuming these are contiguous arrays in CSR
> format, they're just raw device pointers in all cases.
>
But we need to know what device it is (to dispatch to either the petsc-CUDA
or the petsc-HIP backend).
> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang <junchao.zh...@gmail.com>
> wrote:
>
>> No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for GPUs.
>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we would
>> need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD GPUs, ...
>>
>> The real problem, I think, is dealing with multiple MPI ranks. Providing
>> the split arrays for petsc MATMPIAIJ is not easy, so we discourage users
>> from doing so.
>>
>> A workaround is to let petsc build the matrix and allocate the memory,
>> then call MatSeqAIJCUSPARSEGetArray() to get the array and fill it up.
>>
>> We recently added routines to support matrix assembly on GPUs; see if
>> MatSetValuesCOO
>> <https://petsc.org/release/docs/manualpages/Mat/MatSetValuesCOO/> helps.
>>
>> --Junchao Zhang
>>
>> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry <mlo...@gmail.com> wrote:
>>
>>> I have a sparse matrix constructed in non-petsc code using a standard
>>> CSR representation, where I compute the Jacobian to be used in an implicit
>>> TS context. In the CPU world I call
>>>
>>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols, rowidxptr,
>>> colidxptr, valptr, Jac);
>>>
>>> which as I understand it -- (1) never copies/allocates that information,
>>> and the matrix Jac is just a non-owning view into the already allocated
>>> CSR, and (2) I can write directly into the original data structures and
>>> the Mat just "knows" about it, although it still needs a call to
>>> MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far this
>>> works great with GAMG.
>>>
>>> I have the same CSR representation in GPU memory, allocated with
>>> cudaMalloc and filled on-device. Is there an equivalent Mat constructor
>>> for GPU arrays, or some other way to avoid unnecessary copies?
>>>
>>> Thanks,
>>> Mark
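
For illustration, here is a minimal, untested sketch of the COO assembly path
suggested above (MatSetPreallocationCOO followed by MatSetValuesCOO on a
MATAIJCUSPARSE matrix). The helper name BuildJacobianCOO and the variables
nrows, ncols, rowptr, colidx, and d_vals are hypothetical placeholders for the
application's own CSR data; d_vals is assumed to be a cudaMalloc'd value array
of length nnz that the application refills before each Jacobian update.

/* Untested sketch, not from the thread: GPU matrix assembly via the COO API. */
#include <petscmat.h>

static PetscErrorCode BuildJacobianCOO(MPI_Comm comm, PetscInt nrows, PetscInt ncols,
                                       const PetscInt *rowptr, const PetscInt *colidx,
                                       const PetscScalar *d_vals, Mat *Jac)
{
  PetscInt  nnz = rowptr[nrows];
  PetscInt *coo_i, *coo_j;

  PetscFunctionBeginUser;
  /* Expand the CSR row pointer into explicit per-nonzero row indices (host side) */
  PetscCall(PetscMalloc2(nnz, &coo_i, nnz, &coo_j));
  for (PetscInt r = 0; r < nrows; r++) {
    for (PetscInt k = rowptr[r]; k < rowptr[r + 1]; k++) {
      coo_i[k] = r;
      coo_j[k] = colidx[k];
    }
  }

  PetscCall(MatCreate(comm, Jac));
  PetscCall(MatSetSizes(*Jac, nrows, ncols, PETSC_DECIDE, PETSC_DECIDE));
  PetscCall(MatSetType(*Jac, MATAIJCUSPARSE)); /* MATAIJHIPSPARSE or MATAIJKOKKOS elsewhere */

  /* Hand PETSc the nonzero pattern once; the (i,j) arrays can be freed afterwards */
  PetscCall(MatSetPreallocationCOO(*Jac, nnz, coo_i, coo_j));
  PetscCall(PetscFree2(coo_i, coo_j));

  /* Whenever the Jacobian values change, pass the value array directly; for the
     CUSPARSE type the pointer is expected to be usable from device memory. */
  PetscCall(MatSetValuesCOO(*Jac, d_vals, INSERT_VALUES));
  PetscFunctionReturn(0);
}

The (i,j) pattern only needs to be supplied once; after that, each Jacobian
update reduces to a single MatSetValuesCOO call with the refreshed device
values. The alternative workaround mentioned above is to let PETSc allocate
the matrix and then use MatSeqAIJCUSPARSEGetArray() to fill the value array
in place.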