On Wed, Jan 4, 2023 at 6:02 PM Matthew Knepley <[email protected]> wrote:
> On Wed, Jan 4, 2023 at 6:49 PM Junchao Zhang <[email protected]> wrote:
>
>> On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry <[email protected]> wrote:
>>
>>> Oh, is the device backend not known at compile time?
>>
>> Currently it is known at compile time.
>
> Are you sure? I don't think it is known at compile time.
> We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP or NONE, but not both
>
> Thanks,
>
> Matt
>
>>> Or multiple backends can be alive at once?
>>
>> Some petsc developers (Jed and Barry) want to support this, but we are
>> incapable now.
>>
>>> On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang <[email protected]> wrote:
>>>
>>>> On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry <[email protected]> wrote:
>>>>
>>>>>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
>>>>>> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
>>>>>> GPUs, ...
>>>>>
>>>>> Wouldn't one function suffice? Assuming these are contiguous arrays in
>>>>> CSR format, they're just raw device pointers in all cases.
>>>>
>>>> But we need to know what device it is (to dispatch to either petsc-CUDA
>>>> or petsc-HIP backend)
>>>>
>>>>> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang <[email protected]> wrote:
>>>>>
>>>>>> No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for
>>>>>> GPUs. Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then
>>>>>> we would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on
>>>>>> AMD GPUs, ...
>>>>>>
>>>>>> The real problem I think is to deal with multiple MPI ranks.
>>>>>> Providing the split arrays for petsc MATMPIAIJ is not easy and thus is
>>>>>> discouraged for users to do so.
>>>>>>
>>>>>> A workaround is to let petsc build the matrix and allocate the
>>>>>> memory, then you call MatSeqAIJCUSPARSEGetArray() to get the array and
>>>>>> fill it up.
>>>>>>
>>>>>> We recently added routines to support matrix assembly on GPUs, see if
>>>>>> MatSetValuesCOO
>>>>>> <https://petsc.org/release/docs/manualpages/Mat/MatSetValuesCOO/>
>>>>>> helps
>>>>>>
>>>>>> --Junchao Zhang
>>>>>>
>>>>>> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry <[email protected]> wrote:
>>>>>>
>>>>>>> I have a sparse matrix constructed in non-petsc code using a
>>>>>>> standard CSR representation where I compute the Jacobian to be used in
>>>>>>> an implicit TS context. In the CPU world I call
>>>>>>>
>>>>>>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols, rowidxptr,
>>>>>>> colidxptr, valptr, Jac);
>>>>>>>
>>>>>>> which as I understand it -- (1) never copies/allocates that
>>>>>>> information, and the matrix Jac is just a non-owning view into the
>>>>>>> already allocated CSR, (2) I can write directly into the original data
>>>>>>> structures and the Mat just "knows" about it, although it still needs
>>>>>>> a call to MatAssemblyBegin/MatAssemblyEnd after modifying the values.
>>>>>>> So far this works great with GAMG.
>>>>>>>
>>>>>>> I have the same CSR representation filled in GPU data allocated with
>>>>>>> cudaMalloc and filled on-device. Is there an equivalent Mat constructor
>>>>>>> for GPU arrays, or some other way to avoid unnecessary copies?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Mark
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
