We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP or NONE, but not both 

CUPM works with both enabled simultaneously; I don't think there are any direct restrictions on it. Vec, at least, was fully usable with both CUDA and HIP (though untested) last time I checked.
 
Best regards,

Jacob Faibussowitsch
(Jacob Fai - booss - oh - vitch)

On Jan 5, 2023, at 00:09, Junchao Zhang <[email protected]> wrote:





On Wed, Jan 4, 2023 at 6:02 PM Matthew Knepley <[email protected]> wrote:
On Wed, Jan 4, 2023 at 6:49 PM Junchao Zhang <[email protected]> wrote:

On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry <[email protected]> wrote:
Oh, is the device backend not known at compile time?
Currently it is known at compile time.

Are you sure? I don't think it is known at compile time.
We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP or NONE, but not both 
 

  Thanks,

     Matt
 
Or multiple backends can be alive at once?

Some PETSc developers (Jed and Barry) want to support this, but we can't do it yet.
 

On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang <[email protected]> wrote:


On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry <[email protected]> wrote:
Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD GPUs, ...

Wouldn't one function suffice? Assuming these are contiguous arrays in CSR format, they're just raw device pointers in all cases.
But we need to know what device it is (to dispatch to either the petsc-CUDA or the petsc-HIP backend).
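
A minimal sketch of the compile-time selection described in this thread, assuming PETSC_HAVE_CUDA and PETSC_HAVE_HIP are the configure macros that decide the backend. The DeviceBackend enum and both helper functions are hypothetical names invented for illustration, not PETSc API; the returned strings are meant to match the MATSEQAIJCUSPARSE, MATSEQAIJHIPSPARSE and MATSEQAIJ type names. If both macros were defined, this sketch would simply prefer CUDA, which is an arbitrary choice.

```c
/* Hypothetical enum for illustration; not a PETSc type. */
typedef enum { BACKEND_NONE, BACKEND_CUDA, BACKEND_HIP } DeviceBackend;

/* Report which backend this translation unit was compiled for.
   In a real PETSc build the macros come from the generated configuration
   header; here none is defined unless passed on the compiler command line. */
static DeviceBackend compiled_backend(void)
{
#if defined(PETSC_HAVE_CUDA)
  return BACKEND_CUDA;
#elif defined(PETSC_HAVE_HIP)
  return BACKEND_HIP;
#else
  return BACKEND_NONE;
#endif
}

/* Map a backend to the sequential AIJ matrix type name that a single
   "with arrays" constructor would have to dispatch to. */
static const char *backend_mat_type(DeviceBackend b)
{
  switch (b) {
  case BACKEND_CUDA: return "seqaijcusparse";
  case BACKEND_HIP:  return "seqaijhipsparse";
  default:           return "seqaij";
  }
}
```

Because the choice is made by the preprocessor, a raw device pointer alone carries no backend information at run time; that is why one generic constructor would still need to be told, or to detect, which device the arrays live on.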
 

On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang <[email protected]> wrote:
No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for GPUs. Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD GPUs, ...

The real problem, I think, is dealing with multiple MPI ranks. Providing the split arrays for PETSc MATMPIAIJ is not easy, so we discourage users from doing it.

A workaround is to let PETSc build the matrix and allocate the memory; you then call MatSeqAIJCUSPARSEGetArray() to get the array and fill it.

We recently added routines to support matrix assembly on GPUs; see if MatSetValuesCOO() helps.
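
For a matrix that already exists in CSR form, the COO interface needs one row index and one column index per nonzero (passed to MatSetPreallocationCOO()), followed by the values in the same order (passed to MatSetValuesCOO()). The CSR column-index and value arrays are already in that layout, so only the row offsets have to be expanded. A minimal host-side sketch of that expansion follows; the function name csr_rows_to_coo is hypothetical, and plain int stands in for PetscInt as an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Expand CSR row offsets (length nrows + 1) into one row index per nonzero.
   coo_i must have room for row_ptr[nrows] entries.
   Hypothetical helper for illustration; not part of PETSc. */
static void csr_rows_to_coo(size_t nrows, const int *row_ptr, int *coo_i)
{
  assert(row_ptr[0] == 0); /* zero-based CSR offsets are assumed */
  for (size_t r = 0; r < nrows; ++r) {
    /* Entries row_ptr[r] .. row_ptr[r+1]-1 all belong to row r. */
    for (int k = row_ptr[r]; k < row_ptr[r + 1]; ++k) {
      coo_i[k] = (int)r;
    }
  }
}
```

With coo_i built this way, the pair (coo_i, colidxptr) describes the sparsity pattern once, and each later Jacobian update only has to hand over the value array again.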

--Junchao Zhang


On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry <[email protected]> wrote:
I have a sparse matrix constructed in non-petsc code using a standard CSR representation where I compute the Jacobian to be used in an implicit TS context. In the CPU world I call

MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols, rowidxptr, colidxptr, valptr, Jac);

which as I understand it -- (1) never copies/allocates that information, and the matrix Jac is just a non-owning view into the already allocated CSR, (2) I can write directly into the original data structures and the Mat just "knows" about it, although it still needs a call to MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far this works great with GAMG.

I have the same CSR representation filled in GPU data allocated with cudaMalloc and filled on-device. Is there an equivalent Mat constructor for GPU arrays, or some other way to avoid unnecessary copies?

Thanks,
Mark


--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
