Re: [petsc-dev] PetscMallocAlign for Cuda

2020-09-03 Thread Jeff Hammond
If you use cudaMallocManaged with host affinity, you can drop that into PETSc malloc and it should “just work” including migrating to GPU when touched. Or you can give it device affinity and it will migrate the other way when the CPU touches it. This is way more performance portable than system

Re: [petsc-dev] PetscMallocAlign for Cuda

2020-09-02 Thread Mark Adams
OK good to know. I will now worry even less about making this very complete.

On Wed, Sep 2, 2020 at 1:33 PM Barry Smith wrote:
> Mark,
>
> Currently you use directly the Nvidia provided mallocs cudaMalloc for all mallocs on the GPU. See for example aijcusparse.cu.
>
> I will be

Re: [petsc-dev] PetscMallocAlign for Cuda

2020-09-02 Thread Barry Smith
Mark,

Currently you use directly the Nvidia provided mallocs cudaMalloc for all mallocs on the GPU. See for example aijcusparse.cu.

I will be using Stefano's work to start developing a unified PETSc based system for all memory management but don't wait for that.

Barry

> On Sep

Re: [petsc-dev] PetscMallocAlign for Cuda

2020-09-02 Thread Jacob Faibussowitsch
I believe there are a few PetscMallocCuda impls in src/sys/memory/cuda/mcudahost.cu that seem to do what you are describing. If you are creating mats you can also consider cudaMallocPitch, but I’m not sure how that plays with the sparse storage impls that petsc mat uses. Seems more useful for

[petsc-dev] PetscMallocAlign for Cuda

2020-09-02 Thread Mark Adams
PETSc mallocs seem to boil down to PetscMallocAlign. There are switches in here but I don't see a Cuda malloc. This would seem to be convenient if I want to create an Object entirely on Cuda or any device. Are there any thoughts along these lines or should I just duplicate Mat creation, for
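
For readers following the thread, the idea of an aligned malloc sitting behind a switch can be sketched in plain C as below. This is only an illustration under stated assumptions, not PETSc's implementation: every `sketch_*` name and the 64-byte alignment are hypothetical. A device backend is not shown; presumably it would wrap cudaMalloc and cudaFree behind the same two function-pointer types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not PETSc's API. */
#define SKETCH_ALIGN 64 /* assumed alignment in bytes */

/* Backend signatures: return 0 on success, nonzero on failure. */
typedef int (*SketchMallocFn)(size_t bytes, void **out);
typedef int (*SketchFreeFn)(void *ptr);

/* Host backend: aligned allocation via C11 aligned_alloc. */
int sketch_host_malloc(size_t bytes, void **out)
{
  /* aligned_alloc wants the size to be a multiple of the alignment,
     so round up (and never request zero bytes). */
  size_t rounded = (bytes + SKETCH_ALIGN - 1) / SKETCH_ALIGN * SKETCH_ALIGN;
  if (rounded == 0) rounded = SKETCH_ALIGN;
  *out = aligned_alloc(SKETCH_ALIGN, rounded);
  return *out ? 0 : 1;
}

int sketch_host_free(void *ptr)
{
  free(ptr);
  return 0;
}

/* The "switch": the currently installed backend, host by default. */
SketchMallocFn sketch_malloc = sketch_host_malloc;
SketchFreeFn   sketch_free   = sketch_host_free;

/* Install a different backend, e.g. one wrapping a device allocator. */
void sketch_malloc_set(SketchMallocFn m, SketchFreeFn f)
{
  sketch_malloc = m;
  sketch_free   = f;
}
```

Callers would go through `sketch_malloc` and `sketch_free` only, so swapping the backend with `sketch_malloc_set` changes where memory comes from without touching the calling code. Whether that is enough to create a whole object on the device is exactly the open question in this thread.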