If you use cudaMallocManaged with host affinity, you can drop that into
PETSc malloc and it should “just work,” including migrating to the GPU when
touched. Or you can give it device affinity, and it will migrate the other
way when the CPU touches it.

This is way more performance portable than system-managed memory on the
Summit/Lassen systems, which can do unpleasant things unless you disable
NUMA balancing and use CUDA prefetch.
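Concretely, the “affinity” above means advising the driver on the preferred
location of managed memory after allocating it. A minimal sketch of the CUDA
runtime calls involved (error checking elided, device 0 assumed; this is an
illustration of the technique, not PETSc code):

```c
/* Allocate managed memory and set its preferred location.
 * Illustrative only -- no error checking, assumes device 0. */
#include <cuda_runtime.h>
#include <stddef.h>

static void *managed_alloc(size_t bytes, int prefer_device, int device)
{
  void *p = NULL;
  cudaMallocManaged(&p, bytes, cudaMemAttachGlobal);
  /* Host affinity: pages live on the CPU and migrate to the GPU when
     touched there; device affinity is the reverse. */
  cudaMemAdvise(p, bytes, cudaMemAdviseSetPreferredLocation,
                prefer_device ? device : cudaCpuDeviceId);
  /* Optional: prefetch to avoid first-touch page-fault migration,
     which is what misbehaves when NUMA balancing is enabled. */
  cudaMemPrefetchAsync(p, bytes,
                       prefer_device ? device : cudaCpuDeviceId, 0);
  return p;
}
```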

Jeff

On Wed, Sep 2, 2020 at 10:49 AM Mark Adams <mfad...@lbl.gov> wrote:

> OK good to know. I will now worry even less about making this very
> complete.
>
> On Wed, Sep 2, 2020 at 1:33 PM Barry Smith <bsm...@petsc.dev> wrote:
>
>>
>>   Mark,
>>
>>    Currently you directly use the NVIDIA-provided malloc, cudaMalloc, for
>> all allocations on the GPU. See, for example, aijcusparse.cu.
>>
>>    I will be using Stefano's work to start developing a unified,
>> PETSc-based system for all memory management, but don't wait for that.
>>
>>    Barry
>>
>> > On Sep 2, 2020, at 8:58 AM, Mark Adams <mfad...@lbl.gov> wrote:
>> >
>> > PETSc mallocs seem to boil down to PetscMallocAlign. There are switches
>> > in there, but I don't see a CUDA malloc. This would be convenient if I
>> > want to create an object entirely on CUDA or any device.
>> >
>> > Are there any thoughts along these lines, or should I just duplicate Mat
>> > creation, for instance, by hand?
--
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/
