"Paul T. Bauman" <[email protected]> writes: > On Fri, Jan 21, 2022 at 8:52 AM Paul T. Bauman <[email protected]> wrote: >> Yes. The way HYPRE's memory model is setup is that ALL GPU allocations are >> "native" (i.e. [cuda,hip]Malloc) or, if unified memory is enabled, then ALL >> GPU allocations are unified memory (i.e. [cuda,hip]MallocManaged). >> Regarding HIP, there is an HMM implementation of hipMallocManaged planned, >> but is it not yet delivered AFAIK (and it will *not* support gfx906, e.g. >> RVII, FYI), so, today, under the covers, hipMallocManaged is calling >> hipHostMalloc. So, today, all your unified memory allocations in HYPRE on >> HIP are doing CPU-pinned memory accesses. And performance is just truly >> terrible (as you might expect).
Thanks for this important bit of information. And it sounds like when we add support to hand off Kokkos matrices and vectors (our current support for matrices on ROCm devices uses Kokkos) or add direct support for hipSparse, we'll avoid touching host memory in assembly-to-solve with hypre.
