Dear PETSc Team,

We are testing the GPU support in PETSc's KSPSolve, especially with the GAMG and
Hypre preconditioners. We have encountered several issues on which we would like
to ask for your suggestions.

First, we have a couple of questions when working with a single MPI rank:

  1.  We have tested two backends, CUDA and Kokkos. One commonly encountered 
error is related to SpGEMM in CUDA when the matrix is large, as shown below:

cudaMalloc((void **)&buffer2, bufferSize2) error( cudaErrorMemoryAllocation): 
out of memory

With the CUDA backend, one can pass "-matmatmult_backend_cpu -matptap_backend_cpu" 
to avoid this problem, but there appear to be no equivalent options for the 
Kokkos backend. Is there a recommended practice for avoiding this error in both 
backends, and in particular, can it be avoided with the Kokkos backend?
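For concreteness, a typical invocation on our side looks roughly like the 
following (./app is a placeholder for our executable; the other options are the 
standard PETSc option names):

```shell
# CUDA backend with GAMG; the last two options fall back to the CPU for the
# SpGEMM products (A*B and P^T*A*P) formed during the GAMG setup, which is
# where the cudaMalloc out-of-memory error occurs.
./app -mat_type aijcusparse -vec_type cuda -pc_type gamg \
      -matmatmult_backend_cpu -matptap_backend_cpu
```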

  2.  We have also tested Hypre together with the Kokkos backend. The two do 
not appear to be compatible with each other: KSPSolve takes a greater number of 
iterations to exit, and the residual norm in our post-check is much larger than 
the one obtained with the CUDA backend. This happens for matrices with block 
size larger than 1. Is there an explanation for this behavior?
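For reference, the Hypre + Kokkos runs use options along these lines (./app is 
again a placeholder; the monitoring options are what we use for the residual 
post-check):

```shell
# Kokkos backend with hypre's BoomerAMG as the preconditioner; the monitor
# options print the true (unpreconditioned) residual per iteration and the
# convergence reason, which is how we compare against the CUDA backend.
./app -mat_type aijkokkos -vec_type kokkos \
      -pc_type hypre -pc_hypre_type boomeramg \
      -ksp_monitor_true_residual -ksp_converged_reason
```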

Second, we have a couple more questions when working with multiple MPI ranks:

  1.  We are currently using OpenMPI because we couldn't get Intel MPI to work 
as a GPU-aware MPI. Is this a known issue with Intel MPI?
  2.  With OpenMPI, we currently see a slowdown as the number of MPI ranks 
increases, as shown in the figure below. Is this normal?
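A sketch of how the scaling runs are launched (./app is a placeholder; the rank 
counts shown are illustrative):

```shell
# Vary the rank count and collect timings; -log_view breaks down where the
# time goes (e.g. MPI communication vs. GPU kernels), and -use_gpu_aware_mpi 1
# asserts that the MPI implementation is GPU-aware.
for N in 1 2 4 8; do
    mpirun -n "$N" ./app -mat_type aijcusparse -vec_type cuda \
        -pc_type gamg -log_view -use_gpu_aware_mpi 1
done
```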

[attached figure: KSPSolve timing versus number of MPI ranks]

Zisheng
