Start by running a good-sized problem with -log_view (one MPI rank is best for 
all initial studies) and look at the overall performance and at which parts 
run on the GPU. For a GPU, your minimum problem size should be about 1 million 
unknowns!

  Feel free to send the -log_view output.
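
  As a rough sketch of such a comparison, not a definitive recipe: the script 
  below builds one CPU and one GPU command line, each on a single MPI rank 
  with -log_view. The executable name ./my_petsc_app is hypothetical, and the 
  sketch assumes the program picks up its matrix and vector types from the 
  options database (for example via MatSetFromOptions() and 
  VecSetFromOptions()), so the same binary serves both runs.

```shell
# Hypothetical executable name (an assumption); replace with your own PETSc program.
APP="${APP:-./my_petsc_app}"

# CPU baseline: one MPI rank, default matrix and vector types, profiling on.
CPU_CMD="mpiexec -n 1 $APP -log_view"

# GPU run: same binary and rank count; the types are the ones from this thread.
GPU_CMD="mpiexec -n 1 $APP -mat_type aijcusparse -vec_type cuda -log_view"

# Print both command lines; run them by hand and compare the -log_view output.
echo "$CPU_CMD"
echo "$GPU_CMD"
```

  Both runs should use the same problem size, so that the two -log_view 
  summaries can be compared directly.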

  Barry



> On Jan 14, 2022, at 4:27 PM, Rohan Yadav <roh...@alumni.cmu.edu> wrote:
> 
> Hi,
> 
> I'm looking to use PETSc with GPUs to do some linear algebra operations, like 
> SpMV, SpMM, etc. Building PETSc with `--with-cuda=1` and running with 
> `-mat_type aijcusparse -vec_type cuda` gives me a large slowdown compared to 
> the same code running on the CPU. This is not entirely unexpected, as things 
> like data transfer costs across PCIe might erroneously be included in my 
> timing. Are there some examples of benchmarking GPU computations with PETSc, 
> or just the proper way to write code in PETSc that will work for CPUs and 
> GPUs?
> 
> Rohan
