On 9/22/19 6:15 AM, Jed Brown wrote:
> Karl Rupp via petsc-dev <petsc-dev@mcs.anl.gov> writes:
>> Hi Junchao,
>>
>> thanks, these numbers are interesting.
>>
>> Do you have an easy way to compare a CUDA-aware MPI against a
>> non-CUDA-aware MPI that still retains the benefits of your
>> packing/unpacking routines?
>>
>> I'd like to get a feeling of where the performance gains come from. Is
>> it due to the reduced PCI-Express transfer?
> It's NVLink, not PCI-Express.
Indeed.
> I wonder if the single-node latency bugs on AC922 are related to these
> weird performance results.
>
> https://docs.google.com/spreadsheets/d/1amFJIbpvs9oJcUc-WntsFHO_C0LE7xFJeor-oElt0LY/edit#gid=0
Thanks for these numbers!
Intra-Node > Inter-Node is indeed weird. I haven't observed such an
inversion before.
Best regards,
Karli