Jed,
What does latency as a function of message size mean? It is in the plots.

> On Sep 21, 2019, at 11:15 PM, Jed Brown via petsc-dev <petsc-dev@mcs.anl.gov> wrote:
>
> Karl Rupp via petsc-dev <petsc-dev@mcs.anl.gov> writes:
>
>> Hi Junchao,
>>
>> thanks, these numbers are interesting.
>>
>> Do you have an easy way to evaluate the benefits of a CUDA-aware MPI vs.
>> a non-CUDA-aware MPI that still keeps the benefits of your
>> packing/unpacking routines?
>>
>> I'd like to get a feeling of where the performance gains come from. Is
>> it due to the reduced PCI-Express transfer
>
> It's NVLink, not PCI-express.
>
> I wonder if the single-node latency bugs on AC922 are related to these
> weird performance results.
>
> https://docs.google.com/spreadsheets/d/1amFJIbpvs9oJcUc-WntsFHO_C0LE7xFJeor-oElt0LY/edit#gid=0