"Smith, Barry F." <bsm...@mcs.anl.gov> writes: > Jed, > > What does latency as a function of message size mean? It is in the plots
It's just the wall-clock time to ping-pong a message of that size.  All
the small sizes take the same amount of time (i.e., the latency), then
transition to being network bandwidth limited for large sizes.

>
>> On Sep 21, 2019, at 11:15 PM, Jed Brown via petsc-dev
>> <petsc-dev@mcs.anl.gov> wrote:
>>
>> Karl Rupp via petsc-dev <petsc-dev@mcs.anl.gov> writes:
>>
>>> Hi Junchao,
>>>
>>> thanks, these numbers are interesting.
>>>
>>> Do you have an easy way to evaluate the benefits of a CUDA-aware MPI vs.
>>> a non-CUDA-aware MPI that still keeps the benefits of your
>>> packing/unpacking routines?
>>>
>>> I'd like to get a feeling of where the performance gains come from. Is
>>> it due to the reduced PCI-Express transfer
>>
>> It's NVLink, not PCI-express.
>>
>> I wonder if the single-node latency bugs on AC922 are related to these
>> weird performance results.
>>
>> https://docs.google.com/spreadsheets/d/1amFJIbpvs9oJcUc-WntsFHO_C0LE7xFJeor-oElt0LY/edit#gid=0
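The flat-then-linear behavior described above is the usual alpha-beta (latency plus bandwidth) model of message transfer time. A minimal sketch of that model follows; the alpha and beta values here are purely illustrative assumptions, not measurements from the spreadsheet or from any particular network:

```python
# Alpha-beta model of message transfer time: t(n) = alpha + n / beta.
# alpha (per-message latency, seconds) and beta (bandwidth, bytes/s)
# are illustrative assumptions, not measured values.

def transfer_time(nbytes, alpha=1e-6, beta=10e9):
    """Modeled one-way time for a message of nbytes."""
    return alpha + nbytes / beta

# Small messages: the alpha term dominates, so times are nearly flat
# (this is the "latency" plateau in the plots).
small = [transfer_time(n) for n in (8, 64, 512)]

# Large messages: the n/beta term dominates, so time grows linearly
# with size (bandwidth limited).
large = [transfer_time(n) for n in (1 << 20, 1 << 24)]
```

In this model, doubling a small message barely changes the time, while doubling a large one roughly doubles it, which is why the measured curve looks flat at small sizes and linear at large sizes.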