This is an academic exercise, obviously. The curve shown comes from a single pair of ranks on the same node, alternating MPI_Send and MPI_Recv. The most likely suspect is a cache effect, but rather than assume, I was curious whether any other aspects of the implementation might be at work.
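For reference, a minimal sketch of the kind of ping-pong loop described above might look like the following. This is an illustration only, not the exact benchmark Pete ran: the iteration count, message-size sweep, and the choice to reuse the same buffers every iteration are assumptions.

/* Ping-pong bandwidth sketch: rank 0 sends a message and waits for the
 * echo from rank 1; the same buffers are reused every iteration.
 * Build with mpicc and run two ranks on one node, e.g.
 *   mpirun -np 2 ./pingpong
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int iters = 1000;                    /* assumed iteration count */
    for (size_t len = 1; len <= (1 << 22); len *= 2) {
        char *sbuf = malloc(len);
        char *rbuf = malloc(len);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(sbuf, (int)len, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(rbuf, (int)len, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(rbuf, (int)len, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(sbuf, (int)len, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0) {
            /* one-way bandwidth: iters messages of len bytes each direction */
            double mbps = (double)iters * (double)len / (t1 - t0) / 1.0e6;
            printf("%10zu bytes  %10.1f MB/s\n", len, mbps);
        }
        free(sbuf);
        free(rbuf);
    }

    MPI_Finalize();
    return 0;
}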
Pete,

How did you measure the bandwidth? IIRC, the IMB benchmark does not reuse the send and recv buffers, so the results could be different. Also, you might want to use a logarithmic scale for the message size, so information for small messages is easier to read.

Cheers,

Gilles

On Thursday, March 10, 2016, BRADLEY, PETER C PW <peter.c.bradley_at_[hidden]> wrote:
> I'm curious what causes the hump in the pingpong bandwidth curve when
> running on shared memory. Here's an example running on a fairly antiquated
> single-socket 4 core laptop with linux (2.6.32 kernel). Is this a cache
> effect? Something in OpenMPI itself, or a combination?
>
> [image: bandwidth_onepair_onenode.png]
>
> Pete
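Gilles's point about buffer reuse can be illustrated with a rotating-offset variant of the inner loop: instead of touching the same (likely cache-resident) buffer every iteration, each exchange uses a fresh offset inside a window larger than the last-level cache. The window size and offset-cycling scheme below are assumptions chosen for illustration, not IMB's actual buffer-management code.

/* Sketch: cycle the send/recv offsets through a window assumed to be
 * larger than the last-level cache, so successive iterations are less
 * likely to hit warm cache lines. Values are illustrative only. */
#include <mpi.h>
#include <stdlib.h>

#define WINDOW (64UL * 1024 * 1024)   /* assumed: bigger than the LLC */

void pingpong_rotating(int rank, size_t len, int iters)
{
    char *pool_s = malloc(WINDOW);    /* one large pool per direction */
    char *pool_r = malloc(WINDOW);
    size_t off = 0;

    for (int i = 0; i < iters; i++) {
        char *sbuf = pool_s + off;    /* fresh offset each iteration */
        char *rbuf = pool_r + off;
        if (rank == 0) {
            MPI_Send(sbuf, (int)len, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(rbuf, (int)len, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(rbuf, (int)len, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(sbuf, (int)len, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
        off += len;
        if (off + len > WINDOW)
            off = 0;                  /* wrap when the window is exhausted */
    }
    free(pool_s);
    free(pool_r);
}

Comparing the reusing and rotating variants on the same node is one way to check how much of the hump in the curve is a cache effect rather than something inside the MPI implementation.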