Why do I see that the max bandwidth of the EPYC-7502 is 200 GB/s (https://www.cpu-world.com/CPUs/Zen/AMD-EPYC%207502.html)?
Your bandwidth is around 1/8 of the max. Is it because your machine only has one DIMM, and thus only uses one memory channel?

--Junchao Zhang

On Fri, Apr 16, 2021 at 3:27 PM Jed Brown <[email protected]> wrote:

> Blaise A Bourdin <[email protected]> writes:
>
> > Hi,
> >
> > I am test-driving hardware for a new machine for my group and having a
> > hard time making sense of the output of the stream test:
> >
> > I am attaching the results and my reference (Xeon 8260 nodes on
> > QueenBee 3 at LONI).
> >
> > If I understand correctly, on the AMD node, the memory bandwidth is
> > saturated with a single core. Is this expected?
> > The comparison is not totally fair in that QB3 uses Intel MPI and MPI
> > compilers, whereas the AMD node uses mvapich2, which I compiled with
> > the following options:
> > ./configure --prefix=/home/amduser/Development/mvapich2-2.3.5-gcc9.3 --with-device=ch3:nemesis:tcp --with-rdma=gen2 --enable-cxx --enable-romio --enable-fast=all --enable-g=dbg --enable-shared-libs=gcc --enable-shared
> >
> > Am I doing something wrong on the AMD node?
>
> It looks like it's oversubscribing some cores rather than spreading them
> over the node. You should get around 200 GB/s on this node without using
> streaming instructions (closer to 300 GB/s with those, but it isn't
> representative of real-world code). Slightly less if you don't have NPS4
> activated.
>
> You can check your MPI docs and use make MPI_BINDING='--bind-to core',
> for example.
>
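
As a rough check on the 1/8 estimate, here is the channel arithmetic as a short Python sketch. The inputs are assumptions, not something stated in this thread: eight DDR4-3200 memory channels per socket, each 64 bits (8 bytes) wide. The helper name peak_bandwidth_gbs is made up for illustration, and the result is a theoretical peak, not what STREAM will measure.

```python
# Theoretical peak DRAM bandwidth = channels * transfer rate * bytes per transfer.
# Assumed values (not stated in the thread): DDR4-3200 memory, 64-bit (8-byte)
# channels, and 8 channels per socket.

def peak_bandwidth_gbs(channels, mega_transfers_per_s=3200, bytes_per_transfer=8):
    """Return the theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9

full = peak_bandwidth_gbs(8)    # all eight channels populated
single = peak_bandwidth_gbs(1)  # a single DIMM on one channel

print(f"8 channels: {full:.1f} GB/s")
print(f"1 channel:  {single:.1f} GB/s")
print(f"ratio:      1/{full / single:.0f}")
```

Under those assumptions, a single populated channel tops out at one eighth of the full-socket figure, which would be consistent with a one-DIMM explanation. Checking how many DIMMs are actually installed would confirm or rule it out.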
