Attached is a PETSc streams result kindly provided by a hardware vendor for a single dual-socket compute node with two AMD EPYC 9355 processors. Each processor has 32 cores and 12 DDR5 memory channels, with a memory bandwidth of around 600 GB/s.
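
For reference, here is a rough sketch of the arithmetic behind those node figures. It assumes that the quoted ~600 GB/s per socket is a nominal peak and that the two sockets simply add up; the variable names are purely illustrative and do not come from PETSc or the vendor.

```python
# Back-of-the-envelope node figures from the hardware description above.
# Assumption: ~600 GB/s per socket is a nominal peak and both sockets add up.
SOCKETS = 2
CORES_PER_SOCKET = 32
CHANNELS_PER_SOCKET = 12
BW_PER_SOCKET_GBS = 600.0  # approximate value, as quoted

node_cores = SOCKETS * CORES_PER_SOCKET
node_channels = SOCKETS * CHANNELS_PER_SOCKET
node_peak_gbs = SOCKETS * BW_PER_SOCKET_GBS

# Even shares of the nominal peak, per memory channel and per core.
bw_per_channel_gbs = BW_PER_SOCKET_GBS / CHANNELS_PER_SOCKET
bw_per_core_gbs = node_peak_gbs / node_cores

print(f"cores per node:       {node_cores}")
print(f"channels per node:    {node_channels}")
print(f"nominal node peak:    {node_peak_gbs:.0f} GB/s")
print(f"per-channel share:    {bw_per_channel_gbs:.1f} GB/s")
print(f"per-core share (all): {bw_per_core_gbs:.2f} GB/s")
```

These are nominal numbers only; what a streams run actually sustains will depend on memory speed, NUMA configuration and process placement.
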
* It is not immediately clear which line corresponds to which y-axis. Could future versions of PETSc please color each axis label with the matching line color?
* Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s = 900 GB/s and not closer to 1200 GB/s?
* The speed-up seems to be 12 out of 64, provided multiples of 8 cores are used. Is that as expected given 12 memory channels?
* Does the zig-zag pattern indicate a pinning problem, or is it unavoidable given the 8-core building block of this type of processor?

Chris

dr. ir. Christiaan Klaij | senior researcher
Research & Development | CFD Development
T +31 317 49 33 44 | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!fqSBpN3Ld5fjzXGShGI09uJke12M-5LukEHe-y-gw0Bw9msZeH7wNiId6DZxQpluR_RUWpuoQWUD2HSsmmm_T4I$
