Fixing typo: Meant to say "Keep in mind that individual KNL cores are much less powerful than an individual Haswell *core*."
--Richard

On Mon, Apr 3, 2017 at 11:36 AM, Richard Mills <richardtmi...@gmail.com> wrote:

> Hi Justin,
>
> How is the MCDRAM (on-package "high-bandwidth memory") configured for your
> KNL runs? And if it is in "flat" mode, what are you doing to ensure that
> you use the MCDRAM? Doing this wrong seems to be one of the most common
> reasons for unexpected poor performance on KNL.
>
> I'm not that familiar with the environment on Cori, but I think that if
> you are building for KNL, you should add "-xMIC-AVX512" to your compiler
> flags to explicitly instruct the compiler to use the AVX512 instruction
> set. I usually use something along the lines of
>
>   'COPTFLAGS=-g -O3 -fp-model fast -xMIC-AVX512'
>
> (The "-g" just adds symbols, which make the output from performance
> profiling tools much more useful.)
>
> That said, I think that if you are comparing 1024 Haswell cores vs. 1024
> KNL cores (so double the number of Haswell nodes), I'm not surprised that
> the simulations are almost twice as fast using the Haswell nodes. Keep in
> mind that individual KNL cores are much less powerful than an individual
> Haswell node. You are also using roughly twice the power footprint (dual
> socket Haswell node should be roughly equivalent to a KNL node, I
> believe). How do things look on when you compare equal nodes?
>
> Cheers,
> Richard
>
> On Mon, Apr 3, 2017 at 11:13 AM, Justin Chang <jychan...@gmail.com> wrote:
>
>> Hi all,
>>
>> On NERSC's Cori I have the following configure options for PETSc:
>>
>> ./configure --download-fblaslapack --with-cc=cc --with-clib-autodetect=0
>> --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn
>> --with-fortranlib-autodetect=0 --with-mpiexec=srun --with-64-bit-indices=1
>> COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-cori-opt
>>
>> Where I swapped out the default Intel programming environment with that
>> of Cray (e.g., 'module switch PrgEnv-intel/6.0.3 PrgEnv-cray/6.0.3'). I
>> want to document the performance difference between Cori's Haswell and KNL
>> processors.
>>
>> When I run a PETSc example like SNES ex48 on 1024 cores (32 Haswell and
>> 16 KNL nodes), the simulations are almost twice as fast on Haswell nodes.
>> Which leads me to suspect that I am not doing something right for KNL. Does
>> anyone know what are some "optimal" configure options for running PETSc on
>> KNL?
>>
>> Thanks,
>> Justin
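
For reference, the suggestions in this thread can be pulled together into a rough dry-run sketch. It only prints commands and does not run them. Several things here are assumptions rather than anything confirmed above: it assumes the Intel programming environment (since "-xMIC-AVX512" and "-fp-model" are Intel compiler flags, they are not expected to work under PrgEnv-cray), it reuses the quoted COPTFLAGS for the C++ and Fortran flags as well, the PETSC_ARCH name "arch-cori-knl-opt" is made up, and the numactl line assumes "flat" mode with the MCDRAM exposed as NUMA node 1 (worth checking with "numactl -H" on a compute node). The per-node core counts are just the ones implied by "1024 cores (32 Haswell and 16 KNL nodes)".

```shell
#!/bin/sh
# Dry run only: assemble the commands and print them, nothing is executed.

# Flags quoted from the reply above (Intel compilers assumed).
KNL_FLAGS="-g -O3 -fp-model fast -xMIC-AVX512"

# Hypothetical arch name, chosen here only to keep the KNL build separate.
PETSC_ARCH_KNL="arch-cori-knl-opt"

# Original configure options, with the optimization flags swapped out.
# Using the same flags for CXXOPTFLAGS and FOPTFLAGS is an assumption.
CONFIGURE_CMD="./configure --download-fblaslapack --with-cc=cc \
--with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 \
--with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 \
--with-mpiexec=srun --with-64-bit-indices=1 \
COPTFLAGS='$KNL_FLAGS' CXXOPTFLAGS='$KNL_FLAGS' FOPTFLAGS='$KNL_FLAGS' \
PETSC_ARCH=$PETSC_ARCH_KNL"

# Cores per node implied by the run described in the thread.
TOTAL_CORES=1024
HASWELL_NODES=32
KNL_NODES=16
HASWELL_CORES_PER_NODE=$((TOTAL_CORES / HASWELL_NODES))
KNL_CORES_PER_NODE=$((TOTAL_CORES / KNL_NODES))

# Equal-node comparison: keep the 16 KNL nodes, use 16 Haswell nodes too.
EQUAL_NODES=$KNL_NODES
HASWELL_RANKS_EQUAL_NODES=$((EQUAL_NODES * HASWELL_CORES_PER_NODE))
KNL_RANKS_EQUAL_NODES=$((EQUAL_NODES * KNL_CORES_PER_NODE))

# Assumed flat-mode launch: prefer allocations from NUMA node 1 (MCDRAM).
KNL_RUN_CMD="srun -n $KNL_RANKS_EQUAL_NODES numactl --preferred=1 ./ex48"

printf '%s\n' "$CONFIGURE_CMD"
printf 'Haswell: %s cores/node, KNL: %s cores/node\n' \
  "$HASWELL_CORES_PER_NODE" "$KNL_CORES_PER_NODE"
printf 'On %s nodes each: %s Haswell ranks vs %s KNL ranks\n' \
  "$EQUAL_NODES" "$HASWELL_RANKS_EQUAL_NODES" "$KNL_RANKS_EQUAL_NODES"
printf '%s\n' "$KNL_RUN_CMD"
```

With these numbers, the equal-node comparison asked about above would be 16 Haswell nodes (512 ranks) against 16 KNL nodes (1024 ranks), rather than 1024 ranks on both.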