Hello Pamela,

I don't know whether this is already clear, so I apologize if I am repeating obvious concepts. I just bought a Threadripper 3990X with 64 cores / 128 threads. As far as I remember, the 3960X has 24 cores / 48 threads. It is very important not to use more than 24 cores on the 3960X. Simply forget about hyperthreading: there is no need to disable it in the BIOS, just count the real (physical) number of cores.
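The rule above (count physical cores, ignore the hyperthreads) could be sketched roughly as follows. This is only an illustration under stated assumptions, not something taken from the thread: it assumes a Linux machine where `lscpu -p=CORE,SOCKET` is available, the helper names `count_physical_cores` and `build_pw_command` are hypothetical, and the input file name `pw.in` is a placeholder.

```python
import shlex
import subprocess


def count_physical_cores(lscpu_parse_output: str) -> int:
    """Count unique (core, socket) pairs in `lscpu -p=CORE,SOCKET` style text.

    Hyperthreads of the same core share one (core, socket) pair,
    so they are counted only once.
    """
    cores = set()
    for line in lscpu_parse_output.splitlines():
        line = line.strip()
        # lscpu prefixes its header/comment lines with '#'
        if not line or line.startswith("#"):
            continue
        core, socket = line.split(",")[:2]
        cores.add((core, socket))
    return len(cores)


def build_pw_command(n_ranks: int, input_file: str = "pw.in") -> str:
    """Build an mpirun command line with one MPI rank per physical core.

    `pw.in` is a hypothetical input file name used for illustration.
    """
    argv = ["mpirun", "-np", str(n_ranks), "pw.x", "-in", input_file]
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Assumption: Linux with util-linux's lscpu on the PATH.
    out = subprocess.run(
        ["lscpu", "-p=CORE,SOCKET"],
        capture_output=True, text=True, check=True,
    ).stdout
    n_cores = count_physical_cores(out)
    # OMP_NUM_THREADS=1 keeps the run pure MPI (no OpenMP threading).
    print(f"OMP_NUM_THREADS=1 {build_pw_command(n_cores)}")
```

On a 3960X with hyperthreading left enabled, this should report 24 rather than 48, so the printed command would start 24 MPI ranks. Whether slightly fewer ranks run faster for a given job is something to benchmark rather than assume.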
I use gcc 9.3.0, and the new gcc 10 should be even better for AMD CPUs. With OpenBLAS 0.3.12 I found that my 8-core home Ryzen 3800X is as fast as a 12-core Xeon E5-2680 when running Quantum ESPRESSO 6.4.1.

Carlo

On Tue, 17 Nov 2020 at 19:24, Pamela Whitfield <[email protected]> wrote:

> Michal
>
> I have a very similar use-case and looked into many of the same issues
> when I got my Threadripper 3960X system at the beginning of the year to
> supplement my old dual-Xeon setup. In the past few days I've been
> revisiting compilation as I got hold of a Quadro GV100 for GPU
> acceleration of my optimizations.
>
> Basically it seems as though code compiled for Zen2 either can't handle
> code compiled for both MPI and OpenMP at all, or does so poorly even
> when it runs.
> Best performance for pw.x on v6.5 (I've been playing with GIPAW and
> there's no 6.6 compatible version yet) has been with a simple gcc
> OpenMPI compilation without OpenMP threading and with about 20 MPI
> processes on my 24 core CPU. Compiling with GCC or the PGI compiler
> made little difference, although only the more recent PGI compilers
> will have Zen2 optimization.
> I get little benefit from Intel MKL over openblas/lapack/fftw3 even
> with the debug tweaks, etc.
> Puget Systems numbers with other programs suggest that OpenMP-only
> performs better than OpenMPI with Threadripper but I find the opposite
> with QE.
> I did try disabling hyperthreading in the BIOS but that made no
> difference to the performance.
>
> GPU compilation really shows the issue with MPI/OpenMP clashing. With
> the Xeons I could compile code with MKL that would run well on a Quadro
> K6000 while offloading to the CPU with MPI when needed. It could still
> be a compiler issue (have to use PGI with the GPU version) but it just
> doesn't work with the 3960X, and some things don't thread well with
> pure OpenMP (e.g. dftd3 versus dftd2) so I'll still need to use
> separately compiled versions of 6.5 for different problems.
>
> BTW with a dual CPU system you may benefit from pinning threads to
> particular CPUs - it works on the dual Xeon in any case. My Threadripper
> balances the load across the cores in a pretty dynamic manner and
> that's on a single socket.
>
> Best regards
> Pam Whitfield
>
> Independent Consultant
>
>
> Message: 1
> Date: Mon, 16 Nov 2020 15:19:04 +0100
> From: Michal Husak <[email protected]>
> To: Quantum ESPRESSO users Forum <[email protected]>
> Subject: [QE-users] Sub optimal performance on 32 core AMD machine
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="UTF-8"; format=flowed
>
> I had purchased a new PC with 2x 16 core AMD EPYC processors. 64
> cores with hyper threading ...
> I was hoping my QM programs (Quantum Espresso, CASTEP) will run on the
> new system faster than on my old 4 core i7 Intel machine (8 years old) ....
>
> To my great surprise, the opposite is almost true :-(.
> My main task is scf and geometry optimization of middle sized organic
> molecular crystals (about 100 C,H,N per unit cell) ...
>
> I was playing with OpenMPI/OpenMP setup changes ...
> I was playing with the secret MKL_DEBUG_CPU_TYPE=5 parameter
> (responsible for slow run of Intel MKL compiled code on AMD) ...
>
> Nothing helps, the best speed is obtained when I use only 4 cores
> (OpenMPI or OpenMP - results similar) ...
> Using 16 or 32 cores gives almost no benefit ...
> The CPU load for a run on 1/4/8/16/32 cores corresponds to the number
> of CPUs set = they try to do something ...
>
> Any idea what I should check, or try to optimize?
>
> Maybe the bottleneck is memory access, not CPU power (I have 128
> GB of almost unused RAM)?
>
> Michal Husak
>
> UCT Prague
>
>
> _______________________________________________
> Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
> users mailing list [email protected]
> https://lists.quantum-espresso.org/mailman/listinfo/users

--
------------------------------------------------------------
Prof. Carlo Nervi [email protected]
Tel: +39 0116707507/8
Fax: +39 0116707855
Dipartimento di Chimica, via P. Giuria 7, 10125 Torino, Italy.
http://lem.ch.unito.it/

*ICCC2020 has been postponed to 2022*
ICCC 2022, 28 August - 2 September 2022, Rimini, Italy: http://www.iccc2020.com
International Conference on Coordination Chemistry (ICCC 2022)
