Re: [QE-users] QE 6.4 - slower with intel fftw? how to properly benchmark

Pietro Davide Delugas Fri, 01 Mar 2019 03:49:59 -0800

Hi Chris
it might be exactly the opposite that is happening.

If you don't specify anything, configure tries all the options from best to worst, and MKL is tested as the first guess, if I am not wrong. If you pass it a specific path, it tries only that one and treats it as an ordinary FFTW library, so it may be failing to find a working FFT and falling back to the internal one.


Could you send the make.inc files for the two cases, or the config log?
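
A quick way to compare the two builds is to look at the DFLAGS line of each make.inc. The sketch below is only an illustration: the helper name fft_backend is hypothetical, and it assumes the preprocessor macros that QE 6.x is understood to use for its FFT drivers (-D__DFTI for MKL, -D__FFTW3 for an external FFTW3, -D__FFTW for the internal copy). If your make.inc uses different macros, adjust the patterns.

```shell
# Report which FFT backend a Quantum ESPRESSO make.inc appears to select.
# Assumption: __DFTI = MKL DFTI, __FFTW3 = external FFTW3, __FFTW = internal copy.
# fft_backend is a hypothetical helper, not part of QE.
fft_backend() {
    makeinc="${1:-make.inc}"
    # Match only the DFLAGS assignment itself, not MANUAL_DFLAGS or comments.
    dflags=$(grep -E '^[[:space:]]*DFLAGS[[:space:]]*=' "$makeinc")
    case "$dflags" in
        *-D__DFTI*)  echo "MKL DFTI" ;;
        # __FFTW3 must be tested before __FFTW, which is a prefix of it.
        *-D__FFTW3*) echo "external FFTW3" ;;
        *-D__FFTW*)  echo "internal FFTW" ;;
        *)           echo "unknown" ;;
    esac
}
```

Calling fft_backend on the make.inc of each build directory should show whether the build with FFT_LIBS set ended up on the internal FFT, as suspected above.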
Pietro

On 03/01/2019 11:13 AM, Christoph Wolf wrote:

Dear all,

please forgive this "beginner" question, but I am facing a weird problem. When compiling qe-6.4 (Intel compiler, Intel MPI + OpenMP) with or without Intel's FFTW libs, I find that with OpenMP and 2 threads per core the Intel FFTW version is roughly "twice as slow" as the internal one:


"internal"
     General routines
     calbec       :      2.69s CPU      2.70s WALL (    382 calls)
     fft          :      0.47s CPU      0.47s WALL (    122 calls)
     ffts         :      0.05s CPU      0.05s WALL (     12 calls)
     fftw         :     49.97s CPU     50.12s WALL (  14648 calls)
     Parallel routines
     PWSCF        :  1m45.03s CPU     1m46.59s WALL

"intel fftw"
     General routines
     calbec       :      6.36s CPU      3.20s WALL (     382 calls)
     fft          :      0.93s CPU      0.47s WALL (     121 calls)
     ffts         :      0.10s CPU      0.05s WALL (      12 calls)
     fftw         :    109.63s CPU     55.23s WALL (   14648 calls)
     Parallel routines
     PWSCF        :   3m18.32s CPU   1m41.01s WALL

As a benchmark I am running a perovskite with 120 k-points on 30 processors (one node). There is no (noticeable) difference if I export OMP_NUM_THREADS=1 (MPI only), so I guess I made some mistake during the build with regard to the libraries.
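
When comparing two timing reports like the ones above, it can help to look at the CPU/WALL ratio of each routine as well as the raw numbers. CPU time in a threaded run is usually accumulated over all threads, so a ratio near 2 with similar WALL time may simply mean that two threads were active, rather than that the code ran slower. The following is a minimal sketch, assuming the "Ns CPU  Ns WALL" layout shown above with optional minutes (reports that include hours are not handled); cpu_wall_ratio is a hypothetical helper name.

```shell
# Print "<routine> <CPU/WALL ratio>" for each timing line of a QE report.
# Reads from the files given as arguments, or from stdin if there are none.
# Assumption: times look like "55.23s" or "1m41.01s" (no hours field).
cpu_wall_ratio() {
    awk '
    # Convert "1m41.01s" or "55.23s" to seconds.
    function secs(t,   parts, n) {
        sub(/s$/, "", t)
        n = split(t, parts, "m")
        return (n == 2) ? parts[1] * 60 + parts[2] : parts[1] + 0
    }
    # Timing lines: name : <cpu> CPU <wall> WALL [( N calls)]
    $2 == ":" && $4 == "CPU" && $6 == "WALL" {
        wall = secs($5)
        if (wall > 0) printf "%s %.2f\n", $1, secs($3) / wall
    }' "$@"
}
```

For example, saving the final timing section of each run to a file and running cpu_wall_ratio on it makes it easier to see whether the two builds differ in WALL time or only in accumulated CPU time.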


Build process is as below:

module load intel19/compiler-19
module load intel19/impi-19

export FFT_LIBS="-L$MKLROOT/intel64"
export LAPACK_LIBS="-lmkl_blacs_intelmpi_lp64"
export CC=icc FC=ifort F77=ifort MPIF90=mpiifort MPICC=mpiicc

./configure --enable-parallel --with-scalapack=intel --enable-openmp


This detects BLAS_LIBS, LAPACK_LIBS, SCALAPACK_LIBS and FFT_LIBS.

I am not experienced with benchmarking, so if my benchmark is garbage, please suggest a suitable system!


Thanks in advance!
Chris

--
Postdoctoral Researcher
Center for Quantum Nanoscience, Institute for Basic Science
Ewha Womans University, Seoul, South Korea


_______________________________________________
users mailing list
[email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users
