Hi,
 
I’ve been testing the OpenMP build of Quantum ESPRESSO 5.3.0 on our system 
using the Intel compiler and MKL, and have a question about how the energy 
varies with the number of OpenMP threads used.
 
I ran all the plane-wave tests in the test-suite directory using between 1 and 
16 OpenMP threads. They all gave consistent results apart from 
pw_noncolin/noncolin-constrain_total.in, whose energy varied between 
-55.54478325 Ry and -55.54478414 Ry.
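
To put the size of that spread in context, here is a quick calculation using 
only the two energies quoted above. The variable names are just for 
illustration. It works out to a little under 1e-6 Ry, or roughly 1.6 parts in 
10^8 of the energy.

```python
# The two extreme energies quoted above, in Ry.
e_min = -55.54478414
e_max = -55.54478325

# Absolute spread between the extremes, in Ry.
spread = e_max - e_min

# Spread relative to the magnitude of the energy itself.
relative = spread / abs(e_min)

print(f"spread   = {spread:.1e} Ry")
print(f"relative = {relative:.1e}")
```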

I ran the test through the Intel Inspector tool, but it did not report any 
threading deadlocks or data races.
 
I dropped the compiler optimisation to -O0 and added the “-fp-model strict” and 
“-fp-model source” compiler flags but that had no effect.
 
I tried using some of the relevant environment variables 
(KMP_DETERMINISTIC_REDUCTION=1 and MKL_CBWR=COMPATIBLE) which also had no 
effect.
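
For anyone wanting to repeat the thread scan, the following is a minimal 
sketch of how the environment can be set up for each run. The helper names 
make_env and run_scan are hypothetical, and the command to launch is left as a 
parameter because it depends on the local installation. Only the two variables 
named above and the standard OMP_NUM_THREADS are set.

```python
import os
import subprocess


def make_env(n_threads, base_env=None):
    """Return a copy of the environment with the threading variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(n_threads)
    # The two variables tried above; neither changed the result here.
    env["KMP_DETERMINISTIC_REDUCTION"] = "1"
    env["MKL_CBWR"] = "COMPATIBLE"
    return env


def run_scan(command, thread_counts=range(1, 17)):
    """Run `command` once per thread count and return {n_threads: stdout}.

    `command` is a list such as [executable, arguments...]; what goes in it
    is an assumption about the local setup and is not fixed here.
    """
    outputs = {}
    for n in thread_counts:
        result = subprocess.run(
            command,
            env=make_env(n),
            capture_output=True,
            text=True,
            check=True,
        )
        outputs[n] = result.stdout
    return outputs
```

The energies can then be pulled out of each captured output and compared 
across thread counts.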
 
Switching to the internal BLAS library resolved the issue, so it looks to be 
MKL-related. With MKL, the variation is present with both the GNU and Intel 
compilers.
 
I dropped back to an earlier version of MKL but the effect was still present.
 
As it was thread-related I tried linking against the sequential version of MKL 
but that didn’t help.
 
So, I guess my questions are:
Should the results always be independent of the number of OpenMP threads?
Is there anything unique about the noncolin-constrain_total.in test case which 
would cause it to behave differently to the rest of the tests?
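
As background to the first question, here is a minimal sketch of the general 
reason a threaded sum can depend on the thread count: floating-point addition 
is not associative, so splitting a sum into a different number of partial sums 
can change the last few digits. This is only an illustration of that 
mechanism, not a claim about what MKL or Quantum ESPRESSO actually do 
internally, and the function chunked_sum is hypothetical.

```python
import math
import random


def chunked_sum(values, n_chunks):
    """Sum `values` as n_chunks contiguous partial sums, then add the partials.

    This mimics how a reduction might be split across threads.
    """
    n = len(values)
    total = 0.0
    for i in range(n_chunks):
        lo = i * n // n_chunks
        hi = (i + 1) * n // n_chunks
        partial = 0.0
        for x in values[lo:hi]:
            partial += x  # plain left-to-right accumulation
        total += partial
    return total


random.seed(0)
values = [random.random() for _ in range(100_000)]

# Correctly rounded reference sum for comparison.
reference = math.fsum(values)

results = {n: chunked_sum(values, n) for n in (1, 2, 4, 8, 16)}
for n, total in results.items():
    print(f"{n:2d} chunks: {total!r}  (diff from reference {total - reference:.1e})")
```

Any differences between the chunk counts are tiny compared with the sum 
itself, but in an iterative calculation they could be amplified.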
 
Best regards,
Nick Wilson
 
 
System details:
  Intel Sandy Bridge E5-2650 CPU
  CentOS Linux release 7.2.1511
  MKL from Intel compiler 16.0.0
  GNU compiler version 4.8.5