What was conv_thr in this example?
The default is 1.d-6, which is of the same order as the variation you are reporting.
If you used the default, could you try a tighter threshold?
stefano
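
To make the comparison concrete, here is a minimal sketch that checks the spread between the two energies quoted in Nick's message below against the default conv_thr mentioned above. The variable names are illustrative only, and the sketch assumes both numbers are in Ry:

```python
# Extreme total energies (Ry) reported for noncolin-constrain_total.in
# when varying the number of OpenMP threads (taken from the quoted message).
e_low = -55.54478414
e_high = -55.54478325

# Default SCF convergence threshold quoted above (1.d-6 in Fortran notation).
conv_thr = 1.0e-6

spread = e_high - e_low
print(f"spread = {spread:.2e} Ry, conv_thr = {conv_thr:.0e}")

# If the spread is no larger than conv_thr, the variation cannot be
# distinguished from ordinary SCF convergence noise at this threshold.
within_threshold = spread <= conv_thr
print("within conv_thr" if within_threshold else "exceeds conv_thr")
```

If the spread shrinks in step with a tighter conv_thr, the variation is most likely just convergence noise; if it stays the same, something else is going on.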
On 26/01/2016 17:55, Paolo Giannozzi wrote:
Very interesting, and excellent questions, for which unfortunately I
have no clear answer (nor has anybody else, I am afraid).
One should obtain the same numbers - within the errors due to
roundoff, though - in serial, OpenMP, MPI execution, and on different
machines, and with different compilers and mathematical libraries. In
practice, there are invariably small differences that sometimes do
not completely disappear even if one pushes convergence thresholds to
very strict limits. In addition to noncolin-constrain_total.in,
another notable offender is vdw-ts.in.
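
As a toy illustration of why roundoff can depend on how the work is split, the sketch below sums the same numbers as one chunk, two chunks, and four chunks. It is only an assumption-laden analogy for a parallel reduction, not how Quantum ESPRESSO, OpenMP, or any BLAS library actually accumulates results; the helper names are made up for this example.

```python
def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Sum each chunk separately, then add the partial sums.

    This mimics, very loosely, a reduction split over n_chunks workers.
    """
    size = -(-len(values) // n_chunks)  # ceiling division
    partials = [naive_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return naive_sum(partials)


# The exact sum is 2.0, but the small terms can be lost next to 1e16,
# depending on the order in which the additions are performed.
values = [1.0e16, 1.0, -1.0e16, 1.0]

results = {n: chunked_sum(values, n) for n in (1, 2, 4)}
for n, total in results.items():
    print(f"{n} chunk(s): {total!r}")
```

Floating-point addition is not associative, so regrouping the same terms can change the last digits of the result (here, drastically, because the example is deliberately ill-conditioned).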
This could signal a small bug, but in my experience, most of those
cases can be linked to specific optimized mathematical libraries or
compiler versions. As long as we can blame somebody else, not a big
problem :-)
Paolo
On Tue, Jan 26, 2016 at 12:23 PM, Nick Wilson wrote:
Hi,
I’ve been testing the OpenMP build of Quantum Espresso 5.3.0 on
our system using the Intel compiler and MKL and have a question
about variation of energy with the number of OpenMP threads used.
I ran all the plane wave tests in the test-suite directory using
between 1 and 16 OpenMP threads and they all gave consistent
results apart from pw_noncolin/noncolin-constrain_total.in, which
showed variation between -55.54478325 Ry and -55.54478414 Ry.
I ran the test through the Intel Inspector tool but that didn’t
show up any threading deadlocks or data races.
I dropped the compiler optimisation to -O0 and added the
“-fp-model strict” and “-fp-model source” compiler flags but that
had no effect.
I tried using some of the relevant environment variables
(KMP_DETERMINISTIC_REDUCTION=1 and MKL_CBWR=COMPATIBLE) which also
had no effect.
Changing to use the internal BLAS library resolved the issue so it
looks to be MKL-related. It’s present with both the GNU and Intel
compilers.
I dropped back to an earlier version of MKL but the effect was
still present.
As it was thread-related I tried linking against the sequential
version of MKL but that didn’t help.
So, I guess my questions are:
Should the results always be independent of the number of OpenMP
threads?
Is there anything unique about the noncolin-constrain_total.in
test case which would cause it to behave differently from the rest
of the tests?
Best regards,
Nick Wilson
System details:
Intel Sandy Bridge E5-2650 CPU
CentOS Linux release 7.2.1511
MKL from Intel compiler 16.0.0
GNU compiler version 4.8.5
_______________________________________________
Pw_forum mailing list
[email protected] <mailto:[email protected]>
http://pwscf.org/mailman/listinfo/pw_forum
--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222