Very interesting, and excellent questions, for which unfortunately I have no clear answer (nor has anybody else, I am afraid).
One should obtain the same numbers, to within roundoff error, in serial, OpenMP, and MPI execution, on different machines, and with different compilers and mathematical libraries. In practice, there are invariably small differences that sometimes do not completely disappear even if one pushes convergence thresholds to very strict limits. In addition to noncolin-constrain_total.in, another notable offender is vdw-ts.in. This could signal a small bug, but in my experience, most of those cases can be linked to specific optimized mathematical libraries or compiler versions. As long as we can blame somebody else, not a big problem :-)

Paolo

On Tue, Jan 26, 2016 at 12:23 PM, Nick Wilson <[email protected]> wrote:

> Hi,
>
> I’ve been testing the OpenMP build of Quantum Espresso 5.3.0 on our
> system using the Intel compiler and MKL and have a question about variation
> of energy with the number of OpenMP threads used.
>
> I ran all the plane wave tests in the test-suite directory using between 1
> and 16 OpenMP threads and they all gave consistent results apart from
> pw_noncolin/noncolin-constrain_total.in which showed variation in
> between -55.54478325 Ry and -55.54478414 Ry.
>
> I ran the test through the Intel Inspector tool but that didn’t show up
> any threading deadlocks or data races.
>
> I dropped the compiler optimisation to -O0 and added the “-fp-model
> strict” and “-fp-model source” compiler flags but that had no effect.
>
> I tried using some of the relevant environment variables
> (KMP_DETERMINISTIC_REDUCTION=1 and MKL_CBWR=COMPATIBLE) which also had no
> effect.
>
> Changing to use the internal BLAS library resolved the issue so it looks
> to be MKL-related. It’s present with both the GNU and Intel compilers.
>
> I dropped back to an earlier version of MKL but the effect was still
> present.
>
> As it was thread-related I tried linking against the sequential version of
> MKL but that didn’t help.
>
> So, I guess my questions are:
> Should the results always be invariant of the number of OpenMP threads?
> Is there anything unique about the noncolin-constrain_total.in test case
> which would cause it to behave differently to the rest of the tests?
>
> Best regards,
> Nick Wilson
>
>
> System details:
> Intel Sandy Bridge E5-2650 CPU
> CentOS Linux release 7.2.1511
> MKL from Intel compiler 16.0.0
> GNU compiler version 4.8.5
> _______________________________________________
> Pw_forum mailing list
> [email protected]
> http://pwscf.org/mailman/listinfo/pw_forum

--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222
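
A footnote on the roundoff point above: floating-point addition is not associative, so a sum split into a different number of partial sums (as a threaded reduction may do) can differ in its last digits. The Python sketch below only illustrates that general effect. The data are made up, and chunked_sum is a hypothetical helper, not how OpenMP, MKL, or Quantum ESPRESSO actually perform reductions.

```python
import math
import random


def chunked_sum(values, n_chunks):
    """Sum `values` as n_chunks partial sums, then add the partial sums.

    Hypothetical stand-in for a threaded reduction: each chunk plays the
    role of one thread's partial result.
    """
    size = math.ceil(len(values) / n_chunks)
    partials = [sum(values[i:i + size]) for i in range(0, len(values), size)]
    return sum(partials)


random.seed(0)  # fixed seed so the made-up data are reproducible
# Made-up data: terms of mixed sign and widely varying magnitude.
values = [
    random.uniform(-1.0, 1.0) * 10.0 ** random.randint(-8, 2)
    for _ in range(100_000)
]

reference = math.fsum(values)  # correctly rounded sum, used as a baseline
for n in (1, 2, 4, 8, 16):
    total = chunked_sum(values, n)
    print(f"{n:2d} chunks: {total:.12f}  (total - fsum = {total - reference:+.3e})")
```

Any spread between the printed totals comes purely from the order of additions. This says nothing about why noncolin-constrain_total.in or vdw-ts.in in particular are sensitive.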
