What was the conv_thr in this example?
The default is 1.d-6, which is the same order of magnitude as the variations you are reporting.
If you used the default, could you try a tighter threshold?
stefano
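
To make the comparison concrete, here is a minimal sketch that checks the spread between the two energies quoted in the original report against the default conv_thr. The variable names are illustrative only, and conv_thr is treated here simply as a scale to compare against, not as a strict bound on differences between runs.

```python
# Extreme total energies quoted in the original report (Ry).
e_low = -55.54478414
e_high = -55.54478325

# Default SCF threshold mentioned above (1.d-6 in Fortran notation).
conv_thr = 1.0e-6

# Spread of the reported energies across thread counts.
spread = abs(e_high - e_low)

print(f"spread   = {spread:.2e} Ry")
print(f"conv_thr = {conv_thr:.2e} Ry")
print("spread is below conv_thr" if spread < conv_thr
      else "spread exceeds conv_thr")
```

The spread is roughly 9e-7 Ry, slightly below 1.d-6, which is why a rerun with a tighter threshold would help tell convergence noise apart from a library effect.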

On 26/01/2016 17:55, Paolo Giannozzi wrote:
Very interesting, and excellent questions, for which unfortunately I have no clear answer (nor has anybody else, I am afraid).

One should obtain the same numbers, within roundoff errors, in serial, OpenMP, and MPI execution, on different machines, and with different compilers and mathematical libraries. In practice, there are invariably small differences that sometimes do not completely disappear even when convergence thresholds are pushed to very strict limits. In addition to noncolin-constrain_total.in, another notable offender is vdw-ts.in.
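
One generic source of such roundoff differences is that floating-point addition is not associative, so a sum split across a different number of workers can round differently. The sketch below only illustrates that general effect. It is not taken from the Quantum ESPRESSO or MKL sources, and chunked_sum is a hypothetical helper that mimics a parallel reduction.

```python
import math
import random

# Regrouping the same three numbers changes the rounded result.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
print(repr(a), repr(b), a == b)


def chunked_sum(values, n_chunks):
    """Sum contiguous blocks separately, then add the partial sums.

    This mimics a reduction split over n_chunks workers (hypothetical
    helper, for illustration only).
    """
    size = -(-len(values) // n_chunks)  # ceiling division
    partials = [sum(values[i:i + size]) for i in range(0, len(values), size)]
    return sum(partials)


random.seed(0)
values = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

# Correctly rounded reference sum for comparison.
reference = math.fsum(values)

results = {n: chunked_sum(values, n) for n in (1, 2, 4, 8, 16)}
for n, total in results.items():
    print(f"{n:2d} chunks: {total!r}  (diff from fsum: {total - reference:.3e})")
```

The totals agree to many digits but need not be bit-for-bit identical, which is the kind of variation one would expect from roundoff alone.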

This could signal a small bug, but in my experience, most of those cases can be linked to specific optimized mathematical libraries or compiler versions. As long as we can blame somebody else, not a big problem :-)

Paolo

On Tue, Jan 26, 2016 at 12:23 PM, Nick Wilson <[email protected]> wrote:

    Hi,

    I’ve been testing the OpenMP build of Quantum Espresso 5.3.0 on
    our system using the Intel compiler and MKL and have a question
    about variation of energy with the number of OpenMP threads used.

    I ran all the plane wave tests in the test-suite directory using
    between 1 and 16 OpenMP threads and they all gave consistent
    results apart from pw_noncolin/noncolin-constrain_total.in, which
    varied between -55.54478325 Ry and -55.54478414 Ry.

    I ran the test through the Intel Inspector tool but that didn’t
    show up any threading deadlocks or data races.

    I dropped the compiler optimisation to -O0 and added the
    “-fp-model strict” and “-fp-model source” compiler flags but that
    had no effect.

    I tried using some of the relevant environment variables
    (KMP_DETERMINISTIC_REDUCTION=1 and MKL_CBWR=COMPATIBLE) which also
    had no effect.

    Changing to use the internal BLAS library resolved the issue so it
    looks to be MKL-related. It’s present with both the GNU and Intel
    compilers.

    I dropped back to an earlier version of MKL but the effect was
    still present.

    As it was thread-related I tried linking against the sequential
    version of MKL but that didn’t help.

    So, I guess my questions are:
    Should the results always be independent of the number of OpenMP
    threads?
    Is there anything unique about the noncolin-constrain_total.in
    test case which would cause it to behave differently to the rest
    of the tests?

    Best regards,
    Nick Wilson


    System details:
      Intel Sandy Bridge E5-2650 CPU
      CentOS Linux release 7.2.1511
      MKL from Intel compiler 16.0.0
      GNU compiler version 4.8.5
    _______________________________________________
    Pw_forum mailing list
    [email protected]
    http://pwscf.org/mailman/listinfo/pw_forum




--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222


