Hi Federico

Have you been working with the 6.3 stable version or with the develop version(s)?
Does the error occur only with this system or also with other systems?
Which MPI library are you using?
Have you tried compiling with the ELPA library?

Sorry for replying with more questions and no answers.
Pietro



On 03/01/2019 04:58 PM, IORI, Federico wrote:
Hi everybody.
I am working with QE 6.3 and, as of today, also testing QE 6.4 on a CuFeO2 supercell (4x4x1, a large system). For the past few weeks I have been hitting the error below, seemingly at random but very often. It pops up after the first or second scf iteration following the initialization of the wfc.

forrtl: severe (71): integer divide by zero
Image              PC                Routine            Line     Source
pw.x               0000000000CFE59E  Unknown            Unknown  Unknown
libpthread-2.22.s  00007FFFEFCA5C10  Unknown            Unknown  Unknown
libmkl_scalapack_  00007FFFF790512D  pzhbrdb_           Unknown  Unknown
libmkl_scalapack_  00007FFFF781765C  pzherdb_           Unknown  Unknown
libmkl_scalapack_  00007FFFF77842B9  mkl_pzheevd0_      Unknown  Unknown
libmkl_scalapack_  00007FFFF7782984  mkl_pzheevdm_      Unknown  Unknown
libmkl_scalapack_  00007FFFF778191C  pzheevd_           Unknown  Unknown
pw.x               0000000000B8E667  zhpev_module_mp_p  1566     zhpev_drv.f90
pw.x               0000000000B59CD3  pcdiaghg_          339      cdiaghg.f90
pw.x               0000000000A8B37A  pcegterg_          957      cegterg.f90
pw.x               0000000000659331  diag_bands_        497      c_bands.f90
pw.x               00000000006569FB  c_bands_           101      c_bands.f90
pw.x               000000000040CAF1  electrons_scf_     566      electrons.f90
pw.x               0000000000409C04  electrons_         152      electrons.f90
pw.x               00000000005765BA  run_pwscf_         133      run_pwscf.f90
pw.x               00000000004077B5  MAIN__             98       pwscf.f90
pw.x               000000000040761E  Unknown            Unknown  Unknown
libc-2.22.so       00007FFFEF613725  __libc_start_main  Unknown  Unknown
pw.x               0000000000407529  Unknown            Unknown  Unknown

Both QE versions were compiled with Intel 2018 on a Xeon Gold 6128-based cluster, linking against the MKL libraries.
The error is reproduced on all the different nodes of the cluster.

I checked that zhpev_drv.f90, where the error seems to originate, is part of LAXlib, but I still have no idea why or how it happens.
I am not convinced it is related to a memory problem.

Is there anyone who can give me some hints?

I attach pw.in and pw.out for the sake of completeness.

Thanks so much.
Federico

--
Federico IORI

Computational material scientist

Paris-Saclay Research Center

1 chemin de la Porte des Loges, Les Loges en Josas – 78354 Jouy en Josas cedex

Mail: [email protected]

Phone: +33 7 621 605 15




_______________________________________________
users mailing list
[email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users

