Dear Jatin,

it's very hard to tell what the problem is without additional details.

Can you share your input?
Can you try running without pool parallelism (to reduce the memory footprint)?

Since you _may_ be hitting a code-related problem, you can also consider opening a confidential issue on GitLab if you do not want to disclose some details.

Best,
Pietro



On 3/22/21 5:24 AM, Jatin Kashyap wrote:
Dear QE Community Members,

I am trying to run Program PWSCF v.6.7MaX on the XSEDE Comet cluster with the given configuration [1],
but the code is exiting with an error [2].

Can anybody please help me find out how to fix it, if it is not a machine error?

Thank you.

[1]
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
#SBATCH --mem=51G
#SBATCH --gres=gpu:p100:2

[2]
  iteration #  1     ecut=    40.00 Ry     beta= 0.70
Warning: ieee_inexact is signaling
     1
      Davidson diagonalization with overlap
  zhegvdx_gpu error: cusolverDnZpotrf failed!

  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
      Error in routine  cdiaghg_gpu (1):
       zhegvdx_gpu failed
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

      stopping ...



——
Jatin Kashyap
Ph.D. Student
Dr. Dibakar Datta Group
Department of Mechanical and Industrial Engineering
New Jersey Institute of Technology (NJIT)
University Heights
Newark, NJ 07102-1982
Phone- (201)889-5783
Email- [email protected] <mailto:[email protected]>


_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users

