Dear Jatin,
it's very hard to tell what the problem is without additional details.
Can you share your input?
Can you try running without pool parallelism (to reduce the memory
footprint)?
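For instance, a minimal sketch of such a run, assuming pw.x is on your PATH and is launched with mpirun; the input file name scf.in and the launcher are placeholders, not taken from your setup. The -nk option (also spelled -npool) sets the number of k-point pools, and -nk 1 disables pool parallelism:

```shell
#!/bin/bash
# Sketch only: launcher, executable path and input name are assumptions.
NPROC=2          # MPI tasks, matching --ntasks-per-node=2 in the job script
NPOOL=1          # one k-point pool, i.e. no pool parallelism
INPUT="scf.in"   # hypothetical input file name

# Build the command line and print it (dry run).
CMD="mpirun -np ${NPROC} pw.x -nk ${NPOOL} -i ${INPUT}"
echo "Would run: ${CMD}"

# To actually launch the calculation, uncomment the next line:
# ${CMD} > scf.out
```

With a single pool, the k-points are no longer distributed over separate groups of processes, which should lower the memory needed per task.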
Since you _may_ be hitting a code-related problem, you can also consider
opening a confidential issue on GitLab if you do not want to disclose
some details.
Best,
Pietro
On 3/22/21 5:24 AM, Jatin Kashyap wrote:
Dear QE Community Members,
I am trying to run PWSCF v.6.7MaX on the XSEDE Comet cluster
with the configuration given in [1],
but the code exits with the error shown in [2].
Can anybody please help me find out how to fix it, if it is not a
machine error?
Thank you.
[1]
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
#SBATCH --mem=51G
#SBATCH --gres=gpu:p100:2
[2]
iteration # 1 ecut= 40.00 Ry beta= 0.70
Warning: ieee_inexact is signaling
1
Davidson diagonalization with overlap
zhegvdx_gpu error: cusolverDnZpotrf failed!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Error in routine cdiaghg_gpu (1):
zhegvdx_gpu failed
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
stopping ...
——
Jatin Kashyap
Ph.D. Student
Dr. Dibakar Datta Group
Department of Mechanical and Industrial Engineering
New Jersey Institute of Technology (NJIT)
University Heights
Newark, NJ 07102-1982
Phone- (201)889-5783
Email- [email protected]
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users