Dear Sara,

I'd suggest checking the following:

1. verify that the serial eigensolver is used (it's written at the
beginning of the output);

2. use the latest version (6.6a1), which correctly reports problems
with memory allocations during the iterative diagonalization.
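
As a rough illustration of why memory can become an issue when the serial
eigensolver is used with many bands, here is a minimal Python sketch. It is
only an estimate under stated assumptions, not taken from the QE source: it
assumes the reduced matrices are dense, complex double precision, of
dimension ndim x nbnd, and that each one is held whole by a single process.
The name reduced_matrix_gib and the multiplier ndim are hypothetical and
introduced only for this example.

```python
# Rough size of one dense reduced matrix handled by a serial eigensolver.
# ASSUMPTIONS (not from the QE source): the matrix is complex double
# precision (16 bytes per element) and has dimension ndim * nbnd, where
# ndim is a hypothetical multiplier for the iterative-solver workspace.

BYTES_PER_COMPLEX_DOUBLE = 16


def reduced_matrix_gib(nbnd: int, ndim: int) -> float:
    """Return the size in GiB of one (ndim*nbnd) x (ndim*nbnd) complex matrix."""
    n = ndim * nbnd
    return n * n * BYTES_PER_COMPLEX_DOUBLE / 2**30


if __name__ == "__main__":
    # Band counts mentioned in the thread, with two illustrative multipliers.
    for nbnd in (1000, 3000, 4000):
        for ndim in (2, 4):
            size = reduced_matrix_gib(nbnd, ndim)
            print(f"nbnd={nbnd:5d} ndim={ndim}: {size:6.2f} GiB per matrix")
```

Under these assumptions, 4000 bands with ndim = 2 already gives about
0.95 GiB per matrix, and the size grows quadratically with the number of
bands, so several such matrices per process can add up quickly.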

Could you please also open an issue at
https://gitlab.com/QEF/q-e-gpu/-/issues and attach the input, the
pseudopotentials and the job script that you are using?

Thank you,
kind regards,
Pietro



On 8/29/20 6:33 PM, Sara Postorino wrote:
Hi QE users,

I am running PW on Marconi100 and experiencing problems during
diagonalization. I am using version 6.5 (autoload of the modules on m100).
My system is a MoTe2 bilayer with a 39x39x1 k-mesh and many bands, because
I will do a GW calculation on top of it. (The calculation works if I do
not add many bands.)
I tried with 4000 and 3000 bands using Davidson diagonalization running
on 18 nodes:
Parallel version (MPI & OpenMP), running on    2304 processor cores
      Number of MPI processes:                72
      Threads/MPI process:                    32
When doing the calculation of the first point I get:

  Really copied g2kin H->D
  Really copied evc H->D
  Really copied et H->D
  Really copied vrs H->D
  dp_memcpy_d2h_c2dinvalid pitch argument           12

I also tried the conjugate gradient algorithm, but it gets stuck at

  Really copied evc H->D
  Really copied et H->D
  Really copied h_diag H->D
  Really copied becp%nc H->D
  Really copied g2kin H->D
  Really copied vrs H->D

And here it takes forever. I left it running for more than 1 hour and it
didn't finish one k-point, and since I have 147 k-points the computation
would be very expensive even if it worked.

I also tried to go down to 1000 bands (I need way more) and got:
  Really copied g2kin H->D
  Really copied evc H->D
  Really copied et H->D
  Really copied vrs H->D
  zhegvdx_gpu error: cusolverDnZpotrf failed!

  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
      Error in routine  cdiaghg_gpu (1):
       zhegvdx_gpu failed
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Do you have any suggestion on how to fix this issue?
Thanks

Sara Postorino
PhD student
University of Rome Tor Vergata


_______________________________________________
Quantum ESPRESSO is supported by MaX (http://www.max-centre.eu/quantum-espresso)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users

