I don't think that the electron-phonon calculation has been ported to GPUs.

Paolo


On 11/25/25 06:52, Dolon Pal via users wrote:
Dear Quantum ESPRESSO Developers and Community,

I am writing to report a persistent runtime error in the GPU-accelerated version of |ph.x| (Quantum ESPRESSO v7.5) when calculating electron-phonon coefficients using the OpenACC port.

While the code successfully calculates the Dynamical Matrices and Frequencies on the GPU, it consistently crashes during the final electron-phonon interaction step (routine |elphon|) with a File I/O error, specifically related to the temporary file |a2Fsave|.

*1. System and Compilation Details:*

  * *Version:* Quantum ESPRESSO v7.5 (GitLab release)

  * *Compiler:* NVIDIA HPC SDK v24.9

  * *Configuration:* |./configure --enable-openacc --with-cuda=yes --with-cuda-cc=89 --with-cuda-runtime=12.6|

  * *Hardware:* NVIDIA RTX 4090 (Ada Lovelace)

  * *MPI:* OpenMPI (via NVIDIA HPC SDK)

*2. The Issue:* When running |ph.x| with |electron_phonon = 'interpolated'| (or any mode that triggers |elphon|), the execution aborts immediately after diagonalizing the dynamical matrix for the first q-point. The crash occurs regardless of the MPI parallelization level (reproduced with both |-np 1| and |-np 8|).

*3. Error Log:* The crash points to a read error in |elphon.f90| attempting to read a file that appears to be empty or not flushed to disk.

    |FIO-F-217/list-directed read/unit=40/attempt to read past end of file.
    File name = './out/mgb2.a2Fsave', formatted, sequential access record = 1
    In source file /path/to/q-e/PHonon/PH/elphon.f90, at line number 847|
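One way to test the "empty or not flushed" reading of this error is to inspect the file right after the crash. The following is only a minimal sketch in Python: `describe_file` is a hypothetical helper (not part of Quantum ESPRESSO), and the path is simply the one quoted in the error message.

```python
import os


def describe_file(path):
    """Return 'missing', 'empty', or '<n> bytes' for the given path."""
    if not os.path.exists(path):
        return "missing"
    size = os.path.getsize(path)
    return "empty" if size == 0 else f"{size} bytes"


if __name__ == "__main__":
    # Path taken from the error message above; adjust outdir/prefix as needed.
    print(describe_file("./out/mgb2.a2Fsave"))
```

An "empty" result would be consistent with the file having been opened but never written, whereas "missing" would suggest it was never created in the first place.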

*4. Reproduction Case (MgB2):* I reproduced this using a standard MgB2 test case.

/Input snippet (|ph.in|):/

    |&INPUTPH
      tr2_ph = 1.0d-14,
      prefix = 'mgb2',
      outdir = './out',
      fildyn = 'mgb2.dyn',
      fildvscf = 'mgb2.dvscf',
      electron_phonon = 'interpolated',  ! <--- Triggers the crash
      trans = .true.,
      ldisp = .true.,
      nq1=6, nq2=6, nq3=4
    /|

*5. Observations:*

 1. *Pure Phonons work:* If I comment out |electron_phonon|, the GPU run
    finishes successfully and writes |.dyn| and |.dvscf| files.

 2. *CPU Works:* The exact same input runs successfully on the CPU-only
    binary (gfortran compilation).

 3. *File Incompatibility:* I attempted to run the heavy phonon
    calculation on the GPU and the final electron-phonon collection on
    the CPU (using |recover=.true.| or |trans=.false.|), but the CPU
    binary cannot read the GPU-generated |.dvscf|/binary files
    ("problems reading u" error), likely due to binary format/padding
    differences between |nvfortran| and |gfortran|.

It appears there is a race condition or file handling issue in the OpenACC implementation of the |elphon| routine where the |a2Fsave| file is read before it is successfully written/closed.

Any advice on a workaround or a patch for |elphon.f90| to stabilize the GPU I/O would be greatly appreciated.

Thank you for your time and for developing this software.

Best regards,

Dholon Kumar Paul

Research Assistant, BRAC University, Bangladesh


_______________________________________________________________________________
The Quantum ESPRESSO Foundation stands in solidarity with all civilians 
worldwide who are victims of terrorism, military aggression, and indiscriminate 
warfare.
--------------------------------------------------------------------------------
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users

--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 206, 33100 Udine Italy, +39-0432-558216
