Hello
Usually the "libgomp: TODO" error message occurs because the program has been linked against a CUDA version that is not compatible with the GPU driver running on the compute nodes. You should ask the system administrators of your cluster which toolchain to use. Alternatively, check the CUDA and driver versions and report them on the forum to get more help.

Pietro

On 24 Sep 2024, 7:06 PM, "Hazra, Shilpa" <[email protected]> wrote:

Hello,

I am using the CUDA version of Quantum ESPRESSO. The input and the job submission script I am using are given below.

INPUT:

&control
  calculation = 'vc-relax'
  prefix = 'silicon'
  outdir = './tmp/'
  pseudo_dir = './'
  etot_conv_thr = 1e-5
  forc_conv_thr = 1e-4
/
&system
  ibrav=2, celldm(1) =14, nat=2, ntyp=1, ecutwfc=30
/
&electrons
  conv_thr=1e-8
/
&ions
/
&cell
  cell_dofree='ibrav'
/
ATOMIC_SPECIES
Si 28.0855 Si.pbe-n-kjpaw_psl.1.0.0.UPF
ATOMIC_POSITIONS (alat)
Si 0.00 0.00 0.00 0 0 0
Si 0.25 0.25 0.25 0 0 0
K_POINTS (automatic)
6 6 6 1 1 1

SCRIPT:

#!/bin/bash
#SBATCH --job-name="test2"
#SBATCH --output="test2.out"
#SBATCH --partition=gpuA40x4
#SBATCH --mem=16G
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=16 # spread out to use 1 core per numa, set to 64 if tasks is 1
#SBATCH --constraint="scratch"
#SBATCH --gpus-per-node=4
#SBATCH --gpu-bind=closest # select a cpu close to gpu on pci bus topology
#SBATCH --account=bcox-delta-gpu # <- match to a "Project" returned by the "accounts" command
#SBATCH -t 08:00:00
#SBATCH -e slurm-%j.err
#SBATCH -o slurm-%j.out

module reset
module load nvhpc/22.11
module load openmpi/4.1.5+cuda
module load quantum-espresso/7.3.1+cuda

export OMP_NUM_THREADS=16 # if code is not multithreaded, otherwise set to 8 or 16

srun pw.x -N 1 -n 1 test2.in > test2.out

The job appears to run normally, but no output is written to the output file even after the job has run for 30 hours. In addition, I get the following error message in the .err file:

libgomp: TODO
srun: error: gpub075: task 0: Exited with exit code 1

I really do not understand why this error occurs. I tried to adjust the script for parallel computing, but I get the same error every time. Any help in solving this problem would be greatly appreciated.

Thank you.

Sincerely,
Shilpa Hazra
Ph.D., Chemistry, University of Illinois Chicago
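
To gather the information suggested in the reply above (the CUDA and driver versions), something along these lines could be run on a GPU node, for example from inside the same Slurm job after the "module load" lines. This is only a sketch: the helper name "report" is made up for illustration, and it assumes that nvidia-smi, nvcc and ldd are available on the cluster, which may not be true everywhere.

```shell
#!/bin/bash
# Sketch: collect CUDA toolkit and driver versions to post on the list.
# "report" is a hypothetical helper, not part of Quantum ESPRESSO or Slurm.

# Run a tool if it is available; otherwise say so instead of failing.
report() {
    local tool="$1"
    shift
    if command -v "$tool" >/dev/null 2>&1; then
        echo "== $tool $* =="
        "$tool" "$@" 2>&1 || echo "($tool exited with an error)"
    else
        echo "== $tool: not found in PATH =="
    fi
}

# GPU model and driver version as seen on the node.
report nvidia-smi --query-gpu=name,driver_version --format=csv

# CUDA toolkit version provided by the loaded modules.
report nvcc --version

# Modules currently loaded (only works where the "module" command exists).
report module list

# CUDA and OpenMP runtime libraries that pw.x resolves at run time.
if command -v pw.x >/dev/null 2>&1; then
    ldd "$(command -v pw.x)" | grep -i -E 'cuda|gomp|nvomp' \
        || echo "(no CUDA or OpenMP runtime libraries listed)"
else
    echo "== pw.x: not found in PATH =="
fi
```

Posting the full output together with the cluster name should make it easier for others to see whether the toolkit used to build pw.x matches what the driver on the node supports.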
_______________________________________________
The Quantum ESPRESSO community stands by the Ukrainian people and expresses its concerns about the devastating effects that the Russian military offensive has on their country and on the free and peaceful scientific, cultural, and economic cooperation amongst peoples
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users
