Hello,
This is strange, because QE v7.3 is much faster than 6.7, especially on GPUs. The 
slowdown most likely has to do with some fine-tuning in how the cluster is used.
You should ask the system managers of your cluster for help.
Just trying to guess:

  1. The problem might be hyperthreading, so make sure that OMP_NUM_THREADS is
     set to 1.
  2. Check whether GPU-aware MPI communication is working: try compiling with
     --with-cuda-mpi=no.
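
A minimal sketch of the two checks above, written as shell functions. The function names, the rank count, and the input/output file names are placeholders chosen for illustration; the configure options are the ones quoted in the original message below, with only --with-cuda-mpi switched to "no". It assumes pw.x and mpirun are on the PATH and that the HPC-SDK environment has already been sourced.

```shell
# Check 1: run pw.x with a single OpenMP thread per MPI rank, so that
# ranks do not oversubscribe the hyper-threaded cores.
# $1 = number of MPI ranks, $2 = input file, $3 = output file
run_single_thread() {
    export OMP_NUM_THREADS=1
    mpirun -np "$1" pw.x -input "$2" > "$3"
}

# Check 2: rebuild without GPU-aware MPI, to see whether that code path
# is responsible for the slowdown. Run from the top of the QE source tree.
# "installation-location" is the placeholder used in the original message.
rebuild_without_cuda_mpi() {
    ./configure --prefix=installation-location --with-cuda="$CUDA_ROOT" \
        --with-cuda-runtime=12.2 --with-cuda-cc=80 --enable-openmp \
        --with-scalapack=no --with-cuda-mpi=no &&
    make all -j8 &&
    make install
}
```

For example, "run_single_thread 8 pw.in pw.out" would start one rank per GPU on a single node of the machine described below; whether one rank per GPU is the right layout for this job is an assumption.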

Hope it helps.
Best regards,
Pietro
________________________________
From: users <[email protected]> on behalf of Niharika 
Joshi <[email protected]>
Sent: Tuesday, October 15, 2024 09:34
To: Quantum ESPRESSO users Forum <[email protected]>
Subject: [QE-users] Large time lag post software upgradation in HPC system

Dear QE users,
I have been using an HPC resource for more than a year with QE (6.7MaX, GPU) 
without any issue. My present research problem focuses on studying methane and 
carbon dioxide adsorption on spinel surfaces. The system is large, with more 
than 380 atoms and ~3500 electrons. Normally, 2-3 ionic cycles (with 60-70 
iterations) complete within a day. However, the software on the computing 
system was recently upgraded, and since then I have observed a huge slowdown 
in my calculations. Currently, only a few iterations are performed in 24 hours.
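
One way to put numbers on the slowdown is to pull the cumulative timings out of an old and a new output file and compare them iteration by iteration. The sketch below assumes that pw.x reports a line of the form "total cpu time spent up to now is ... secs" after each SCF iteration; the function name and the file names are placeholders.

```shell
# Print the cumulative time (in seconds) reported after each SCF iteration.
# $1 = pw.x output file
scf_times() {
    grep 'total cpu time spent up to now is' "$1" | awk '{print $(NF-1)}'
}
```

Running "scf_times old_run.out" and "scf_times new_run.out" side by side would show whether every iteration became slower or whether the time is lost at a particular stage.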

Please find below two tables listing the hardware specifications and the 
software upgrade details of the computing system.

Component                    Specification
---------------------------  ------------------------------------------------
CPU                          AMD EPYC 7742 64C 2.25GHz
CPU cores                    128 cores (dual socket, 64 cores each);
                             256 logical cores with hyper-threading
L3 cache                     256 MB
RAM                          1 TB
GPU                          NVIDIA A100-SXM4
GPU memory                   40 GB
Total no. of GPUs per node   8
Storage                      10.5 PiB PFS based storage
Networking                   Mellanox ConnectX-6 VPI (InfiniBand HDR)


Software                 Upgrade
-----------------------  ----------------------------------------------------
OS                       Ubuntu 20.04.02 (DGX OS 5.0.5) to
                         Ubuntu 22.04.04 (DGX OS 6.3.0)
Kernel                   5.4.0-80-generic to 5.15.0-1062-nvidia
CUDA                     10.1 to 12.4 (earlier versions are also available)
NVIDIA driver version    450.142.00 to 550.90.07

After the software upgrade, QE 7.3 was installed as follows:

Step 1. Source the HPC-SDK environment:
source /opt/hpc-sdk-23.9/env.sh

Step 2. Configure the build:
./configure --prefix=installation-location --with-cuda=$CUDA_ROOT \
  --with-cuda-runtime=12.2 --with-cuda-cc=80 --enable-openmp \
  --with-scalapack=no --with-cuda-mpi=yes

Step 3. Compile the source code:
make all -j8

Step 4. Install the compiled binaries:
make install

Kindly suggest a solution to this problem. Any advice or suggestion at this 
point would be very helpful to me.

With best regards,
Niharika Joshi,
National Post Doctoral Fellow,
CSIR National Chemical Laboratory, Pune,
Maharashtra-411008, India.



_______________________________________________
The Quantum ESPRESSO community stands by the Ukrainian
people and expresses its concerns about the devastating
effects that the Russian military offensive has on their
country and on the free and peaceful scientific, cultural,
and economic cooperation amongst peoples
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users
