Re: [QE-users] [QE-GPU] compiling q-e-gpu-qe-gpu-6.7
Dear Pietro,

Thanks for sharing the link. I got an error about the missing v0.3.1.tar.gz file in the archive folder, so I downloaded it manually. After that everything went well and I could run jobs. However, a few things are still unclear to me.

Configure options:

./configure FC=pgf90 F90=pgf90 CC=pgcc --with-cuda=yes --enable-cuda-env-check=no --with-cuda-runtime=11.0 --with-cuda-cc=70 --enable-openmp --with-scalapack=no

The make.inc file was changed based on the install instructions on this page: https://gitlab.com/QEF/q-e-gpu/-/wikis/home

DFLAGS = -D__CUDA -D__DFTI -D__MPI

was changed to

DFLAGS = -D__CUDA -D__DFTI -D__MPI -D__GPU_MPI

Since two cards are installed on the mainboard, I ran the jobs in this form:

mpirun -np 2 pw.x -input file.in | tee file.out

The hardware specifications in the output file:

    GPU used by master process:
    Device Number: 0
    Device name: TITAN V
    Compute capability: 70
    Ratio of single to double precision performance: 2
    Memory Clock Rate (KHz): 850000
    Memory Bus Width (bits): 3072
    Peak Memory Bandwidth (GB/s): 652.80

I wonder why the second card (device 1) is not printed in the output.

Since -D__DFTI is set in make.inc, the MKL and FFTW libraries of Intel Parallel Studio are utilized. Is this an appropriate configuration to get the best performance?

Is it possible to compile the code for multiple graphics cards with different cuda-cc values?

I'm sorry for asking so many questions. I appreciate your time in responding to whatever you can, when you are able to find the time.

Best,
Mohammad
ShirazU

On Sun, Dec 20, 2020 at 9:08 PM Pietro Bonfa' wrote:
> Dear Mohammad,
>
> for some reason you are having trouble accessing gitlab. I uploaded a
> package that includes all dependencies and can be compiled without
> network access.
> You can find it here:
>
> https://univpr-my.sharepoint.com/:u:/g/personal/pietro_bonfa_unipr_it/EV-nHENjf1lFkat0RvJypFIBap2o92v9BzG75po06z48WA?e=uiDjDD
>
> Best wishes,
> Pietro
>
> --
> Pietro Bonfà,
> University of Parma
>
> On 12/19/20 7:27 AM, Mohammad Moaddeli wrote:
> > Dear Louis and Pietro,
> > [...]
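On the question of why only one card is printed: the QE output banner reports only the GPU bound to the master MPI process, so with two ranks the second card may well be in use even though device 1 never appears. As far as I know QE-GPU distributes ranks over visible devices on its own, but one way to make the rank-to-GPU mapping explicit is a small wrapper that sets CUDA_VISIBLE_DEVICES per rank before launching pw.x. A sketch, assuming OpenMPI (which exports OMPI_COMM_WORLD_LOCAL_RANK; other MPI implementations use a different variable name, and the script name bind_gpu.sh is made up for the example):

```shell
#!/bin/sh
# bind_gpu.sh -- hypothetical wrapper giving each local MPI rank its own GPU.
# Assumes OpenMPI, which exports OMPI_COMM_WORLD_LOCAL_RANK; adjust the
# variable name for other MPI implementations.

# round-robin mapping: local rank -> device index
pick_gpu() {
    echo $(( $1 % $2 ))
}

NGPUS=2                                   # two TITAN V cards on this board
RANK=${OMPI_COMM_WORLD_LOCAL_RANK:-0}
CUDA_VISIBLE_DEVICES=$(pick_gpu "$RANK" "$NGPUS")
export CUDA_VISIBLE_DEVICES
exec "$@"                                 # launch the wrapped command
```

Usage would be `mpirun -np 2 ./bind_gpu.sh pw.x -input file.in | tee file.out`. Note that once CUDA_VISIBLE_DEVICES is set this way, each rank sees its assigned card as device 0, so the banner would still print "Device Number: 0".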
Re: [QE-users] [QE-GPU] compiling q-e-gpu-qe-gpu-6.7
Dear Mohammad,

for some reason you are having trouble accessing gitlab. I uploaded a package that includes all dependencies and can be compiled without network access. You can find it here:

https://univpr-my.sharepoint.com/:u:/g/personal/pietro_bonfa_unipr_it/EV-nHENjf1lFkat0RvJypFIBap2o92v9BzG75po06z48WA?e=uiDjDD

Best wishes,
Pietro

--
Pietro Bonfà,
University of Parma

On 12/19/20 7:27 AM, Mohammad Moaddeli wrote:
> [...]
Re: [QE-users] [QE-GPU] compiling q-e-gpu-qe-gpu-6.7
Dear Louis and Pietro,

*With the following config options:*

*./configure FC=pgf90 F90=pgf90 CC=pgcc --with-cuda=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda/11.0 --with-cuda-runtime=11.0 --with-cuda-cc=70 --enable-openmp --with-scalapack=no*

*the result is:*

checking build system type... x86_64-pc-linux-gnu
checking ARCH... x86_64
checking setting AR... ... ar
checking setting ARFLAGS... ... ruv
checking whether the Fortran compiler works... yes
checking for Fortran compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU Fortran compiler... no
checking whether pgf90 accepts -g... yes
configure: WARNING: F90 value is set to be consistent with value of MPIF90
checking for mpiifort... no
checking for mpif90... mpif90
checking whether we are using the GNU Fortran compiler... no
checking whether mpif90 accepts -g... yes
checking version of mpif90... nvfortran 20.11-0
checking for Fortran flag to compile .f90 files... none
setting F90... nvfortran
setting MPIF90... mpif90
checking whether we are using the GNU C compiler... yes
checking whether pgcc accepts -g... yes
checking for pgcc option to accept ISO C89... none needed
setting CC... pgcc
setting CFLAGS... -fast -Mpreprocess
using F90... nvfortran
setting FFLAGS... -O1
setting F90FLAGS... $(FFLAGS)
setting FFLAGS_NOOPT... -O0
setting CPP... cpp
setting CPPFLAGS... -P -traditional -Uvector
setting LD... mpif90
setting LDFLAGS...
checking for Fortran flag to compile .f90 files... (cached) none
checking whether Fortran compiler accepts -Mcuda=cuda11.0... yes
checking for nvcc... /opt/nvidia/hpc_sdk/Linux_x86_64/20.11/compilers/bin/nvcc
checking whether nvcc works... yes
checking for cuInit in -lcuda... no
configure: error: in `/codes/qe_6.7_GPU/q-e-gpu-qe-gpu-6.7':
configure: error: Couldn't find libcuda
See `config.log' for more details

*However, with the option --enable-cuda-env-check=no the configuration finished:*

*./configure FC=pgf90 F90=pgf90 CC=pgcc --with-cuda=yes --enable-cuda-env-check=no --with-cuda-runtime=11.0 --with-cuda-cc=70 --enable-openmp --with-scalapack=no*

checking build system type... x86_64-pc-linux-gnu
[... same checks as in the first run ...]
checking for Fortran flag to compile .f90 files... (cached) none
checking whether Fortran compiler accepts -Mcuda=cuda11.0... yes
checking for /usr/local/cuda/... no
checking for /usr/local/cuda/include... no
checking for /usr/local/cuda/lib64... no
checking for nvcc... /opt/nvidia/hpc_sdk/Linux_x86_64/20.11/compilers/bin/nvcc
checking whether nvcc works... yes
checking for cusolverDnZhegvdx_bufferSize in -lcusolver... no
configure: WARNING: Using legacy custom solver.
checking whether make sets $(MAKE)... yes
checking whether Fortran files must be preprocessed... yes
checking whether we are using the GNU Fortran 77 compiler... no
checking whether nvfortran accepts -g... yes
checking for library containing dgemm... no
MKL not found in /opt/intel/mkl/lib/intel64:
checking for library containing dgemm... -lmkl_intel_lp64
setting BLAS_LIBS... -L/opt/intel/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core
checking FFT...
checking MASS...
checking for library containing mpi_init... none required
checking ELPA...
checking BEEF... -lbeef
setting BEEF_LIBS... $(TOPDIR)/LIBBEEF/libbeef.a
checking for ranlib... ranlib
checking for wget... wget -O
setting WGET... wget -O
setting DFLAGS... -D__CUDA -D__DFTI -D__MPI
setting IFLAGS... -I$(TOPDIR)/include -I$(TOPDIR)/FoX/finclude -I/opt/intel/mkl/include
configure: creating ./config.status
config.status: creating install/make_lapack.inc
config.status: creating include/configure.h
config.status: creating make.inc
conf
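Regarding the `checking for cuInit in -lcuda... no` failure above: libcuda.so belongs to the NVIDIA driver rather than the CUDA toolkit, so it normally lives in the system linker path (e.g. /usr/lib64) and not under the HPC SDK tree that --with-cuda points to. A quick way to see whether the linker can find it is to grep the ldconfig cache. A sketch (the demo path is hypothetical, just mirroring the `ldconfig -p` line format):

```shell
# extract_libcuda: pull the resolved path of the driver library out of
# `ldconfig -p` output, whose lines look like
#   libcuda.so.1 (libc6,x86-64) => /usr/lib64/libcuda.so.1
extract_libcuda() {
    awk '/libcuda\.so/ { print $NF; exit }'
}

# On the real machine one would run:  ldconfig -p | extract_libcuda
# Demo on a canned line (hypothetical path):
printf 'libcuda.so.1 (libc6,x86-64) => /usr/lib64/libcuda.so.1\n' | extract_libcuda
# prints /usr/lib64/libcuda.so.1
```

If nothing is found, adding the directory that actually contains libcuda.so to LD_LIBRARY_PATH (or to LDFLAGS) before re-running configure is usually enough to make the cuInit test pass without disabling the environment check.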
Re: [QE-users] [QE-GPU] compiling q-e-gpu-qe-gpu-6.7
Dear Mohammad,

CUDA may be installed somewhere else; anyway, if you want to skip the environment check, you may configure QE with this command:

./configure FC=pgf90 F90=pgf90 CC=pgcc --with-cuda=yes --enable-cuda-env-check=no --with-cuda-runtime=SETME --with-cuda-cc=70 --enable-openmp

Remember to set the cuda runtime according to what is provided by your setup.

Hope this helps,
best,
Pietro

---
Pietro Bonfà
University of Parma

On 12/16/20 9:10 AM, Mohammad Moaddeli wrote:
> [...]
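The SETME placeholder has to match the toolkit version the compiler actually targets. One way to read that version off is to parse the `nvcc --version` banner; a sketch (the canned line mirrors the CUDA 11.0 banner format, which may differ between releases, and `nvcc_release` is a made-up helper name):

```shell
# nvcc_release: extract the "release X.Y" number from `nvcc --version`
# output, e.g. from "Cuda compilation tools, release 11.0, V11.0.194"
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# On the real machine one would run:  nvcc --version | nvcc_release
# Demo on a canned banner line:
printf 'Cuda compilation tools, release 11.0, V11.0.194\n' | nvcc_release
# prints 11.0
```

The number it prints is what would go into --with-cuda-runtime=... (here 11.0).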
[QE-users] [QE-GPU] compiling q-e-gpu-qe-gpu-6.7
Dear all,

I am trying to compile the 6.7 version of the code using PGI 2020. I followed these steps:

*1) NVIDIA driver (NVIDIA-Linux-x86_64-450.80.02.rpm) is installed.*

*the output of nvidia-smi:*

Wed Dec 16 09:07:11 2020
NVIDIA-SMI 450.80.02   Driver Version: 450.80.02   CUDA Version: 11.0
GPU 0: TITAN V | Persistence-M: Off | Bus-Id: :06:00.0 | Disp.A: Off | Uncorr. ECC: N/A
       Fan 27% | Temp 37C | Perf P0 | Pwr 32W / 250W | Memory 0MiB / 12066MiB | GPU-Util 0% | Compute M. Default
GPU 1: TITAN V | Persistence-M: Off | Bus-Id: :07:00.0 | Disp.A: Off | Uncorr. ECC: N/A
       Fan 25% | Temp 37C | Perf P0 | Pwr 35W / 250W | Memory 0MiB / 12066MiB | GPU-Util 0% | Compute M. Default
Processes: No running processes found

*The output of pgaccelinfo:*

CUDA Driver Version: 11000
NVRM version: NVIDIA UNIX x86_64 Kernel Module 450.80.02 Wed Sep 23 01:13:39 UTC 2020

Device Number: 0
Device Name: TITAN V
Device Revision Number: 7.0
Global Memory Size: 12652838912
Number of Multiprocessors: 80
Concurrent Copy and Execution: Yes
Total Constant Memory: 65536
Total Shared Memory per Block: 49152
Registers per Block: 65536
Warp Size: 32
Maximum Threads per Block: 1024
Maximum Block Dimensions: 1024, 1024, 64
Maximum Grid Dimensions: 2147483647 x 65535 x 65535
Maximum Memory Pitch: 2147483647B
Texture Alignment: 512B
Clock Rate: 1455 MHz
Execution Timeout: No
Integrated Device: No
Can Map Host Memory: Yes
Compute Mode: default
Concurrent Kernels: Yes
ECC Enabled: No
Memory Clock Rate: 850 MHz
Memory Bus Width: 3072 bits
L2 Cache Size: 4718592 bytes
Max Threads Per SMP: 2048
Async Engines: 7
Unified Addressing: Yes
Managed Memory: Yes
Concurrent Managed Memory: Yes
Preemption Supported: Yes
Cooperative Launch: Yes
Multi-Device: Yes
Default Target: cc70

Device Number: 1
Device Name: TITAN V
[identical properties to device 0]

*2) PGI compiler is installed:*

*yum install nvhpc-20-11-20.11-1.x86_64.rpm nvhpc-2020-20.11-1.x86_64.rpm*

*PATHs that are set in the ~/.bashrc file:*

export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda/11.1/bin:$PATH
export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda/11.1/include:$PATH
export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda/11.1/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda/11.1/extras/CUP