Hello,

We are writing a section on GPUs for the documentation, but until it is ready you can use the notes below.
There are two ways to take advantage of GPUs (enabled only for the solver stage, which typically takes up most of the time):

* Using the ELPA library and its native interface in Siesta (this method is available for Siesta versions 4.1.5 and up).

* Using the ELSI library (for Siesta "MaX" versions; see the Guide to Siesta Versions at https://gitlab.com/siesta-project/siesta/-/wikis/Guide-to-Siesta-versions).

In both cases the special installation instructions involve only enabling GPU support in either ELPA or ELSI, and using the proper options in Siesta.

For the first method, the fdf options to enable GPUs are (example):

  diag-algorithm elpa-2
  diag-elpa-usegpu T
  diag-blocksize 16           # Optional
  number-of-eigenstates 17320
  use-tree-timer T

For the second (ELSI) method:

  solution-method elsi
  elsi-solver elpa
  elsi-elpa-gpu 1
  elsi-elpa-flavor 2          # Optional
  number-of-eigenstates 17320
  use-tree-timer T
  elsi-output-level 3

The installation of ELPA and ELSI with GPU support is system-specific, but you can get inspiration from the following examples.

* ELPA (on Marconi-100 at CINECA, with IBM Power9 chips and NVIDIA V100 GPUs, using the gcc compiler). The configure script:

=============================================================================
#!/bin/sh
# (Need to define properly the symbols used below)
# Note that the Power9 does not use the typical Intel kernels
FC=mpifort CC=mpicc CXX=mpic++ \
CFLAGS="-O3 -mcpu=native -std=c++11" \
FCFLAGS="-O3 -mcpu=native -ffree-line-length-none" \
LDFLAGS="${SCALAPACK_LIBS} ${LAPACK_LIBS}" \
../configure \
  --with-cuda-path=${CUDA_HOME} \
  --with-cuda-sdk-path=${CUDA_HOME} \
  --enable-nvidia-gpu --with-NVIDIA-GPU-compute-capability=sm_70 \
  --enable-NVIDIA-gpu-memory-debug --enable-nvtx \
  --disable-sse-assembly --disable-sse --disable-avx --disable-avx2 --disable-avx512 \
  --enable-c-tests=no \
  --prefix=$PRJ/bin/gcc/elpa/2021.05.002.jul22
=============================================================================

(Adapt the options to your system.)

* ELSI (CMake initial-cache file):

=============================================================================
SET(CMAKE_INSTALL_PREFIX "$ENV{BASE_DIR}/elsi/2.6.2" CACHE STRING "Installation dir")
SET(CMAKE_Fortran_COMPILER "mpif90" CACHE STRING "MPI Fortran compiler")
SET(CMAKE_C_COMPILER "mpicc" CACHE STRING "MPI C compiler")
SET(CMAKE_CXX_COMPILER "mpicxx" CACHE STRING "MPI C++ compiler")
SET(CMAKE_Fortran_FLAGS "-O2 -g -fbacktrace -fdump-core" CACHE STRING "Fortran flags")
SET(CMAKE_C_FLAGS "-O2 -g -std=c99" CACHE STRING "C flags")
SET(CMAKE_CXX_FLAGS "-O2 -g -std=c++11" CACHE STRING "C++ flags")
# Workaround: specify -std=c++11 in CMAKE_CUDA_FLAGS to avoid a __ieee128 gcc/cuda bug
SET(CMAKE_CUDA_FLAGS "-O3 -arch=sm_70 -std=c++11" CACHE STRING "CUDA flags")
SET(USE_GPU_CUDA ON CACHE BOOL "Use CUDA-based GPU acceleration in ELPA")
SET(ENABLE_PEXSI ON CACHE BOOL "Enable PEXSI")
SET(ENABLE_TESTS ON CACHE BOOL "Enable tests")
#SET(ADD_UNDERSCORE OFF CACHE BOOL "Do not suffix C functions with an underscore")
SET(LIB_PATHS "/cineca/prod/opt/libraries/lapack/3.9.0/gnu--8.4.0/lib;/cineca/prod/opt/libraries/scalapack/2.1.0/spectrum_mpi--10.3.1--binary/lib;/cineca/prod/opt/compilers/cuda/11.0/none/lib64;/cineca/prod/opt/libraries/essl/6.2.1/binary/lib64" CACHE STRING "External library paths")
SET(LIBS "scalapack;lapack;essl;cublas;cudart" CACHE STRING "External libraries")
=============================================================================

Modify the library locations and version numbers as appropriate for your system.
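A note on using the ELSI settings above: they form a CMake initial-cache file, which is passed to cmake with its -C option. A minimal sketch of the build sequence, assuming the settings are saved as gpu_toolchain.cmake in the ELSI source directory (the file and directory names here are hypothetical):

=============================================================================
#!/bin/bash
# Out-of-tree CMake build of ELSI, seeded with the cache file above
cd elsi-2.6.2
mkdir build && cd build
cmake -C ../gpu_toolchain.cmake ..
make -j 8
make install   # installs under the CMAKE_INSTALL_PREFIX set in the cache file
=============================================================================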
Finally, a note about the importance of the proper execution incantation for "pinning" the MPI ranks to the appropriate GPU. (There are probably better and more streamlined ways to do this.) For this example I use the 32 cores (2x16) in Marconi-100 for MPI tasks, with no OpenMP, and do not take advantage of the 4x hyperthreading.

The slurm script I typically use is the following (gcc_env et al. are my own Lmod modules):

=============================================================================
#!/bin/bash
#SBATCH --job-name=test-covid
#SBATCH --account=Pra19_MaX_1
#SBATCH --partition=m100_usr_prod
#SBATCH --output=mpi_%j.out
#SBATCH --error=mpi_%j.err
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=32
#SBATCH --ntasks-per-socket=16
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:4
#SBATCH --time=00:19:00
#
ml purge
ml gcc_env
ml siesta-max/1.0-14
#
date
which siesta
echo "-------------------"
#
export OMP_NUM_THREADS=1
#
mpirun --map-by socket:PE=1 --rank-by core --report-bindings \
       -np ${SLURM_NTASKS} ./gpu_bind.sh \
       siesta covid.fdf
=============================================================================

The crucial part is the gpu_bind.sh script, which makes sure that each socket talks to the right GPUs (first socket: GPU0/GPU1; second socket: GPU2/GPU3) and that, within each socket, the first 8 tasks use GPU0 (or GPU2) and the second group of 8 tasks use GPU1 (or GPU3). For this, the tasks have to be ordered. (This layout is specific to Marconi-100.) I found that with the --map-by socket:PE=1 --rank-by core incantation I could achieve that ordering.

The contents of gpu_bind.sh are:

====================================================
#!/bin/bash
np_node=$OMPI_COMM_WORLD_LOCAL_SIZE   # MPI tasks on this node
rank=$OMPI_COMM_WORLD_LOCAL_RANK      # rank of this task within the node
block=$(( np_node / 4 ))   # We have 4 GPUs
# If np_node is 32 (typical), then block=8
limit0=$(( block * 1 ))
limit1=$(( block * 2 ))
limit2=$(( block * 3 ))
#-----------------
if [ $rank -lt $limit0 ]
then
   export CUDA_VISIBLE_DEVICES=0
elif [ $rank -lt $limit1 ]
then
   export CUDA_VISIBLE_DEVICES=1
elif [ $rank -lt $limit2 ]
then
   export CUDA_VISIBLE_DEVICES=2
else
   export CUDA_VISIBLE_DEVICES=3
fi
# Launch the actual command (quoted to preserve its arguments)
exec "$@"
====================================================
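As an aside, if you launch with srun instead of mpirun, the same device selection can be derived from Slurm's own variables. A minimal sketch, assuming --ntasks-per-node is set in the job script (so that SLURM_NTASKS_PER_NODE is defined); note that reproducing the socket-ordered placement would then rely on srun's own binding options rather than the mpirun incantation above:

====================================================
#!/bin/bash
# Hypothetical srun variant of gpu_bind.sh: srun exports
# SLURM_LOCALID (the task index within the node), so the
# OpenMPI-specific variables are not needed.
np_node=${SLURM_NTASKS_PER_NODE}
rank=${SLURM_LOCALID}
block=$(( np_node / 4 ))   # 4 GPUs per node
# Integer division reproduces the if/elif ladder above:
# ranks 0..block-1 -> GPU0, block..2*block-1 -> GPU1, etc.
export CUDA_VISIBLE_DEVICES=$(( rank / block ))
exec "$@"
====================================================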
I hope this helps.

Best regards,
Alberto

-----
On June 28, 2022, at 10:28, Mohammed Ghadiyali <mohammed.ghadiy...@kaust.edu.sa> wrote:

| Hello,
|
| I have gone through the Q&A available on the max-centre website, and according
| to it Siesta can use GPUs. However, I am not able to find any documentation on
| the installation, so could someone tell me the procedure for installing Siesta
| with GPU support? Our systems have 8x V100 (32 GB each) with NVLink.
|
| Regards,
| Ghadiyali Mohammed Kader
| Post Doctoral Fellow
| King Abdullah University of Science and Technology

--
SIESTA is supported by the Spanish Research Agency (AEI) and by the European H2020 MaX Centre of Excellence (http://www.max-centre.eu/)