Dear QE users & developers,

We are happy to announce that the first beta GPU-enabled release of Quantum ESPRESSO (QE) was committed today to the official repository.
You can download the new version of the code using the following command:

    $ svn checkout svn://scm.qe-forge.org/scmrepos/svn/q-e/branches/espresso-PRACE

The Irish Centre for High-End Computing (ICHEC, http://www.ichec.ie) has been mainly responsible for extending the QE suite to accelerate calculations on NVIDIA GPUs. The porting activity has been supported within the PRACE 1st Implementation Phase project. It is currently carried out through the Sub-task "Accelerator", led by ICHEC, within the Work-Package "Programming Techniques for High-Performance Applications", in collaboration with CINECA.

The porting activity is concerned mainly with the PWscf package. However, ICHEC and the Irish QE user community are interested in exploring any other initiatives from QE users or developers who would like to port other codes of the QE suite to GPGPU architectures.

We have successfully accelerated the linear algebra part of the QE suite using a library called phiGEMM, some explicit computational kernels (newd, addusdense, vloc_psi) and the 3D FFT for the single CPU/GPU version. Both the accelerated linear algebra (matrix multiplication) and the accelerated FFT make use of CUDA libraries. The port is mainly based on wrappers that permit the use of libraries for accelerators.

The distributed 3D FFT version is still in progress, since porting it requires significant changes to the current structure of the code and to the data distribution. When run as a parallel, distributed multi-GPU job, the code still uses the original 3D FFT implementation.

The phiGEMM library is distributed as an independent open-source external package together with the Quantum ESPRESSO suite. It performs matrix multiplication ([SDZ]GEMM) by taking advantage of the underlying BLAS kernel functions on both the CPU and NVIDIA CUDA-based GPUs, mixing and overlapping computation between the host (CPU) and the accelerator (GPU).
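To illustrate the idea of sharing one matrix multiplication between host and accelerator, here is a minimal sketch in plain Python. It does not call phiGEMM, BLAS or CUDA: two threads stand in for the GPU and the CPU, and the function names, the row-wise split and the `gpu_fraction` value are all assumptions made for illustration only, not a description of how phiGEMM actually partitions the work.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(a, b):
    """Plain product of a (m x k) and b (k x n), both given as lists of rows."""
    k = len(b)
    n = len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def split_gemm(a, b, gpu_fraction=0.75):
    """Compute a @ b in two pieces that are submitted to run side by side.

    The first `gpu_fraction` of the rows of `a` goes to one worker (standing
    in for the GPU), the remaining rows to another (standing in for the CPU).
    The split ratio is a hypothetical tuning parameter.
    """
    cut = int(len(a) * gpu_fraction)
    with ThreadPoolExecutor(max_workers=2) as pool:
        gpu_part = pool.submit(matmul, a[:cut], b)  # "accelerator" share
        cpu_part = pool.submit(matmul, a[cut:], b)  # "host" share
        # Row blocks of the result are simply stacked in the original order.
        return gpu_part.result() + cpu_part.result()


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(split_gemm(a, b))
```

In a real hybrid run the split ratio would presumably be chosen so that both devices finish at about the same time; here it only decides how many rows each thread receives.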
Any code that makes intensive use of GEMM can gain a significant advantage from linking this library when running on a hybrid CPU/GPU system.

Even though the 3D FFT is accelerated only for a single CPU process (not when using MPI), other parts are already able to take advantage of a distributed parallel hybrid system. All of this allows PWscf to potentially combine distributed message-passing parallelization (MPI), multi-threading (OpenMP) and accelerators (NVIDIA GPUs), and to obtain good performance improvements on the latest NVIDIA GPUs (high-performance double precision is needed).

This porting activity is still in progress, especially the parallel 3D FFT component, which represents a bottleneck for large calculations.

We have been testing this beta release using some small/medium benchmarks from the official DEISA benchmark suite on several kinds of GPU hardware (Tesla and Fermi architectures). Special thanks go to both E4 Computer Engineering and CEA for providing access to hybrid GPU systems with configurations different from those available at ICHEC.

We look forward with interest to receiving suggestions for improvement, feedback or requests for collaboration from anyone interested in trying and validating the PWscf CUDA version on different platforms and with different scientific cases.

For additional information please contact qe-gpu at ichec.ie or reply to this email. A dedicated forum, q-e-gpgpu at qe-forge.org <http://qe-forge.org/mail/?group_id=10>, will be available shortly. Please use that list for bug reports and any other issues related to the use of the PWscf GPU version.

Although compilation of the GPU implementation is fairly straightforward, we kindly suggest that users carefully read the included README.GPU. The intrinsic characteristics of hybrid multi- and many-core systems require careful consideration to best exploit the available computing power.

Any and all suggestions are welcome and will be very much appreciated.
Ivan Girotto & Filippo Spiga

---
ICHEC GPU developer team
The Tower - 7th floor
Trinity Technology & Enterprise Campus
Grand Canal Quay - Dublin 2 - Ireland
+353-1-5241608 (ph) / +353-1-7645845 (fax)
