Dear GROMACS users,

Using a machine with the configuration below and the command below, I am simulating a system of 479K atoms (mainly water) on CPU+GPU. The performance is around 1 ns per hour. Based on the information and the log file shared below, I would greatly appreciate any comments on how to change the submission command to improve performance by making better use of the GPUs and CPUs.
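As a rough sanity check on the layout requested in the job script that follows, here is a minimal Python sketch that tallies the OpenMP threads asked for against the logical cores and GPUs that the log header reports. The numbers are copied from this post; the names `Layout` and `tally` are hypothetical helpers, not part of GROMACS. It assumes `-ntomp` applies to the PP ranks and `-ntomp_pme` to the PME ranks, and it says nothing about how `aprun` actually places or pins the threads, which is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Layout:
    """Resources reported in the log header plus the mdrun rank/thread flags."""
    nodes: int
    logical_cores_per_node: int
    gpus_per_node: int
    ranks: int                 # aprun -n
    pme_ranks: int             # -npme
    threads_per_pp_rank: int   # -ntomp
    threads_per_pme_rank: int  # -ntomp_pme


def tally(layout: Layout) -> dict:
    """Compare requested OpenMP threads with the available logical cores."""
    pp_ranks = layout.ranks - layout.pme_ranks
    threads = (pp_ranks * layout.threads_per_pp_rank
               + layout.pme_ranks * layout.threads_per_pme_rank)
    logical_cores = layout.nodes * layout.logical_cores_per_node
    gpus = layout.nodes * layout.gpus_per_node
    return {
        "pp_ranks": pp_ranks,
        "threads_requested": threads,
        "logical_cores": logical_cores,
        "threads_per_logical_core": threads / logical_cores,
        "pp_ranks_per_gpu": pp_ranks / gpus,
    }


# Values taken from the job script and hardware report in this post.
REPORTED = Layout(
    nodes=4,
    logical_cores_per_node=44,
    gpus_per_node=1,
    ranks=88,
    pme_ranks=44,
    threads_per_pp_rank=4,
    threads_per_pme_rank=6,
)

if __name__ == "__main__":
    for key, value in tally(REPORTED).items():
        print(f"{key}: {value}")
```

With the values from this post, the tally comes to 440 requested threads against 176 logical cores, and 11 PP ranks sharing each GPU. Whether that is what actually runs depends on the launcher's placement, so it is only a starting point for reading the log.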
%------------------------------------------------
#PBS -l select=4:ncpus=22:mpiprocs=22:ngpus=1
export OMP_NUM_THREADS=4
aprun -n 88 gmx_mpi mdrun -deffnm out -s out.tpr -g out.log -v -dlb yes -gcom 1 -nb gpu -npme 44 -ntomp 4 -ntomp_pme 6 -tunepme yes

Running on 4 nodes with total 88 cores, 176 logical cores, 4 compatible GPUs
  Cores per node:           22
  Logical cores per node:   44
  Compatible GPUs per node:  1
  All nodes have identical type(s) of GPUs
%------------------------------------------------
GROMACS version:    2018.1
Precision:          single
Memory model:       64 bit
MPI library:        MPI
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support:        CUDA
SIMD instructions:  AVX2_256
FFT library:        commercial-fftw-3.3.6-pl1-fma-sse2-avx-avx2-avx2_128
RDTSCP usage:       enabled
TNG support:        enabled
Hwloc support:      hwloc-1.11.0
Tracing support:    disabled
Built on:           2018-09-12 20:34:33
Built by:           xxxx
Build OS/arch:      Linux 3.12.61-52.111-default x86_64
Build CPU vendor:   Intel
Build CPU brand:    Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
Build CPU family:   6   Model: 79   Stepping: 1
Build CPU features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma hle htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
C compiler:         /opt/cray/pe/craype/2.5.13/bin/cc GNU 5.3.0
C compiler flags:   -march=core-avx2 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler:       /opt/cray/pe/craype/2.5.13/bin/CC GNU 5.3.0
C++ compiler flags: -march=core-avx2 -std=c++11 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler:      /opt/nvidia/cudatoolkit8.0/8.0.61_2.3.13_g32c34f9-2.1/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2016 NVIDIA Corporation;Built on Tue_Jan_10_13:22:03_CST_2017;Cuda compilation tools, release 8.0, V8.0.61
CUDA compiler flags:-gencode;arch=compute_60,code=sm_60;-use_fast_math;-Wno-deprecated-gpu-targets;;; ;-march=core-avx2;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
CUDA driver:        9.20
CUDA runtime:       8.0
%-------------------------------------------------
Log file: https://drive.google.com/open?id=1-myQ5rP85UWKb1262QDPa6kYhuzHPzLu

Thank you,
Alex
--
Gromacs Users mailing list

* Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to [email protected].
