Re: [gmx-users] 2019.2 not using all available cores
I've discovered an option that causes 2019.2 to use all of the cores correctly. Use "-pin on" and it works as expected, using all 12 cores, with CPU load shown as appropriate (up to 68% total CPU utilisation). Use "-pin auto", which is the default, or "-pin off", and it will only use a single core (maximum of 8% total CPU utilisation).

Catch ya,

Dr. Dallas Warren
Drug Delivery, Disposition and Dynamics
Monash Institute of Pharmaceutical Sciences, Monash University
381 Royal Parade, Parkville VIC 3052
dallas.war...@monash.edu
- When the only tool you own is a hammer, every problem begins to resemble a nail.

On Thu, 9 May 2019 at 07:54, Dallas Warren wrote:
> gmx 2019.2 compiled using threads only uses a single core, mdrun_mpi
> compiled using MPI only uses a single core, and gmx 2016.3 using
> threads uses all 12 cores.
>
> For compiling the thread version of 2019.2, used:
> cmake .. -DGMX_GPU=ON
> -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs/gromacs-2019.2
>
> For compiling the MPI version of 2019.2, used:
> cmake .. -DGMX_MPI=ON -DBUILD_SHARED_LIBS=OFF -DGMX_GPU=ON
> -DCMAKE_CXX_COMPILER=/usr/lib64/mpi/gcc/openmpi/bin/mpiCC
> -DCMAKE_C_COMPILER=/usr/lib64/mpi/gcc/openmpi/bin/mpicc
> -DGMX_BUILD_MDRUN_ONLY=ON
> -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs/gromacs-2019.2
>
> Between building both of those, deleted the build directory.
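For anyone hitting the same symptom, a sketch of the workaround invocation. The -deffnm run name and the thread counts are illustrative only, sized for the 12-thread machine described below; -pin, -ntmpi and -ntomp are standard mdrun options, so adjust the numbers to your own hardware:

```shell
# Thread-MPI build: request pinning explicitly instead of relying on "-pin auto",
# which was leaving the run on a single core here.
gmx mdrun -deffnm md -pin on -ntmpi 1 -ntomp 12

# MPI build: ranks come from mpirun; pinning is still requested via mdrun.
mpirun -np 12 mdrun_mpi -deffnm md -pin on
```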
[gmx-users] 2019.2 not using all available cores
gmx 2019.2 compiled using threads only uses a single core, mdrun_mpi compiled using MPI only uses a single core, and gmx 2016.3 using threads uses all 12 cores.

For compiling the thread version of 2019.2, used:
cmake .. -DGMX_GPU=ON -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs/gromacs-2019.2

For compiling the MPI version of 2019.2, used:
cmake .. -DGMX_MPI=ON -DBUILD_SHARED_LIBS=OFF -DGMX_GPU=ON -DCMAKE_CXX_COMPILER=/usr/lib64/mpi/gcc/openmpi/bin/mpiCC -DCMAKE_C_COMPILER=/usr/lib64/mpi/gcc/openmpi/bin/mpicc -DGMX_BUILD_MDRUN_ONLY=ON -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs/gromacs-2019.2

Between building both of those, deleted the build directory.

GROMACS:      gmx, version 2019.2
Executable:   /usr/local/gromacs/gromacs-2019.2/bin/gmx
Data prefix:  /usr/local/gromacs/gromacs-2019.2
Working dir:  /home/dallas/experiments/current/19-064/P6DLO
Command line:
  gmx -version

GROMACS version:    2019.2
Precision:          single
Memory model:       64 bit
MPI library:        thread_mpi
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support:        CUDA
SIMD instructions:  AVX_256
FFT library:        fftw-3.3.8-sse2
RDTSCP usage:       enabled
TNG support:        enabled
Hwloc support:      disabled
Tracing support:    disabled
C compiler:         /usr/bin/cc GNU 7.4.0
C compiler flags:   -mavx -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler:       /usr/bin/c++ GNU 7.4.0
C++ compiler flags: -mavx -std=c++11 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler:      /usr/local/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver; Copyright (c) 2005-2019 NVIDIA Corporation; Built on Fri_Feb__8_19:08:17_PST_2019; Cuda compilation tools, release 10.1, V10.1.105
CUDA compiler flags: -gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;-D_FORCE_INLINES;;-mavx;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
CUDA driver:        10.10
CUDA runtime:       10.10

GROMACS:      mdrun_mpi, version 2019.2
Executable:   /usr/local/gromacs/gromacs-2019.2/bin/mdrun_mpi
Data prefix:  /usr/local/gromacs/gromacs-2019.2
Working dir:  /home/dallas/experiments/current/19-064/P6DLO
Command line:
  mdrun_mpi -version

GROMACS version:    2019.2
Precision:          single
Memory model:       64 bit
MPI library:        MPI
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support:        CUDA
SIMD instructions:  AVX_256
FFT library:        fftw-3.3.8-sse2
RDTSCP usage:       enabled
TNG support:        enabled
Hwloc support:      disabled
Tracing support:    disabled
C compiler:         /usr/lib64/mpi/gcc/openmpi/bin/mpicc GNU 7.4.0
C compiler flags:   -mavx -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler:       /usr/lib64/mpi/gcc/openmpi/bin/mpiCC GNU 7.4.0
C++ compiler flags: -mavx -std=c++11 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler:      /usr/local/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver; Copyright (c) 2005-2019 NVIDIA Corporation; Built on Fri_Feb__8_19:08:17_PST_2019; Cuda compilation tools, release 10.1, V10.1.105
CUDA compiler flags: -gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;-D_FORCE_INLINES;;-mavx;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
CUDA driver:        10.10
CUDA runtime:       10.10

/usr/local/gromacs/gromacs-2016.3/bin/gmx -version

GROMACS:      gmx, version 2016.3
Executable:   /usr/local/gromacs/gromacs-2016.3/bin/gmx
Data prefix:  /usr/local/gromacs/gromacs-2016.3
Working dir:  /home/dallas/experiments/current/19-064/P6DLO
Command line:
  gmx -version

GROMACS version:    2016.3
Precision:          single
Memory model:       64 bit
MPI library:        thread_mpi
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 32)
GPU support:        CUDA
SIMD instructions:  AVX_256
FFT library:        fftw-3.3.8-sse2
RDTSCP usage:       enabled
TNG support:        enabled
Hwloc support:      disabled
Tracing support:    disabled
Built on:           Tue Mar 21 13:21:15 AEDT 2017
Built by:           dallas@morph [CMAKE]
Build OS/arch:      Linux 4.4.49-16-default x86_64
Build CPU vendor:   Intel
Build CPU brand:    Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
Build CPU family:   6   Model: 45   Stepping: 7
Build CPU
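A quick way to confirm whether a running mdrun is actually spread across cores, rather than trusting the overall CPU percentage. This is a diagnostic sketch, not part of the original report: it assumes util-linux taskset, procps top, and pgrep are available, and that the process name matches one of the binaries above:

```shell
# Print the CPU affinity list of the newest gmx/mdrun_mpi process.
# A run pinned across the whole machine should show a range like 0-11,
# not a single CPU number.
taskset -cp "$(pgrep -n -f 'gmx|mdrun_mpi')"

# One batch-mode snapshot of per-thread CPU placement; with threads busy on
# all cores, the summed %CPU across threads approaches 1200 on this box.
top -b -n 1 -H -p "$(pgrep -n -f 'gmx|mdrun_mpi')"
```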