[gmx-users] Thread-MPI error in GROMACS-2018
Hello,

I have come across an error that causes GROMACS (2018/2018.1) to crash. The message is:

"tMPI error: Receive buffer size too small for transmission (in valid comm) Aborted"

The error seems to occur only immediately following a LINCS or SETTLE warning, and it is reproducible across different systems. A simple example system is an energy minimization of a box of 1000 rigid TIP4P/Ice water molecules generated with gmx solvate. When SETTLE is used as the constraint algorithm, there are several SETTLE warnings in the early steps of the energy minimization, and GROMACS crashes with the above error message. If I replace SETTLE with LINCS, GROMACS crashes with the same error message following a LINCS warning. Other systems that have produced this error are -OH-terminated self-assembled monolayer surfaces (h-bonds constrained by LINCS) and mica surfaces (h-bonds constrained by LINCS). Naturally, reducing -ntmpi to 1 eliminates the error in all cases.

The problem does appear to be hardware dependent. Specifically, the tested node(s) on the cluster contain K20/K40 GPUs with Intel Xeon E5-2680v3 processors (20/24 cores). I used GCC 5.4.0 and CUDA 8.0.44 for installing GROMACS. An installation on my desktop machine with very similar options does not show the thread-MPI error.

Example of a procedure that causes the error:

# Node contains 24 cores and 2 K40 GPUs
gmx solvate -cs tip4p -o box.gro -box 3.2 3.2 3.2 -maxsol 1000
gmx grompp -f em.mdp -c box.gro -p tip4pice.top -o em
export OMP_NUM_THREADS=6
gmx mdrun -v -deffnm em -ntmpi 4 -ntomp 6 -pin on

Attached are the relevant topology (tip4pice.top), mdp (em.mdp), and log (em.log) files.

Thanks in advance for any ideas as to what might be causing this problem,
Siva Dasetty

tip4pice.top <https://drive.google.com/a/g.clemson.edu/file/d/13e_rxBMNaizR1GvCVklc3IlQ1gcCTat_/view?usp=drive_web>
em.mdp <https://drive.google.com/a/g.clemson.edu/file/d/1A1592_cB7jfwdBOcezkfsFnT4xPqksNB/view?usp=drive_web>
em.log <https://drive.google.com/a/g.clemson.edu/file/d/15iU3364SwxpEx3popQf7elZ9_3p2r6v9/view?usp=drive_web>
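PS: for anyone hitting the same crash, the only workaround I have so far is the one implied above, i.e. running a single thread-MPI rank and giving it all the cores of the node. A sketch for the 24-core node used here (adjust -ntomp to your own hardware; note that with a single rank you will likely use only one of the two GPUs):

export OMP_NUM_THREADS=24
gmx mdrun -v -deffnm em -ntmpi 1 -ntomp 24 -pin on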
Re: [gmx-users] Affinity setting for 1/16 threads failed. Version 5.0.2
Gordon:

Command line: mdrun_mpi --version

Gromacs version:    VERSION 5.0.2
Precision:          single
Memory model:       64 bit
MPI library:        MPI
OpenMP support:     enabled
GPU support:        disabled
invsqrt routine:    gmx_software_invsqrt(x)
SIMD instructions:  AVX_256
FFT library:        fftw-3.3.3-sse2
RDTSCP usage:       enabled
C++11 compilation:  disabled
TNG support:        enabled
Tracing support:    disabled
Built on:           Sun Oct 19 14:11:10 PDT 2014
Built by:           r...@gcn-20-88.sdsc.edu [CMAKE]
Build OS/arch:      Linux 2.6.32-431.29.2.el6.x86_64 x86_64
Build CPU vendor:   GenuineIntel
Build CPU brand:    Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
Build CPU family:   6   Model: 45   Stepping: 6
Build CPU features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
C compiler:         /opt/mvapich2/intel/ib/bin/mpicc Intel 13.0.0.20121010
C compiler flags:   -mavx -std=gnu99 -w3 -wd111 -wd177 -wd181 -wd193 -wd271 -wd304 -wd383 -wd424 -wd444 -wd522 -wd593 -wd869 -wd981 -wd1418 -wd1419 -wd1572 -wd1599 -wd2259 -wd2415 -wd2547 -wd2557 -wd3280 -wd3346 -wd11074 -wd11076 -O3 -DNDEBUG -ip -funroll-all-loops -alias-const -ansi-alias
C++ compiler:       /opt/mvapich2/intel/ib/bin/mpicxx Intel 13.0.0.20121010
C++ compiler flags: -mavx -w3 -wd111 -wd177 -wd181 -wd193 -wd271 -wd304 -wd383 -wd424 -wd444 -wd522 -wd593 -wd869 -wd981 -wd1418 -wd1419 -wd1572 -wd1599 -wd2259 -wd2415 -wd2547 -wd2557 -wd3280 -wd3346 -wd11074 -wd11076 -wd1782 -wd2282 -O3 -DNDEBUG -ip -funroll-all-loops -alias-const -ansi-alias
Boost version:      1.55.0 (internal)

Our cluster:

Command line: mdrun --version

Gromacs version:    VERSION 5.0.2
Precision:          single
Memory model:       64 bit
MPI library:        MPI
OpenMP support:     enabled
GPU support:        disabled
invsqrt routine:    gmx_software_invsqrt(x)
SIMD instructions:  SSE4.1
FFT library:        fftw-3.3.3-sse2
RDTSCP usage:       enabled
C++11 compilation:  enabled
TNG support:        enabled
Tracing support:    disabled
Built on:           Sun Mar 8 19:06:35 EDT 2015
Built by:           sdasett@user001 [CMAKE]
Build OS/arch:      Linux 2.6.32-358.23.2.el6.x86_64 x86_64
Build CPU vendor:   GenuineIntel
Build CPU brand:    Intel(R) Xeon(R) CPU X7542 @ 2.67GHz
Build CPU family:   6   Model: 46   Stepping: 6
Build CPU features: apic clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc pdcm popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 x2apic
C compiler:         /software/openmpi/1.8.1_gcc/bin/mpicc GNU 4.8.1
C compiler flags:   -msse4.1 -Wno-maybe-uninitialized -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wall -Wno-unused -Wunused-value -Wunused-parameter -O3 -DNDEBUG -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds
C++ compiler:       /software/openmpi/1.8.1_gcc/bin/mpicxx GNU 4.8.1
C++ compiler flags: -msse4.1 -std=c++0x -Wextra -Wno-missing-field-initializers -Wpointer-arith -Wall -Wno-unused-function -O3 -DNDEBUG -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds
Boost version:      1.55.0 (internal)

Thanks,

On Wed, Nov 18, 2015 at 11:05 AM, Szilárd Páll <pall.szil...@gmail.com> wrote:
> I don't see much similarity - except the type of error - between your issue and the one reported on redmine 1184. In that case Intel Harpertown (I assume Gordon is much newer) with thread-MPI was used.
>
> --
> Szilárd
>
> On Wed, Nov 18, 2015 at 4:45 PM, Siva Dasetty <sdas...@g.clemson.edu> wrote:
> > Thank you Mark. Yes, I have already taken this issue to Gordon, and while I am awaiting their response I am wondering whether this issue has anything to do with bug #1184: http://redmine.gromacs.org/issues/1184.
> >
> > Thank you again for your quick response.
> >
> > On Wed, Nov 18, 2015 at 10:31 AM, Mark Abraham <mark.j.abra...@gmail.com> wrote:
> > > Hi,
> > >
> > > These are good issues to take up with the support staff of Gordon. mdrun tries to be a good citizen and by default stays out of the way if some other part of the software stack is already managing process affinity. As you can see, doing it right is crucial for good performance. But mdrun -pin on always works everywhere we know about.
> > >
> > > Mark
> > >
> > > On Wed, Nov 18, 2015 at 3:28 PM Siva Dasetty <sdas...@g.clemson.edu> wrote:
> > > > Dear all,
> > > >
> > > > I am running simulations using version 5.0.2 (default in gordon)
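PS: in the meantime, based on Mark's suggestion above, the kind of launch line I am testing on Gordon looks roughly like this (the rank count is only an example for a single node; mdrun_mpi is the binary name from the version output above, and the output name is a placeholder):

mpirun -np 16 mdrun_mpi -deffnm md -pin on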
[gmx-users] Query regarding exclusions and Coulomb-SR energies
Dear all,

As the Verlet cut-off scheme does not support the energygrp-excl parameter yet, we are looking for an alternative way to exclude all non-bonded interactions between the frozen atoms. The GROMACS manual (section 5.4) states that all non-bonded interactions between the pairs listed in the [ exclusions ] section of the topology file will be excluded.

To test this, we considered a small system of 13 charged atoms (OPLS-AA force field) and ran an NVE simulation in which all atoms are frozen. We ran the simulation with and without the [ exclusions ] section in the topology file. We found that all LJ-SR interactions are indeed 0, but the Coulomb-SR interactions are not exactly zero. Note that we are not using PME for the Coulomb type and there is no solvent.

To understand the contribution of each atom to Coulomb-SR, we created energy groups for some of the atoms and ran the simulations on CPUs only (-nb cpu). Below are the average energies obtained with g_energy, with and without exclusions. Since the Verlet cut-off scheme only supports full pbc or pbc=xy, we are guessing that the energies reported as Coul/LJ-SR O1-O1 are the energies of O1 with its periodic image (there is only one O1 atom in the system). Is this right? Note that the exclusions section also contains identical pairs (O1 O1).

Energy               Average (with exclusions)   Average (without exclusions)
------------------------------------------------------------------------------
LJ (SR)                      0                        863710
Coulomb (SR)                -7.63E-06                   -619.943
Coul-SR:O1-O1              -10.6078                      -10.6078
LJ-SR:O1-O1                  0                              0
Coul-SR:O1-O2              -36.58                         106.48
LJ-SR:O1-O2                  0                             11.4136
Coul-SR:O1-H1                7.51731                      -22.0658
LJ-SR:O1-H1                  0                             72.0831
Coul-SR:O1-rest             50.2783                      -191.784
LJ-SR:O1-rest                0                          35631.6
Coul-SR:O2-O2              -31.5359                      -31.5359
LJ-SR:O2-O2                  0                              0
Coul-SR:O2-H1               12.9614                       -45.7467
LJ-SR:O2-H1                  0                           3023.75
Coul-SR:O2-rest             86.6905                      -555.719
LJ-SR:O2-rest                0                          43085.1
Coul-SR:H1-H1               -1.33181                        0.0595622
LJ-SR:H1-H1                  0                             -0.00987022
Coul-SR:H1-rest            -17.8152                        45.9682
LJ-SR:H1-rest                0                         281888
Coul-SR:rest-rest          -59.577                         85.0095
LJ-SR:rest-rest              0                             48

The other question is: why are the contributions of some of the pairs to Coulomb-SR not zero? Can we not completely eliminate the Coulomb-SR energies?

Below is a copy of the mdp file used:

dt                       = 0.002
nsteps                   = 50
nstcomm                  = 100
comm-grps                = System
nstlist                  = 10
ns_type                  = grid
nstxout                  = 0
nstvout                  = 0
nstfout                  = 0
nstlog                   = 2500
nstenergy                = 2500
nstxout-compressed       = 2500
compressed-x-grps        = System
cutoff-scheme            = Verlet
verlet-buffer-tolerance  = -1
pbc                      = xyz
rlist                    = 1.0
coulombtype              = Cut-off
rcoulomb                 = 1.0
rvdw                     = 1.0
vdwtype                  = Cut-off
constraints              = h-bonds
constraint_algorithm     = lincs
freezegrps               = GRO
freezedim                = Y Y Y
energygrps               = O1 O2 H1

Thanks in advance for your help,

--
Siva Dasetty
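PS: for anyone wanting to reproduce this, our [ exclusions ] section follows the standard topology format: the first column is an atom index and the rest of the line lists the atoms whose non-bonded interactions with it are excluded. A minimal sketch for a hypothetical 4-atom fragment (our actual file lists all 13 atoms on every line, plus the identical pairs such as "1 1" mentioned above):

[ exclusions ]
; ai    atoms excluded from non-bonded interaction with ai
  1     2 3 4
  2     1 3 4
  3     1 2 4
  4     1 2 3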
Re: [gmx-users] Using Gpus on multiple nodes. (Feature #1591)
Thank you Mark for the reply,

We use PBS for submitting jobs on our cluster, and this is how I request the nodes and processors:

#PBS -l select=2:ncpus=8:mem=8gb:mpiprocs=8:ngpus=2:gpu_model=k20:interconnect=fdr

Do you think the problem could be with the way I installed mdrun using Open MPI? Can you please suggest the missing environment settings that I may need to include in the job script in order for MPI to place 2 ranks on each node?

Thank you for your time.

On Tue, Oct 14, 2014 at 5:20 PM, Mark Abraham mark.j.abra...@gmail.com wrote:
> On Tue, Oct 14, 2014 at 10:51 PM, Siva Dasetty sdas...@g.clemson.edu wrote:
> > Dear All,
> >
> > I am currently able to run the simulation on a single node containing 2 gpus, but I get the following fatal error when I try to run the simulation using multiple gpus (2 on each node) on multiple nodes (2 for example) using Open MPI.
>
> Here you say you want 2 ranks on each of two nodes...
>
> > Fatal error:
> > Incorrect launch configuration: mismatching number of PP MPI processes and GPUs per node. mdrun was started with 4 PP MPI processes per node,
>
> ... but here mdrun means what it says...
>
> > but you provided only 2 GPUs.
> >
> > The command I used to run the simulation is
> >
> > mpirun -np 4 mdrun -s tpr file -deffnm ... -gpu_id 01
>
> ... which means your MPI environment (hostfile, job script settings, whatever) doesn't have the settings you think it does, since it's putting all 4 ranks on one node.
>
> Mark
>
> > However, it at least runs if I use the following command,
> >
> > mpirun -np 4 mdrun -s tpr file -deffnm ... -gpu_id 0011
> >
> > But after referring to the following thread, I highly doubt that I am using all 4 gpus available on the 2 nodes combined.
> > https://mailman-1.sys.kth.se/pipermail/gromacs.org_gmx-developers/2014-May/007682.html
> >
> > Thank you for your help in advance,
> >
> > --
> > Siva
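In the meantime, and in case it is useful to anyone else following this thread, I am also testing launch lines that place the ranks on the nodes explicitly rather than relying on the default mapping, roughly along these lines (Open MPI option names, nothing GROMACS-specific; the tpr name is a placeholder):

mpirun -np 4 -npernode 2 mdrun -s topol.tpr -deffnm md -gpu_id 01

so that each node gets 2 PP ranks to match its 2 GPUs.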
[gmx-users] Gromacs Version 5.0.2 - Bug #1603
Dear All,

I am using GPU-enabled GROMACS version 5.0.2 and I am checking whether my simulation is still affected by bug #1603 http://redmine.gromacs.org/issues/1603. Below is the PP-PME load balancing part of my log file:

PP/PME load balancing changed the cut-off and PME settings:
              particle-particle                  PME
               rcoulomb  rlist        grid       spacing   1/beta
   initial     1.000 nm  1.092 nm   108 108 120  0.119 nm  0.320 nm
   final       1.482 nm  1.574 nm    72  72  80  0.178 nm  0.475 nm
   cost-ratio            3.00                    0.30
   (note that these numbers concern only part of the total PP and PME load)

The release notes say: "If, for your simulation, the final rcoulomb value (1.368 here) is different from the initial one (1.000 here), then so was the LJ cutoff for short-ranged interactions, and the model physics was not what you asked for."

Does that mean the initial and final values are supposed to be the same?

Thanks in advance for your help.

--
Siva
Re: [gmx-users] Gromacs Version 5.0.2 - Bug #1603
Thank you Mark for the reply. That clears up my confusion about the values in the PP-PME load balancing part of the log file.

However, I am still a little confused about how to distinguish between a normal-but-not-guaranteed change and a normal change, although I understand that for a normal change we can compare the densities obtained from GPU runs and CPU-only runs to verify the results.

Also, I have a general question about using vdwtype = PME. Is it advisable to use vdwtype = PME for GPU runs?

Thanks.

On Wed, Oct 8, 2014 at 12:42 PM, Mark Abraham mark.j.abra...@gmail.com wrote:
> On Wed, Oct 8, 2014 at 4:29 PM, Siva Dasetty sdas...@g.clemson.edu wrote:
> > Dear All,
> >
> > I am using gpu enabled gromacs version 5.0.2 and I am checking if my simulation is still affected by bug #1603 http://redmine.gromacs.org/issues/1603. Below is the PP-PME load balancing part of my log file
> >
> > PP/PME load balancing changed the cut-off and PME settings:
> >               particle-particle                  PME
> >                rcoulomb  rlist        grid       spacing   1/beta
> >    initial     1.000 nm  1.092 nm   108 108 120  0.119 nm  0.320 nm
> >    final       1.482 nm  1.574 nm    72  72  80  0.178 nm  0.475 nm
> >    cost-ratio            3.00                    0.30
> >    (note that these numbers concern only part of the total PP and PME load)
> >
> > The release notes say: "If, for your simulation, the final rcoulomb value (1.368 here) is different from the initial one (1.000 here), then so was the LJ cutoff for short-ranged interactions, and the model physics was not what you asked for."
> >
> > Does that mean the initial and final values are supposed to be the same?
>
> No. The point of the tuning is to change the rcoulomb value to maximize performance while maintaining the quality of the electrostatic approximation you chose in the .mdp file. If you were using one of the affected versions (5.0 or 5.0.1), then the normal-but-not-guaranteed change of rcoulomb led to an inappropriate change of rvdw, which is why it is a relevant diagnostic for whether an actual simulation from the versions with wrong code was affected in practice. But the normal change of rcoulomb neither confirms nor denies that the bug is fixed in 5.0.2. You could observe that the NPT density in 5.0.2 agrees with a CPU-only (or 4.6.x GPU) calculation, for example.
>
> Mark

--
Siva
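One more note for the archive: when comparing against a CPU-only run as Mark suggests, the PP-PME tuning itself can be taken out of the picture by disabling it, so that rcoulomb stays at the .mdp value (at some performance cost). Something like (the tpr and output names here are placeholders):

mdrun -s topol.tpr -deffnm md -notunepme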
Re: [gmx-users] Commands to run simulations using multiple GPU's in version 5.0.1
Thank you again for the reply.

-ntmpi is for thread-MPI, but I am using Open MPI for MPI because I am planning to use multiple nodes. As I pointed out in case 7 of my post, if I use -ntmpi I get a fatal error that says thread-MPI is requested but GROMACS is not compiled with thread-MPI. So my questions are:

1. Isn't thread-MPI enabled by default?
2. Are thread-MPI and Open MPI mutually incompatible?

In any case, if I use mpirun -np 2 instead of -ntmpi, I still cannot use -ntomp, because GROMACS then automatically picks up the environment setting for the number of OpenMP threads, which is equal to the number of hardware threads available, and this resulted in case 4 (please check above) of my post.

Is there any other command, similar to the one you posted above, that I can use with Open MPI? Because it looks to me like thread-MPI and Open MPI are not compatible.

Thanks,

On Wed, Sep 24, 2014 at 10:50 AM, Johnny Lu johnny.lu...@gmail.com wrote:
> found it.
> http://www.gromacs.org/Documentation/Acceleration_and_parallelization
>
> "GPUs are assigned to PP ranks within the same physical node in a sequential order, that is GPU 0 to the (thread-)MPI rank 0, GPU 1 to rank 1. In order to manually specify which GPU(s) to be used by mdrun, the respective device ID(s) can be passed with the -gpu_id XYZ command line option or with the GMX_GPU_ID=XYZ environment variable. Here, XYZ is a sequence of digits representing the numeric ID-s of available GPUs (the numbering starts from 0). The environment variable is particularly useful when running on multiple compute nodes with different GPU configurations."
>
> Taking the above example of an 8-core machine with two compatible GPUs, we can manually specify the GPUs and get the same launch configuration as in the above examples by:
>
> mdrun -ntmpi 2 -ntomp 4 -gpu_id 01

On Wed, Sep 24, 2014 at 10:49 AM, Johnny Lu johnny.lu...@gmail.com wrote:
> Actually I am trying to find the answer to the same question now. Manual 4.6.7 / appendix D / mdrun says:
>
> -gpu_id string   List of GPU device id-s to use, specifies the per-node PP rank to GPU mapping

On Tue, Sep 23, 2014 at 11:07 PM, Siva Dasetty sdas...@g.clemson.edu wrote:
> Thank you Lu for the reply. As I have mentioned in the post, I have already tried those options but it didn't work. Kindly let me know if you have any more suggestions.
>
> Thank you,

On Tue, Sep 23, 2014 at 8:41 PM, Johnny Lu johnny.lu...@gmail.com wrote:
> Try -nt, -ntmpi, -ntomp, -np (one at a time)? I forget what I tried now, but I just stop the mdrun and then read the log file. You can also look at the mdrun page in the official manual (pdf) and try this page:
> http://www.gromacs.org/Documentation/Gromacs_Utilities/mdrun?highlight=mdrun

On Mon, Sep 22, 2014 at 6:46 PM, Siva Dasetty sdas...@g.clemson.edu wrote:
> Dear All,
>
> I am trying to run NPT simulations using GROMACS version 5.0.1 of a system of size 140k atoms (protein+water) with 2 or more GPUs (model = k20), 8 cores (or more), and 1 or more nodes. I am trying to understand how to run simulations using multiple GPUs on more than one node. I get the following errors/output when I run the simulation using the following commands.
>
> Note: time step used = 2 fs and total number of steps = 2
> The first 4 cases use a single GPU and cases 5-8 use 2 GPUs.
>
> 1. 1 node, 8 cpus, 1 gpu
>    export OMP_NUM_THREADS=8
>    command used: mdrun -s topol.tpr -gpu_id 0
>    Speed - 5.8 ns/day
>
> 2. 1 node, 8 cpus, 1 gpu
>    export OMP_NUM_THREADS=16
>    command used: mdrun -s topol.tpr -gpu_id 0
>    Speed - 4.7 ns/day
>
> 3. 1 node, 8 cpus, 1 gpu
>    mdrun -s topol.tpr -ntomp 8 -gpu_id 0
>    Speed - 5.876 ns/day
>
> 4. 1 node, 8 cpus, 1 gpu
>    mdrun -s topol.tpr -ntomp 16 -gpu_id 0
>    Fatal Error: Environment variable OMP_NUM_THREADS (8) and the number of threads requested on the command line (16) have different values. Either omit one, or set them both to the same value.
>
> Question for 3 and 4: Do I always need to use OMP_NUM_THREADS, or is there a way -ntomp overwrites the environment settings?
>
> 5. 1 node, 8 cpus, 2 gpus
>    export OMP_NUM_THREADS=8
>    mpirun -np 2 mdrun -s topol.tpr -pin on -gpu_id 01
>    Speed - 4.044 ns/day
>
> 6. 2 nodes, 8 cpus, 2 gpus
>    export OMP_NUM_THREADS=8
>    mpirun -np 2 mdrun -s topol.tpr -pin on -gpu_id 01
>    Speed - 3.0 ns/day
>
> Are the commands that I used for 5 and 6 correct?
>
> 7. I also used (1 node, 8 cpus, 2 gpus)
>    mdrun -s topol.tpr -ntmpi 2 -ntomp 8 -gpu_id 01
>    but this time I get a fatal error: thread-MPI is requested but GROMACS is not compiled with thread MPI.
>    Question: Isn't thread MPI enabled by default?
>
> 8. Finally, I recompiled GROMACS without OpenMP and re-ran case 1, but this time there is a fatal error: More than 1 OpenMP thread requested, but Gromacs was compiled without OpenMP support.
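PS: my current understanding, for anyone who finds this thread later (please correct me if I have this wrong): thread-MPI is only compiled in when GROMACS is built without a real MPI library, so with an Open MPI build there is no -ntmpi and the rank count has to come from mpirun; only -ntomp remains meaningful on the mdrun line. Roughly:

# thread-MPI build (cmake without -DGMX_MPI=ON): ranks come from -ntmpi
mdrun -ntmpi 2 -ntomp 4 -gpu_id 01

# real MPI build (cmake -DGMX_MPI=ON): ranks come from mpirun
mpirun -np 2 mdrun -ntomp 4 -gpu_id 01

(The -DGMX_MPI=ON option name is from the installation guide; adjust to however your copy was configured.)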
Re: [gmx-users] Commands to run simulations using multiple GPU's in version 5.0.1
Thank you Lu for the reply. As I have mentioned in the post, I have already tried those options but it didn't work. Kindly let me know if you have any more suggestions.

Thank you,

On Tue, Sep 23, 2014 at 8:41 PM, Johnny Lu johnny.lu...@gmail.com wrote:
> Try -nt, -ntmpi, -ntomp, -np (one at a time)? I forget what I tried now, but I just stop the mdrun and then read the log file. You can also look at the mdrun page in the official manual (pdf) and try this page:
> http://www.gromacs.org/Documentation/Gromacs_Utilities/mdrun?highlight=mdrun
>
> On Mon, Sep 22, 2014 at 6:46 PM, Siva Dasetty sdas...@g.clemson.edu wrote:
> > Dear All,
> >
> > I am trying to run NPT simulations using GROMACS version 5.0.1 of a system of size 140k atoms (protein+water) with 2 or more GPUs (model = k20), 8 cores (or more), and 1 or more nodes. I am trying to understand how to run simulations using multiple GPUs on more than one node. I get the following errors/output when I run the simulation using the following commands.
> >
> > Note: time step used = 2 fs and total number of steps = 2
> > The first 4 cases use a single GPU and cases 5-8 use 2 GPUs.
> >
> > 1. 1 node, 8 cpus, 1 gpu
> >    export OMP_NUM_THREADS=8
> >    command used: mdrun -s topol.tpr -gpu_id 0
> >    Speed - 5.8 ns/day
> >
> > 2. 1 node, 8 cpus, 1 gpu
> >    export OMP_NUM_THREADS=16
> >    command used: mdrun -s topol.tpr -gpu_id 0
> >    Speed - 4.7 ns/day
> >
> > 3. 1 node, 8 cpus, 1 gpu
> >    mdrun -s topol.tpr -ntomp 8 -gpu_id 0
> >    Speed - 5.876 ns/day
> >
> > 4. 1 node, 8 cpus, 1 gpu
> >    mdrun -s topol.tpr -ntomp 16 -gpu_id 0
> >    Fatal Error: Environment variable OMP_NUM_THREADS (8) and the number of threads requested on the command line (16) have different values. Either omit one, or set them both to the same value.
> >
> > Question for 3 and 4: Do I always need to use OMP_NUM_THREADS, or is there a way -ntomp overwrites the environment settings?
> >
> > 5. 1 node, 8 cpus, 2 gpus
> >    export OMP_NUM_THREADS=8
> >    mpirun -np 2 mdrun -s topol.tpr -pin on -gpu_id 01
> >    Speed - 4.044 ns/day
> >
> > 6. 2 nodes, 8 cpus, 2 gpus
> >    export OMP_NUM_THREADS=8
> >    mpirun -np 2 mdrun -s topol.tpr -pin on -gpu_id 01
> >    Speed - 3.0 ns/day
> >
> > Are the commands that I used for 5 and 6 correct?
> >
> > 7. I also used (1 node, 8 cpus, 2 gpus)
> >    mdrun -s topol.tpr -ntmpi 2 -ntomp 8 -gpu_id 01
> >    but this time I get a fatal error: thread-MPI is requested but GROMACS is not compiled with thread MPI.
> >    Question: Isn't thread MPI enabled by default?
> >
> > 8. Finally, I recompiled GROMACS without OpenMP and re-ran case 1, but this time there is a fatal error: More than 1 OpenMP thread requested, but Gromacs was compiled without OpenMP support.
> >    command: mdrun -s topol.tpr -gpu_id 0   (no environment settings)
> >    Question: Here again, I assumed thread MPI is enabled by default, and I think GROMACS still assumes OpenMP thread settings. Am I doing something wrong here?
> >
> > Thanks in advance for your help
> >
> > --
> > Siva

--
Siva
[gmx-users] Commands to run simulations using multiple GPU's in version 5.0.1
Dear All,

I am trying to run NPT simulations using GROMACS version 5.0.1 of a system of size 140k atoms (protein+water) with 2 or more GPUs (model = k20), 8 cores (or more), and 1 or more nodes. I am trying to understand how to run simulations using multiple GPUs on more than one node. I get the following errors/output when I run the simulation using the following commands.

Note: time step used = 2 fs and total number of steps = 2
The first 4 cases use a single GPU and cases 5-8 use 2 GPUs.

1. 1 node, 8 cpus, 1 gpu
   export OMP_NUM_THREADS=8
   command used: mdrun -s topol.tpr -gpu_id 0
   Speed - 5.8 ns/day

2. 1 node, 8 cpus, 1 gpu
   export OMP_NUM_THREADS=16
   command used: mdrun -s topol.tpr -gpu_id 0
   Speed - 4.7 ns/day

3. 1 node, 8 cpus, 1 gpu
   mdrun -s topol.tpr -ntomp 8 -gpu_id 0
   Speed - 5.876 ns/day

4. 1 node, 8 cpus, 1 gpu
   mdrun -s topol.tpr -ntomp 16 -gpu_id 0
   Fatal Error: Environment variable OMP_NUM_THREADS (8) and the number of threads requested on the command line (16) have different values. Either omit one, or set them both to the same value.

Question for 3 and 4: Do I always need to use OMP_NUM_THREADS, or is there a way -ntomp overwrites the environment settings?

5. 1 node, 8 cpus, 2 gpus
   export OMP_NUM_THREADS=8
   mpirun -np 2 mdrun -s topol.tpr -pin on -gpu_id 01
   Speed - 4.044 ns/day

6. 2 nodes, 8 cpus, 2 gpus
   export OMP_NUM_THREADS=8
   mpirun -np 2 mdrun -s topol.tpr -pin on -gpu_id 01
   Speed - 3.0 ns/day

Are the commands that I used for 5 and 6 correct?

7. I also used (1 node, 8 cpus, 2 gpus)
   mdrun -s topol.tpr -ntmpi 2 -ntomp 8 -gpu_id 01
   but this time I get a fatal error: thread-MPI is requested but GROMACS is not compiled with thread MPI.
   Question: Isn't thread MPI enabled by default?

8. Finally, I recompiled GROMACS without OpenMP and re-ran case 1, but this time there is a fatal error: More than 1 OpenMP thread requested, but Gromacs was compiled without OpenMP support.
   command: mdrun -s topol.tpr -gpu_id 0   (no environment settings)
   Question: Here again, I assumed thread MPI is enabled by default, and I think GROMACS still assumes OpenMP thread settings. Am I doing something wrong here?

Thanks in advance for your help

--
Siva
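A note on cases 3 and 4, in case it helps anyone: the fatal error itself says to omit one of the two settings, and what worked for me was clearing the environment variable so that -ntomp alone controls the OpenMP thread count (just what I observed, not an official recommendation):

unset OMP_NUM_THREADS
mdrun -s topol.tpr -ntomp 16 -gpu_id 0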
[gmx-users] GPU Acceleration in case of Implicit Solvent Simulations
Dear all,

Can we use periodic boundary conditions in the case of implicit solvent simulations? If so, why?

Also, can the implicit solvent model in GROMACS, in any version up to 5.0, be run on more than 2 processors, or can it at least use the GPU acceleration provided by GROMACS?

I have tried using pbc=no and the group cut-off scheme in GPU-based GROMACS 5.0, but there is a warning which says the GPU is disabled because it effectively works only with the Verlet cut-off scheme, and the Verlet cut-off scheme requires pbc = xyz or xy.

I also tried GROMACS version 4.5.5 with OpenMM after following the installation instructions at the following link (http://www.gromacs.org/Documentation/Installation_Instructions_4.5/GROMACS-OpenMM), and here again there is no luck as we don't have the compatible hardware.

Thanks,
Siva
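For completeness, these are roughly the settings involved when the GPU gets disabled (only the two lines relevant to the warning; the rest of the .mdp is not shown):

cutoff-scheme = group   ; the GPU non-bonded kernels need the Verlet scheme, so group disables them
pbc           = no      ; and the Verlet scheme in turn requires pbc = xyz or xy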
Re: [gmx-users] Domain Decomposition error with Implicit Solvent
Thank you Mark for the reply. We are not sure about it either, as it worked when we restarted the simulation using the cpt file, and there was also no issue when we ran the same simulation with constraints (LINCS algorithm).

Thanks,
Siva

On Jul 23, 2014, at 4:20 PM, Mark Abraham mark.j.abra...@gmail.com wrote:
> On Mon, Jul 21, 2014 at 3:48 PM, Siva Dasetty sdas...@g.clemson.edu wrote:
> > Dear All,
> >
> > I am running simulations of BMP2 protein and a graphite sheet using an implicit solvent model (mdp file is pasted below). The graphite atoms are frozen in the simulation and BMP2 is free to translate. I got an error "Step 1786210: The domain decomposition grid has shifted too much in the Z-direction around cell 0 0 0" after 1749.7 ps of the simulation. I then restarted the simulation without changing anything, using the cpt file created from the previous (crashed) run, and the simulation continues. It has run for over 60 ps now and is continuing to run. This is something we tried based on a previous email on gmx-users from David van der Spoel. We are using gromacs 4.5.5. Any idea what this error may be due to?
>
> Could be anything. I have a hundred bucks that says no developer has ever run with frozen groups and implicit solvent. :-) Consider yourself warned! However, you should look at the trajectory as it approaches the failing step to see what the trigger is - e.g. diffusion further away than the sheet is wide, or something.
>
> Mark
>
> > We know that the system is not blowing up since it continues to run with the cpt file.
> >
> > Thanks,
> > Siva
> >
> > Start MDP file
> > dt          = 0.001     ; time step
> > nsteps      = 500       ; number of steps
> > ;nstcomm    = 10        ; reset c.o.m. motion
> > nstxout     = 10        ; write coords
> > nstvout     = 10        ; write velocities
> > nstlog      = 10        ; print to logfile
> > nstenergy   = 10        ; print energies
> > xtc_grps    = System
> > nstxtcout   = 10
> > nstlist     = 10        ; update pairlist
> > ns_type     = grid      ; pairlist method
> > pbc         = no
> > rlist       = 1.5
> > rcoulomb    = 1.5
> > rvdw        = 1.5
> > implicit-solvent    = GBSA
> > sa-algorithm        = Ace-approximation
> > gb_algorithm        = OBC
> > rgbradii            = 1.5
> > gb-epsilon-solvent  = 78.3
> > Tcoupl      = V-rescale
> > ref_t       = 300.0
> > tc-grps     = System
> > tau_t       = 0.5
> > gen_vel     = yes       ; generate init. vel
> > gen_temp    = 300       ; init. temp.
> > gen_seed    = 372340    ; random seed
> > ;constraints           = all-bonds   ; constraining bonds with H
> > ;constraint_algorithm  = lincs
> > refcoord-scaling    = all
> > comm_mode   = ANGULAR
> > freezegrps  = Graphite
> > freezedim   = Y Y Y
> > End MDP file
[gmx-users] Domain Decomposition error with Implicit Solvent
Dear All,

I am running simulations of BMP2 protein and a graphite sheet using an implicit solvent model (mdp file is pasted below). The graphite atoms are frozen in the simulation and BMP2 is free to translate.

I got an error "Step 1786210: The domain decomposition grid has shifted too much in the Z-direction around cell 0 0 0" after 1749.7 ps of the simulation. I then restarted the simulation without changing anything, using the cpt file created from the previous (crashed) run, and the simulation continues. It has run for over 60 ps now and is continuing to run. This is something we tried based on a previous email on gmx-users from David van der Spoel. We are using gromacs 4.5.5.

Any idea what this error may be due to? We know that the system is not blowing up, since it continues to run with the cpt file.

Thanks,
Siva

Start MDP file
dt          = 0.001     ; time step
nsteps      = 500       ; number of steps
;nstcomm    = 10        ; reset c.o.m. motion
nstxout     = 10        ; write coords
nstvout     = 10        ; write velocities
nstlog      = 10        ; print to logfile
nstenergy   = 10        ; print energies
xtc_grps    = System
nstxtcout   = 10
nstlist     = 10        ; update pairlist
ns_type     = grid      ; pairlist method
pbc         = no
rlist       = 1.5
rcoulomb    = 1.5
rvdw        = 1.5
implicit-solvent    = GBSA
sa-algorithm        = Ace-approximation
gb_algorithm        = OBC
rgbradii            = 1.5
gb-epsilon-solvent  = 78.3
Tcoupl      = V-rescale
ref_t       = 300.0
tc-grps     = System
tau_t       = 0.5
gen_vel     = yes       ; generate init. vel
gen_temp    = 300       ; init. temp.
gen_seed    = 372340    ; random seed
;constraints           = all-bonds   ; constraining bonds with H
;constraint_algorithm  = lincs
refcoord-scaling    = all
comm_mode   = ANGULAR
freezegrps  = Graphite
freezedim   = Y Y Y
End MDP file
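PS: for reference, the restart from the crashed run was done with the checkpoint file along these lines (file names are placeholders; -cpi tells mdrun to continue from the checkpoint):

mdrun -s topol.tpr -cpi state.cpt -deffnm md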