On Wed, Feb 5, 2014 at 9:43 AM, Mark Abraham <mark.j.abra...@gmail.com> wrote:
> What's the network? If it's some kind of switched Infiniband shared with
> other users' jobs, then getting hit by the traffic does happen. You can see
> that the individual timings of the things the load balancer tries differ a
> lot between runs. So there must be an extrinsic factor (if the .tpr is
> functionally the same). Organizing yourself a quiet corner of the network
> is ideal, if you can do the required social engineering :-P
>
> Mark

It is indeed an InfiniBand 4X QDR (Quad Data Rate) 40 Gbit/s switched fabric, with a two-to-one blocking factor. I tried running this again with the GPU version, which showed the same issue: every single run ends up with a different coulomb cutoff after the automatic tuning. Since I am unlikely to get my own quiet corner of a nation-wide supercomputer, are there any parameters that could keep this from happening? Turning off load balancing sounds crazy.

> On Wed, Feb 5, 2014 at 6:22 PM, yunshi11 . <yunsh...@gmail.com> wrote:
> >
> > Hello all,
> >
> > I am doing a production MD run of a protein-ligand complex in explicit
> > water with GROMACS 4.6.5.
> >
> > However, I got different coulomb cutoff values, as shown in the output
> > log files.
> >
> > 1st one:
> >
> > ...................................................................
> > NOTE: Turning on dynamic load balancing
> >
> > step  60: timed with pme grid 112 112 112, coulomb cutoff 1.000: 235.9 M-cycles
> > step 100: timed with pme grid 100 100 100, coulomb cutoff 1.116: 228.8 M-cycles
> > step 100: the domain decomposition limits the PME load balancing to a coulomb cut-off of 1.162
> > step 140: timed with pme grid 112 112 112, coulomb cutoff 1.000: 223.9 M-cycles
> > step 180: timed with pme grid 108 108 108, coulomb cutoff 1.033: 219.2 M-cycles
> > step 220: timed with pme grid 104 104 104, coulomb cutoff 1.073: 210.9 M-cycles
> > step 260: timed with pme grid 100 100 100, coulomb cutoff 1.116: 229.0 M-cycles
> > step 300: timed with pme grid 96 96 96, coulomb cutoff 1.162: 267.8 M-cycles
> > step 340: timed with pme grid 112 112 112, coulomb cutoff 1.000: 241.4 M-cycles
> > step 380: timed with pme grid 108 108 108, coulomb cutoff 1.033: 424.1 M-cycles
> > step 420: timed with pme grid 104 104 104, coulomb cutoff 1.073: 215.1 M-cycles
> > step 460: timed with pme grid 100 100 100, coulomb cutoff 1.116: 226.4 M-cycles
> > optimal pme grid 104 104 104, coulomb cutoff 1.073
> >
> > DD  step 24999  vol min/aver 0.834  load imb.: force  2.3%  pme mesh/force 0.687
> >
> > ...................................................................
> >
> > 2nd one:
> >
> > NOTE: Turning on dynamic load balancing
> >
> > step  60: timed with pme grid 112 112 112, coulomb cutoff 1.000: 187.1 M-cycles
> > step 100: timed with pme grid 100 100 100, coulomb cutoff 1.116: 218.3 M-cycles
> > step 140: timed with pme grid 112 112 112, coulomb cutoff 1.000: 172.4 M-cycles
> > step 180: timed with pme grid 108 108 108, coulomb cutoff 1.033: 188.3 M-cycles
> > step 220: timed with pme grid 104 104 104, coulomb cutoff 1.073: 203.1 M-cycles
> > step 260: timed with pme grid 112 112 112, coulomb cutoff 1.000: 174.3 M-cycles
> > step 300: timed with pme grid 108 108 108, coulomb cutoff 1.033: 184.4 M-cycles
> > step 340: timed with pme grid 104 104 104, coulomb cutoff 1.073: 205.4 M-cycles
> > step 380: timed with pme grid 112 112 112, coulomb cutoff 1.000: 172.1 M-cycles
> > step 420: timed with pme grid 108 108 108, coulomb cutoff 1.033: 188.8 M-cycles
> > optimal pme grid 112 112 112, coulomb cutoff 1.000
> >
> > DD  step 24999  vol min/aver 0.789  load imb.: force  4.7%  pme mesh/force 0.766
> >
> > ...................................................................
> >
> > The 2nd MD run turned out to be much faster (about 5-fold); I submitted
> > the 2nd because the 1st was unexpectedly slow.
> > I made sure the .tpr file and .pbs file (MPI on a cluster of Xeon E5649
> > CPUs) are virtually identical between the two runs, and here is my .mdp
> > file:
> >
> > ;
> > title                = Production Simulation
> > cpp                  = /lib/cpp
> >
> > ; RUN CONTROL PARAMETERS
> > integrator           = md
> > tinit                = 0          ; starting time
> > dt                   = 0.002      ; 2 fs time step for integration
> > nsteps               = 500000000  ; 1000 ns = 0.002 ps * 500,000,000
> >
> > ; OUTPUT CONTROL OPTIONS
> > nstxout              = 25000      ; .trr full-precision coordinates every 50 ps
> > nstvout              = 0          ; .trr velocities output
> > nstfout              = 0          ; not writing forces
> > nstlog               = 25000      ; writing to the log file every 50 ps
> > nstenergy            = 25000      ; writing out energy information every 50 ps
> > energygrps           = dikpgdu Water_and_ions
> >
> > ; NEIGHBORSEARCHING PARAMETERS
> > cutoff-scheme        = Verlet
> > nstlist              = 20
> > ns-type              = Grid
> > pbc                  = xyz        ; 3-D PBC
> > rlist                = 1.0
> >
> > ; OPTIONS FOR ELECTROSTATICS AND VDW
> > rcoulomb             = 1.0        ; short-range electrostatic cutoff (in nm)
> > coulombtype          = PME        ; Particle Mesh Ewald for long-range electrostatics
> > pme_order            = 4          ; interpolation
> > fourierspacing       = 0.12       ; grid spacing for FFT
> > vdw-type             = Cut-off
> > rvdw                 = 1.0        ; short-range van der Waals cutoff (in nm)
> > optimize_fft         = yes
> >
> > ; Temperature coupling
> > Tcoupl               = v-rescale
> > tc-grps              = dikpgdu Water_and_ions
> > tau_t                = 0.1 0.1
> > ref_t                = 298 298
> >
> > ; Pressure coupling
> > Pcoupl               = Berendsen
> > Pcoupltype           = Isotropic
> > tau_p                = 1.0
> > compressibility      = 4.5e-5
> > ref_p                = 1.0
> >
> > ; Dispersion correction
> > DispCorr             = EnerPres   ; account for cut-off vdW scheme
> >
> > ; GENERATE VELOCITIES FOR STARTUP RUN
> > gen_vel              = no
> >
> > ; OPTIONS FOR BONDS
> > continuation         = yes
> > constraints          = hbonds
> > constraint-algorithm = Lincs
> > lincs-order          = 4
> > lincs-iter           = 1
> > lincs-warnangle      = 30
> >
> > I am surprised that the coulomb cutoffs of 1.073 vs 1.000 could cause a
> > 5-fold performance
> > difference, and why would they be different in the first place if
> > identical input files were used?
> >
> > I haven't found anything peculiar about the cluster I am using.
> >
> > Any suggestions for the issue?
> >
> > Thanks,
> > Yun
> >
> > --
> > Gromacs Users mailing list
> >
> > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
> >
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >
> > * For (un)subscribe requests visit
> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > send a mail to gmx-users-requ...@gromacs.org.
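[Editorial aside on the numbers in the logs above: the PME tuner trades real-space (cutoff) work against reciprocal-space (grid) work, so when it coarsens the FFT grid from the base 112^3 it scales the coulomb cutoff up by roughly the inverse of the grid ratio. The following small Python sketch checks the (grid, cutoff) pairs from the logs against that simple scaling rule; the 1% tolerance is an illustrative choice, not a GROMACS constant.]

```python
# Check that each tuned cutoff from the logs is approximately
# rcoulomb * (base_grid / trial_grid), i.e. cutoff grows as the grid coarsens.
base_grid = 112   # grid dimension used with the input cutoff
rcoulomb = 1.000  # nm, from the .mdp file

# (grid dimension, coulomb cutoff in nm) pairs reported by mdrun above
logged = [(112, 1.000), (108, 1.033), (104, 1.073), (100, 1.116), (96, 1.162)]

for grid, cutoff in logged:
    predicted = rcoulomb * base_grid / grid
    rel_err = abs(predicted - cutoff) / cutoff
    print(f"grid {grid:3d}: logged {cutoff:.3f}, predicted {predicted:.3f}, "
          f"rel. err {rel_err:.1%}")
    assert rel_err < 0.01  # all logged values land within 1% of the rule
```

Both runs therefore explore the same ladder of (grid, cutoff) settings; only the noisy per-trial timings that pick the winner differ between runs. If run-to-run reproducibility matters more than peak throughput, mdrun in GROMACS 4.6 accepts a `-notunepme` flag that disables this tuning, keeping the cutoff at the rcoulomb from the .mdp at the cost of possibly lower performance.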