On October 1, 2010 at 2:37 AM Roland Schulz <[email protected]> wrote:
After how many steps do you get the error?
Do you get the error also with -dlb yes and the environment variable GMX_DLB_FLOP set to 1 or 2?
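To make that test concrete, here is a minimal Python sketch that builds the mdrun command line and environment for such a run. The binary name g4.5.1-mdrun and the input file monomer_pr.tpr are taken from the log quoted later in this thread; the helper name mdrun_invocation is hypothetical. The sketch only constructs the command and does not launch anything.

```python
import os


def mdrun_invocation(tpr, dlb="yes", dlb_flop=None, mdrun="g4.5.1-mdrun"):
    """Return (argv, env) for one test run of mdrun.

    dlb      -- value passed to -dlb: "auto", "no" or "yes"
    dlb_flop -- if given, exported as GMX_DLB_FLOP (1 or 2 in the question above)
    mdrun    -- binary name; the default is the one shown in the log (assumption)
    """
    if dlb not in ("auto", "no", "yes"):
        raise ValueError("unexpected -dlb value: %r" % (dlb,))
    argv = [mdrun, "-s", tpr, "-dlb", dlb]
    env = dict(os.environ)          # copy, so the caller's environment is untouched
    env.pop("GMX_DLB_FLOP", None)   # start from a clean state
    if dlb_flop is not None:
        env["GMX_DLB_FLOP"] = str(dlb_flop)
    return argv, env


# Print the two variants asked about above.
for flop in (1, 2):
    argv, env = mdrun_invocation("monomer_pr.tpr", dlb="yes", dlb_flop=flop)
    print("GMX_DLB_FLOP=%s %s" % (env["GMX_DLB_FLOP"], " ".join(argv)))
```

On a machine where the binary is installed, each (argv, env) pair could be handed to subprocess.call(argv, env=env).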
Thank you for your interest, but I think I found the issue. I am using a cluster I built from many different computers. All have 3.04 GHz Pentium 4 processors, but the other components vary. I have since recompiled GROMACS 4.5.1 on each computer separately (a tedious task, but well worth it). Everything now runs smoothly, and I have become a preacher of compiling GROMACS individually on each machine.
Below is what occurred before the "fix":
The error would occur after a random number of steps, but never above 5000. It was very intermittent, and I believe it came down to which node the job was assigned to: if a job ran on one node it would work, and on another it would not. I came to the conclusion above by rerunning the failed jobs on my workstation, which of course had GROMACS compiled on it rather than on another machine. I just hope that this post will save the next guy some heartache and precious time.
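One way to check a suspicion like this is to tally failures per node. The sketch below is a hypothetical helper, not something used in this thread: it assumes you have already collected, by whatever means your queueing system offers, a (hostname, succeeded) pair for each job. The node names in the example are made up.

```python
from collections import defaultdict


def failures_by_host(runs):
    """Map each host to (failed, total) given (host, succeeded) pairs."""
    counts = defaultdict(lambda: [0, 0])  # host -> [failed, total]
    for host, succeeded in runs:
        counts[host][1] += 1
        if not succeeded:
            counts[host][0] += 1
    return {host: tuple(c) for host, c in counts.items()}


# Made-up records: every job on node02 failed, every job on node01 worked.
runs = [("node01", True), ("node02", False), ("node01", True), ("node02", False)]
for host, (failed, total) in sorted(failures_by_host(runs).items()):
    print("%s: %d of %d runs failed" % (host, failed, total))
```

If the failures cluster on particular hosts rather than on particular inputs, that points at the machine or its build rather than at the simulation setup.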
Thank you again,
TJ Mustard
Roland
On Tue, Sep 28, 2010 at 12:54 PM, TJ Mustard <[email protected]> wrote:
I found that if I pass "-dlb no" to mdrun, the run does not fail. How could dynamic load balancing cause this?
TJ Mustard
On September 28, 2010 at 3:21 AM TJ Mustard <[email protected]> wrote:
Hey all,
Here is another error that I keep getting. I am trying to "speed up" my md runs with -heavyh and longer time steps. I don't get LINCS errors but I do get this...
Back Off! I just backed up prlog.log to ./#prlog.log.1#
Getting Loaded...
Reading file monomer_pr.tpr, VERSION 4.5.1 (single precision)
Starting 2 threads
Loaded with Money
Making 1D domain decomposition 2 x 1 x 1
Back Off! I just backed up pr.edr to ./#pr.edr.1#
starting mdrun 'Protein in water'
25000 steps, 100.0 ps.
step 900, will finish Mon Sep 27 18:19:07 2010 imb F 18%
NOTE: Turning on dynamic load balancing
step 1600, will finish Mon Sep 27 18:18:42 2010 vol 0.92 imb F 1%
A list of missing interactions:
LJC Pairs NB of 278 missing 1
exclusions of 6966 missing 1
-------------------------------------------------------
Program g4.5.1-mdrun, VERSION 4.5.1
Source code file: domdec_top.c, line: 173
Software inconsistency error:
Some interactions seem to be assigned multiple times
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
Some settings:
define = -DPOSRES
dt = 0.004
nsteps = 25000
; OPTIONS FOR ELECTROSTATICS AND VDW
; Method for doing electrostatics
coulombtype = PME ; = Cutoff
rcoulomb-switch = 0 ; = 0
rcoulomb = 0.9 ; = 1
; Relative dielectric constant for the medium and the reaction field
epsilon_r = 1 ; = 1
epsilon_rf = 1 ; = 1
; Method for doing Van der Waals
vdw-type = Cut-off ; = Cut-off
; cut-off lengths
rvdw-switch = 0 ; = 0
rvdw = 1 ; = 1
; Apply long range dispersion corrections for Energy and Pressure
DispCorr = No ; = No
; Extension of the potential lookup tables beyond the cut-off
table-extension = 1 ; = 1
; Seperate tables between energy group pairs
energygrp_table = ; =
; Spacing for the PME/PPPM FFT grid
fourierspacing = 0.12 ; = 0.12
; FFT grid size, when a value is 0 fourierspacing will be used
fourier_nx = 0 ; = 0
fourier_ny = 0 ; = 0
fourier_nz = 0 ; = 0
; EWALD/PME/PPPM parameters
pme_order = 6 ; = 4
ewald_rtol = 1e-05 ; = 1e-05
ewald_geometry = 3d ; = 3d
epsilon_surface = 0 ; = 0
optimize_fft = yes ; = no
; OPTIONS FOR WEAK COUPLING ALGORITHMS
; Temperature coupling
tcoupl = v-rescale ; = No
nsttcouple = -1 ; = -1
nh-chain-length = 10 ; = 10
; Groups to couple separately
tc-grps = RNA SOL ; =
; Time constant (ps) and reference temperature (K)
tau-t = 0.1 0.1 ; =
ref-t = 300 300 ; =
; Pressure coupling
Pcoupl = Parrinello-Rahman ; = No
Pcoupltype = Isotropic
nstpcouple = -1 ; = -1
; Time constant (ps), compressibility (1/bar) and reference P (bar)
tau-p = 1 ; = 1
compressibility = 4.5e-5 ; =
ref-p = 1.0 ; =
; Scaling of reference coordinates, No, All or COM
refcoord_scaling = No ; = No
; Random seed for Andersen thermostat
andersen_seed = 815131 ; = 815131
gen-vel = yes ; = no
gen-temp = 300 ; = 300
gen-seed = 173529 ; = 173529
constraints = all-bonds
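As a quick sanity check on settings like these, the following sketch parses a few of the lines above and confirms the run length reported in the log (25000 steps at dt = 0.004 ps gives 100 ps). parse_mdp and simulated_time_ps are hypothetical helper names, and the parser is deliberately simplified: it strips ';' comments, lowercases keys, and treats '-' and '_' in key names as the same, which is a convenience assumption made here.

```python
def parse_mdp(text):
    """Parse 'key = value ; comment' lines into a dict (simplified sketch)."""
    params = {}
    for line in text.splitlines():
        line = line.split(";", 1)[0].strip()   # drop trailing comments
        if "=" not in line:
            continue                            # blank or comment-only line
        key, value = line.split("=", 1)
        key = key.strip().lower().replace("-", "_")
        params[key] = value.strip()
    return params


def simulated_time_ps(params):
    """Total simulated time in ps, assuming dt is given in ps."""
    return int(params["nsteps"]) * float(params["dt"])


# A few lines copied from the settings listed above.
sample = """\
define = -DPOSRES
dt = 0.004
nsteps = 25000
pme_order = 6 ; = 4
constraints = all-bonds
"""

params = parse_mdp(sample)
print("simulated time: %.1f ps" % simulated_time_ps(params))
```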
Any help would be appreciated. The failure also seems to be intermittent: I have 21 otherwise identical runs (differing only in lambda value), and some work while others don't. Which ones fail changes every time I run them.
Thank you,
TJ Mustard
Email: [email protected]
--
gmx-users mailing list [email protected]
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
Please don't post (un)subscribe requests to the list. Use the
www interface or send it to [email protected].
Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
--
ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309

