I am running some annealing trials on a Cray XT4, and although the
throughput is impressive, I am having severe difficulties with the
stability of the code.
For my relatively small system of ~7500 atoms, the engine typically
crashes after ~500k steps.
I am using the bleeding-edge CVS version: mdrun.c (1.141), the newest
one after Erik L.'s recent patch of the PME code.
I configure and compile on the compute nodes exclusively (not the
frontend), and the only compiler warnings I get are of the type:
"warning: Using 'getpwuid' in statically linked applications requires
at runtime the shared libraries from the glibc version used for linking"
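For anyone wanting to reproduce the build environment, something along the
following lines should give the same kind of statically linked MPI binary.
This is a sketch only, not my exact command line, so treat the flags as
approximate and adjust them to your own PrgEnv:
# sketch of the compute-node build; CC=cc/MPICC=cc pick up the Cray
# compiler wrappers, and the static link (LDFLAGS=-static) is what
# triggers the getpwuid warning quoted above
# --program-prefix=par is only a guess at how the parmdrun name below arises
./configure CC=cc MPICC=cc LDFLAGS=-static --enable-mpi \
    --program-prefix=par --prefix=$HOME/gmx_latest_290908
make && make install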
After compiling, though, the code executes and runs for ~20 minutes,
producing sound data before stalling.
The error logs are very short and quite uninformative.
PBS .o:
Application 159316 exit codes: 137
Application 159316 exit signals: Killed
Application 159316 resources: utime 0, stime 0
--------------------------------------------------
Begin PBS Epilogue hexagon.bccs.uib.no
Date: Mon Sep 29 12:32:54 CEST 2008
Job ID: 65643.nid00003
Username: bjornss
Group: bjornss
Job Name: pmf_hydanneal_heatup_400K
Session: 10156
Limits: walltime=05:00:00
Resources:
cput=00:00:00,mem=4940kb,vmem=22144kb,walltime=00:20:31
Queue: batch
Account: fysisk
Base login-node: login5
End PBS Epilogue Mon Sep 29 12:32:54 CEST 2008
PBS .err:
_pmii_daemon(SIGCHLD): PE 0 exit signal Killed
[NID 702]Apid 159316: initiated application termination.
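For what it is worth, exit code 137 is 128 + 9, i.e. the application was
killed with SIGKILL from the outside (the node's OOM killer, ALPS/aprun, or
the batch system) rather than aborting on its own, which fits with the logs
being so uninformative. A quick check of the arithmetic:
# exit codes above 128 encode 128 + signal number
echo $((137 - 128))   # prints 9
kill -l 9             # prints KILL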
As proper electrostatics is crucial to my modeling, I am using PME,
which accounts for a large part of my calculation cost: 35-50%.
In the most extreme case, I use the following startup-script
run.pbs:
#!/bin/bash
#PBS -A fysisk
#PBS -N pmf_hydanneal_heatup_400K
#PBS -o pmf_hydanneal.o
#PBS -e pmf.hydanneal.err
#PBS -l walltime=5:00:00,mppwidth=40,mppnppn=4
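# 40 MPI ranks in total (mppwidth=40), 4 per node (mppnppn=4), i.e. 10 XT4 nodes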
cd /work/bjornss/pmf/structII/hydrate_annealing/heatup_400K
source $HOME/gmx_latest_290908/bin/GMXRC
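# 20 of the 40 ranks are dedicated PME nodes (cf. 'Using 20 separate PME nodes' in md.log)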
aprun -n 40 parmdrun -s topol.tpr -maxh 5 -npme 20
exit $?
Now, apart from a significant reduction in the system dipole moment,
there are no large changes in the system, nor significant translations
of the molecules in the box.
I enclose the md.log and my parameter file. The run topology (topol.tpr)
can be found at:
http://drop.io/mdanneal
Anyone who wants to try to replicate the crash on their local cluster is
welcome to do so.
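Something along the following lines should reproduce the run elsewhere (a
sketch only; substitute your own MPI launcher and GROMACS 4.0_rc1 build, and
scale the rank counts to your machine):
# hypothetical reproduction using the topol.tpr from the link above
mpirun -np 40 mdrun_mpi -s topol.tpr -maxh 5 -npme 20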
If the error persists after such trials have been attempted, I am willing
to file a bug on Bugzilla.
If more information is needed, I will try to provide it upon request.
Regards, and thanks for taking the trouble.
--
---------------------
Bjørn Steen Saethre
PhD-student
Theoretical and Energy Physics Unit
Institute of Physics and Technology
Allegt, 41
N-5020 Bergen
Norway
Tel(office) +47 55582869
Log file opened on Mon Sep 29 14:03:14 2008
Host: nid01054 pid: 8315 nodeid: 0 nnodes: 40
The Gromacs distribution was built Mon Sep 29 13:25:26 CEST 2008 by
[EMAIL PROTECTED] (Linux 2.6.16.54-0.2.5-ss x86_64)
:-) G R O M A C S (-:
Groningen Machine for Chemical Simulation
:-) VERSION 4.0_rc1 (-:
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2008, The GROMACS development team,
check out http://www.gromacs.org for more information.
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
:-) parmdrun (-:
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 435-447
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C.
Berendsen
GROMACS: Fast, Flexible and Free
J. Comp. Chem. 26 (2005) pp. 1701-1719
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
E. Lindahl and B. Hess and D. van der Spoel
GROMACS 3.0: A package for molecular simulation and trajectory analysis
J. Mol. Mod. 7 (2001) pp. 306-317
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, D. van der Spoel and R. van Drunen
GROMACS: A message-passing parallel molecular dynamics implementation
Comp. Phys. Comm. 91 (1995) pp. 43-56
-------- -------- --- Thank You --- -------- --------
parameters of the run:
integrator = md
nsteps = 2000000
init_step = 0
ns_type = Grid
nstlist = 5
ndelta = 2
nstcomm = 1
comm_mode = Linear
nstcheckpoint = 1000
nstlog = 100000
nstxout = 200000
nstvout = 200000
nstfout = 200000
nstenergy = 100
nstxtcout = 1000
init_t = 0
delta_t = 0.001
xtcprec = 1000
nkx = 60
nky = 40
nkz = 40
pme_order = 6
ewald_rtol = 1e-05
ewald_geometry = 0
epsilon_surface = 0
optimize_fft = TRUE
ePBC = xyz
bPeriodicMols = FALSE
bContinuation = FALSE
bShakeSOR = FALSE
etc = Berendsen
epc = No
epctype = Isotropic
tau_p = 1
ref_p (3x3):
ref_p[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
ref_p[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
ref_p[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
compress (3x3):
compress[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
compress[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
compress[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
refcoord_scaling = No
posres_com (3):
posres_com[0]= 0.00000e+00
posres_com[1]= 0.00000e+00
posres_com[2]= 0.00000e+00
posres_comB (3):
posres_comB[0]= 0.00000e+00
posres_comB[1]= 0.00000e+00
posres_comB[2]= 0.00000e+00
andersen_seed = 815131
rlist = 0.9
rtpi = 0.05
coulombtype = PME
rcoulomb_switch = 0
rcoulomb = 0.9
vdwtype = Cut-off
rvdw_switch = 0
rvdw = 0.9
epsilon_r = 1
epsilon_rf = 1
tabext = 1
implicit_solvent = No
gb_algorithm = Still
gb_epsilon_solvent = 80
nstgbradii = 1
rgbradii = 2
gb_saltconc = 0
gb_obc_alpha = 1
gb_obc_beta = 0.8
gb_obc_gamma = 4.85
sa_surface_tension = 2.092
DispCorr = Ener
free_energy = no
init_lambda = 0
sc_alpha = 0
sc_power = 0
sc_sigma = 0.3
delta_lambda = 0
nwall = 0
wall_type = 9-3
wall_atomtype[0] = -1
wall_atomtype[1] = -1
wall_density[0] = 0
wall_density[1] = 0
wall_ewald_zfac = 3
pull = no
disre = No
disre_weighting = Conservative
disre_mixed = FALSE
dr_fc = 1000
dr_tau = 0
nstdisreout = 100
orires_fc = 0
orires_tau = 0
nstorireout = 100
dihre-fc = 1000
em_stepsize = 0.01
em_tol = 10
niter = 20
fc_stepsize = 0
nstcgsteep = 1000
nbfgscorr = 10
ConstAlg = Lincs
shake_tol = 1e-04
lincs_order = 6
lincs_warnangle = 30
lincs_iter = 2
bd_fric = 0
ld_seed = 1993
cos_accel = 0
deform (3x3):
deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
grpopts:
nrdf: 12957
ref_t: 400
tau_t: 0.5
anneal: No
ann_npoints: 0
acc: 0 0 0
nfreeze: N N N
energygrp_flags[ 0]: 0
efield-x:
n = 0
efield-xt:
n = 0
efield-y:
n = 0
efield-yt:
n = 0
efield-z:
n = 0
efield-zt:
n = 0
bQMMM = FALSE
QMconstraints = 0
QMMMscheme = 0
scalefactor = 1
qm_opts:
ngQM = 0
Initializing Domain Decomposition on 40 nodes
Dynamic load balancing: auto
Will sort the charge groups at every domain (re)decomposition
Initial maximum inter charge-group distances:
two-body bonded interactions: 0.377 nm
multi-body bonded interactions: 0.377 nm
Minimum cell size due to bonded interactions: 0.414 nm
Using 20 separate PME nodes
Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
Optimizing the DD grid for 20 cells with a minimum initial size of 0.518 nm
The maximum allowed number of cells is: X 12 Y 8 Z 8
Domain decomposition grid 5 x 4 x 1, separate PME nodes 20
Interleaving PP and PME nodes
This is a particle-particle only node
Domain decomposition nodeid 0, coordinates 0 0 0
Using two step summing over 10 groups of on average 2.0 processes
Table routines are used for coulomb: TRUE
Table routines are used for vdw: FALSE
Will do PME sum in reciprocal space.
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essman, L. Perela, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------
Using a Gaussian width (1/beta) of 0.288146 nm for Ewald
Cut-off's: NS: 0.9 Coulomb: 0.9 LJ: 0.9
System total charge: -0.000
Generated table with 950 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 950 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 950 data points for LJ12.
Tabscale = 500 points/nm
Enabling TIP4p water optimization for 1632 molecules.
Configuring nonbonded kernels...
Testing x86_64 SSE support... present.
Removing pbc first time
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
-------- -------- --- Thank You --- -------- --------
Linking all bonded interactions to atoms
There are 3744 inter charge-group exclusions,
will use an extra communication step for exclusion forces for PME
The initial number of communication pulses is: X 1 Y 1
The initial domain decomposition cell size is: X 1.24 nm Y 1.04 nm
The maximum allowed distance for charge groups involved in interactions is:
non-bonded interactions 0.900 nm
two-body bonded interactions (-rdd) 0.900 nm
multi-body bonded interactions (-rdd) 0.900 nm
When dynamic load balancing gets turned on, these settings will change to:
The maximum number of communication pulses is: X 2 Y 2
The minimum size for domain decomposition cells is 0.707 nm
The requested allowed shrink of DD cells (option -dds) is: 0.80
The allowed shrink of domain decomposition cells is: X 0.57 Y 0.68
The maximum allowed distance for charge groups involved in interactions is:
non-bonded interactions 0.900 nm
two-body bonded interactions (-rdd) 0.900 nm
multi-body bonded interactions (-rdd) 0.707 nm
Making 2D domain decomposition grid 5 x 4 x 1, home cell index 0 0 0
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
0: rest
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, J. P. M. Postma, A. DiNola and J. R. Haak
Molecular dynamics with coupling to an external bath
J. Chem. Phys. 81 (1984) pp. 3684-3690
-------- -------- --- Thank You --- -------- --------
title = heatup 400K structII - propan - tip4p/ice(rigid) -
PME
cpp = /lib/cpp
integrator = md
define =-DPOSRES
include = -I/home/fi/bjornss/mytop
;Run ctrl
dt = 0.001
nsteps = 2000000
nstxout = 200000
nstvout = 200000
nstfout = 200000
nstenergy = 100
nstlog = 100000
nstxtcout = 1000
;Electrostatics/Neighbour search
nstlist = 5
ns_type = grid
rlist = 0.9
coulombtype = PME
ewald_geometry = 3d
rcoulomb = 0.9
vdw-type = Cut-off
rvdw = 0.9
optimize_fft = yes
fourier_nx = 60
fourier_ny = 40
fourier_nz = 40
pme_order = 6
;Boundary conditions/constraints etc,
pbc = xyz
DispCorr = Ener
constraints = hbonds
constraint_algorithm = lincs
lincs_iter = 2
lincs_order = 6
;nwall = 0
;walltype = 9-3
;wall_r_linpot = -10
;wall_atomtype = opls_113 opls_113
;wall_density = 4.6 4.6
;wall_ewald_zfac = 2.4
;Temperature and pressure generation and coupling
gen_vel = no
;gen_temp = 350
;gen_seed = -1
tcoupl = berendsen
tc_grps = System
tau_t = 0.5
ref_t = 400
pcoupl = no
;pcoupltype = isotropic
;tau_p = 2
;ref_p = 10
;compressibility = 5e-6
unconstrained-start = no