Apologies: my.tpr and my_prev.tpr should have read my.cpt and my_prev.cpt.

On 11-05-05 12:36 PM, Chris Neale wrote:
Dear Users:

Using gromacs 4.0.5, I find that there are at least some cases where some type of disk error can get propagated through both my.tpr and my_prev.tpr, complicating restarts. This used to be a bigger problem in gromacs 3, and I don't recall ever seeing it in gromacs 4 so I thought I would post a notification.

I'm just going to extract some coordinates and restart, but ideally this wouldn't happen. A Google search for the relevant error, "Count mismatch for state entry", turns up only some online source code.

I don't know if this error occurs in 4.5.3, and it's not binary reproducible so that would be difficult to check. Still, the error checking that regularly occurs prior to overwriting the previous (and without error) _prev.cpt file with a new (and with error) _prev.cpt file seemed to not catch this problem, at least with gromacs 4.0.5.
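
To make the concern above concrete, here is a minimal sketch of a checkpoint rotation that refuses to overwrite a backup with a file that fails validation. It is not how GROMACS implements its checkpointing. The function names, the `is_valid` callback, and the `nonempty` stand-in validator are all hypothetical; a real validator would have to read the state entries the way gmxcheck does.

```python
import os


def nonempty(path):
    """Stand-in validator (an assumption): treat any non-empty file as valid."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


def rotate_checkpoint(new_path, current_path, prev_path, is_valid=nonempty):
    """Promote new_path to current_path, keeping the old current as prev_path.

    Nothing is overwritten unless new_path validates, and a current file
    that fails validation is never copied over the existing backup.
    Returns True if the rotation happened, False otherwise.
    """
    if not is_valid(new_path):
        # Leave both existing files untouched.
        return False
    if os.path.exists(current_path) and is_valid(current_path):
        # Only a known-good current file may replace the backup.
        os.replace(current_path, prev_path)
    os.replace(new_path, current_path)
    return True
```

With a scheme like this, a damaged new checkpoint leaves both md1.cpt and md1_prev.cpt as they were, and a damaged md1.cpt cannot displace a good md1_prev.cpt.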

The run that wrote out the .tpr finished normally due to -maxh, with a stderr that looked like this:

... < snip > ...
starting mdrun 'Generated by genbox'
10000000 steps,  20000.0 ps (continuing from step 3769350,   7538.7 ps).
[gpc-f138n034:06165] 15 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[gpc-f138n034:06165] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

Step 5036590: Run time exceeded 47.322 hours, will terminate the run

Step 5036600: Run time exceeded 47.322 hours, will terminate the run

 Average load imbalance: 0.2 %
 Part of the total run time spent waiting due to load imbalance: 0.2 %
 Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0 % Z 0 %
 Average PME mesh/force load: 0.745
 Part of the total run time spent waiting due to PP/PME imbalance: 4.9 %


        Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time: 170485.000 170485.000    100.0
                       1d23h21:25
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:    625.583     31.889      1.284     18.685

gcq#165: "I'm a Jerk" (F. Black)


gcq#165: "I'm a Jerk" (F. Black)

#############################################

And then when I gmxcheck both of the .cpt files I get the exact same error, although the files do differ:

$ diff md1.cpt md1_prev.cpt
Binary files md1.cpt and md1_prev.cpt differ


$ gmxcheck  -f md1.cpt
                         :-)  G  R  O  M  A  C  S  (-:

                              S  C  A  M  O  R  G

                            :-)  VERSION 4.0.5  (-:


Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2008, The GROMACS development team,
            check out http://www.gromacs.org for more information.

         This program is free software; you can redistribute it and/or
          modify it under the terms of the GNU General Public License
         as published by the Free Software Foundation; either version 2
             of the License, or (at your option) any later version.

                               :-)  gmxcheck  (-:

Option     Filename  Type         Description
------------------------------------------------------------
  -f        md1.cpt  Input, Opt!  Trajectory: xtc trr trj gro g96 pdb cpt
 -f2       traj.xtc  Input, Opt.  Trajectory: xtc trr trj gro g96 pdb cpt
 -s1       top1.tpr  Input, Opt.  Run input file: tpr tpb tpa
 -s2       top2.tpr  Input, Opt.  Run input file: tpr tpb tpa
  -c      topol.tpr  Input, Opt.  Structure+mass(db): tpr tpb tpa gro g96 pdb
  -e       ener.edr  Input, Opt.  Energy file: edr ene
 -e2      ener2.edr  Input, Opt.  Energy file: edr ene
  -n      index.ndx  Input, Opt.  Index file
  -m        doc.tex  Output, Opt. LaTeX file

Option       Type   Value   Description
------------------------------------------------------
-[no]h       bool   no      Print help info and quit
-nice        int    0       Set the nicelevel
-vdwfac      real   0.8     Fraction of sum of VdW radii used as warning
                            cutoff
-bonlo       real   0.4     Min. fract. of sum of VdW radii for bonded atoms
-bonhi       real   0.7     Max. fract. of sum of VdW radii for bonded atoms
-tol         real   0.001   Relative tolerance for comparing real values
                            defined as 2*(a-b)/(|a|+|b|)
-[no]ab      bool   no      Compare the A and B topology from one file
-lastener    string         Last energy term to compare (if not given all are
                            tested). It makes sense to go up until the
                            Pressure.

Checking file md1.cpt

-------------------------------------------------------
Program gmxcheck, VERSION 4.0.5
Source code file: checkpoint.c, line: 186

Fatal error:
Count mismatch for state entry SDx, code count is 754728, file count is 0

-------------------------------------------------------

"Confirmed" (Star Trek)

############################ and the same thing for the _prev.cpt file:

$ gmxcheck  -f md1_prev.cpt
                         :-)  G  R  O  M  A  C  S  (-:

                       GRowing Old MAkes el Chrono Sweat

                            :-)  VERSION 4.0.5  (-:


Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2008, The GROMACS development team,
            check out http://www.gromacs.org for more information.

         This program is free software; you can redistribute it and/or
          modify it under the terms of the GNU General Public License
         as published by the Free Software Foundation; either version 2
             of the License, or (at your option) any later version.

                               :-)  gmxcheck  (-:

Option     Filename  Type         Description
------------------------------------------------------------
  -f   md1_prev.cpt  Input, Opt!  Trajectory: xtc trr trj gro g96 pdb cpt
 -f2       traj.xtc  Input, Opt.  Trajectory: xtc trr trj gro g96 pdb cpt
 -s1       top1.tpr  Input, Opt.  Run input file: tpr tpb tpa
 -s2       top2.tpr  Input, Opt.  Run input file: tpr tpb tpa
  -c      topol.tpr  Input, Opt.  Structure+mass(db): tpr tpb tpa gro g96 pdb
  -e       ener.edr  Input, Opt.  Energy file: edr ene
 -e2      ener2.edr  Input, Opt.  Energy file: edr ene
  -n      index.ndx  Input, Opt.  Index file
  -m        doc.tex  Output, Opt. LaTeX file

Option       Type   Value   Description
------------------------------------------------------
-[no]h       bool   no      Print help info and quit
-nice        int    0       Set the nicelevel
-vdwfac      real   0.8     Fraction of sum of VdW radii used as warning
                            cutoff
-bonlo       real   0.4     Min. fract. of sum of VdW radii for bonded atoms
-bonhi       real   0.7     Max. fract. of sum of VdW radii for bonded atoms
-tol         real   0.001   Relative tolerance for comparing real values
                            defined as 2*(a-b)/(|a|+|b|)
-[no]ab      bool   no      Compare the A and B topology from one file
-lastener    string         Last energy term to compare (if not given all are
                            tested). It makes sense to go up until the
                            Pressure.

Checking file md1_prev.cpt

-------------------------------------------------------
Program gmxcheck, VERSION 4.0.5
Source code file: checkpoint.c, line: 186

Fatal error:
Count mismatch for state entry SDx, code count is 754728, file count is 0

-------------------------------------------------------

"I'm Only Faking When I Get It Right" (Soundgarden)
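
Both files fail with the same message. For anyone who wants to screen many checkpoint files, the following is a rough sketch that pulls the fatal-error line out of captured gmxcheck output and splits a count-mismatch message into its parts. It assumes the output layout shown above (a "Fatal error:" line followed by the message) and has been checked only against that text. The function names are hypothetical, and capturing the output from gmxcheck is left to the caller.

```python
import re

# Pattern taken from the message shown in the output above.
COUNT_RE = re.compile(
    r"Count mismatch for state entry (\S+), "
    r"code count is (\d+), file count is (\d+)"
)


def find_fatal_error(output):
    """Return the first non-blank line after 'Fatal error:', or None."""
    lines = output.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "Fatal error:":
            for follow in lines[i + 1:]:
                if follow.strip():
                    return follow.strip()
    return None


def parse_count_mismatch(message):
    """Return (entry, code_count, file_count) or None if no match."""
    m = COUNT_RE.search(message or "")
    if m is None:
        return None
    return m.group(1), int(m.group(2)), int(m.group(3))
```

A file count of 0 for an entry the code expects to be populated, as with SDx here, suggests that the entry was written empty rather than that a few values were damaged.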


--
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
Please don't post (un)subscribe requests to the list. Use the www interface or send it to gmx-users-requ...@gromacs.org.
Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
