Re: [gmx-users] Long trajectory split

2014-02-27 Thread Marcelo Depólo
Dear Dr. Chaban,

Which details or files do you need? I would be very happy to resolve this
question by posting any files you request.



2014-02-23 22:21 GMT+01:00 Dr. Vitaly Chaban vvcha...@gmail.com:

 You do not provide all the details. As was pointed out at the very
 beginning, most likely you have incorrect parallelism in this case.
 Can you post all the files you obtain for people to inspect?


 Dr. Vitaly V. Chaban



Re: [gmx-users] Long trajectory split

2014-02-27 Thread Dr. Vitaly Chaban
The only realistic way to troubleshoot this kind of problem is for someone
here to run your system on their own local PC and see the problem with
their own eyes.

As no one has confirmed the same issue as yours, the cause of the problem
most likely lies outside the GROMACS code. Either something is wrong with
your operating environment, or you are interpreting your observations
incorrectly.


Dr. Vitaly V. Chaban


On Thu, Feb 27, 2014 at 1:33 PM, Marcelo Depólo marcelodep...@gmail.com wrote:
 Dear Dr. Chaban,

 Which details or files do you need? I would be very happy to resolve this
 question by posting any files you request.




Re: [gmx-users] Long trajectory split

2014-02-23 Thread Justin Lemkul



On 2/23/14, 10:43 AM, Marcelo Depólo wrote:

Hey,


I am running a 1000 ns simulation, but for some reason mdrun is backing up
the data in multiple files (.edr.1# - .edr.9#, for instance).

Is this normal behavior?



No. It means that rather than launching one parallel mdrun process, you're 
running multiple independent instances of mdrun, each producing its own 
output files.  Gromacs tools don't overwrite existing files; they back up 
existing files of the same name.
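
A minimal sketch of the difference, assuming the MPI build is installed
under the conventional name mdrun_mpi (file names follow the thread's
example):

    # non-MPI binary under mpirun: 24 independent serial runs, each of
    # which backs up the previous ones' files before writing its own
    mpirun -np 24 mdrun -s prt.tpr -e prt.edr -o prt.trr
    # -> prt.trr, #prt.trr.1#, #prt.trr.2#, ... pile up

    # MPI-enabled binary: one simulation decomposed across 24 ranks
    mpirun -np 24 mdrun_mpi -s prt.tpr -e prt.edr -o prt.trr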


-Justin

--
==

Justin A. Lemkul, Ph.D.
Ruth L. Kirschstein NRSA Postdoctoral Fellow

Department of Pharmaceutical Sciences
School of Pharmacy
Health Sciences Facility II, Room 601
University of Maryland, Baltimore
20 Penn St.
Baltimore, MD 21201

jalem...@outerbanks.umaryland.edu | (410) 706-7441
http://mackerell.umaryland.edu/~jalemkul

==


Re: [gmx-users] Long trajectory split

2014-02-23 Thread Marcelo Depólo
But it is not quite happening simultaneously, Justin.

It is producing one after another and, consequently, backing up the files.


Re: [gmx-users] Long trajectory split

2014-02-23 Thread Justin Lemkul



On 2/23/14, 11:00 AM, Marcelo Depólo wrote:

But it is not quite happening simultaneously, Justin.

It is producing one after another and, consequently, backing up the files.



You'll have to provide the exact commands you're issuing.  Likely you're leaving 
the output names to the default, which causes them to be backed up rather than 
overwritten.


-Justin

--
==

Justin A. Lemkul, Ph.D.
Ruth L. Kirschstein NRSA Postdoctoral Fellow

Department of Pharmaceutical Sciences
School of Pharmacy
Health Sciences Facility II, Room 601
University of Maryland, Baltimore
20 Penn St.
Baltimore, MD 21201

jalem...@outerbanks.umaryland.edu | (410) 706-7441
http://mackerell.umaryland.edu/~jalemkul

==


Re: [gmx-users] Long trajectory split

2014-02-23 Thread Dr. Vitaly Chaban
Are you sure that your binary is parallel?

How many frames do those trajectory files contain?
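
One quick way to check, assuming the standalone 4.6-era tools are on your
path: gmxcheck reports the frame count and time range of a file.

    gmxcheck -f prt.trr    # frames, start time and timestep of the trajectory
    gmxcheck -e prt.edr    # the same for the energy file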

Dr. Vitaly V. Chaban


On Sun, Feb 23, 2014 at 5:32 PM, Marcelo Depólo marcelodep...@gmail.com wrote:
 Maybe I should explain it better.

 I am using *mpirun -np 24 mdrun -s prt.tpr -e prt.edr -o prt.trr*, pretty
 much a standard line. This job in a batch queue creates the outputs and,
 after some (random) time, a backup is made and new files are written, but
 the job itself does not finish.




Re: [gmx-users] Long trajectory split

2014-02-23 Thread Marcelo Depólo
Pretty sure. I ran other simulations on the same system and they worked
just fine.

About the frames, each file contains a different number of frames,
apparently at random (one file contains 400 ns of data, another only 10 ns).


-- 
Marcelo Depólo Polêto
Uppsala Universitet - Sweden
Science without Borders - CAPES
Phone: +46 76 581 67 49


Re: [gmx-users] Long trajectory split

2014-02-23 Thread Justin Lemkul



On 2/23/14, 11:32 AM, Marcelo Depólo wrote:

Maybe I should explain it better.

I am using *mpirun -np 24 mdrun -s prt.tpr -e prt.edr -o prt.trr*, pretty
much a standard line. This job in a batch queue creates the outputs and,
after some (random) time, a backup is made and new files are written, but
the job itself does not finish.



It would help if you could post the .log file from one of the runs so we 
can see the information regarding mdrun's parallel capabilities.  This 
still sounds like a case of an incorrectly compiled binary.  Do other runs 
with the same binary produce the same problem?


-Justin



--
==

Justin A. Lemkul, Ph.D.
Ruth L. Kirschstein NRSA Postdoctoral Fellow

Department of Pharmaceutical Sciences
School of Pharmacy
Health Sciences Facility II, Room 601
University of Maryland, Baltimore
20 Penn St.
Baltimore, MD 21201

jalem...@outerbanks.umaryland.edu | (410) 706-7441
http://mackerell.umaryland.edu/~jalemkul

==


Re: [gmx-users] Long trajectory split

2014-02-23 Thread Mark Abraham
Normally an MPI-enabled mdrun would be named mdrun_mpi, and running a
non-MPI mdrun would produce symptoms like yours, depending on exactly how
your filesystem chooses to do things, so Justin and Vitaly's theory is
sound. Look at the top section of your .log file for what mdrun thinks
about MPI!
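
For example (a sketch; the log name follows the thread's command line):

    grep -E 'Gromacs version|MPI library' prt.log
    # an MPI build reports "MPI library: MPI"; a serial build does not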

Mark


On Sun, Feb 23, 2014 at 5:32 PM, Marcelo Depólo marcelodep...@gmail.com
wrote:

 Maybe I should explain it better.

 I am using *mpirun -np 24 mdrun -s prt.tpr -e prt.edr -o prt.trr*, pretty
 much a standard line. This job in a batch queue creates the outputs and,
 after some (random) time, a backup is made and new files are written, but
 the job itself does not finish.




Re: [gmx-users] Long trajectory split

2014-02-23 Thread Justin Lemkul



On 2/23/14, 12:10 PM, Marcelo Depólo wrote:

Pretty sure. I ran other simulations on the same system and they worked
just fine.

About the frames, each file contains a different number of frames,
apparently at random (one file contains 400 ns of data, another only 10 ns).



What are the starting and ending points of those data?  Is the run re-starting 
or just writing successive time intervals to new files when it shouldn't be?  Do 
you have some limitation on file size that is being reached, causing the new 
files to be generated?
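
Two quick checks for such a cap (a sketch; the backup file name shown is
hypothetical, following GROMACS backup naming):

    ulimit -f                      # per-process file-size limit; "unlimited" if none
    ls -l prt.trr '#prt.trr.1#'    # do the pieces all stop near the same size?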


-Justin



--
==

Justin A. Lemkul, Ph.D.
Ruth L. Kirschstein NRSA Postdoctoral Fellow

Department of Pharmaceutical Sciences
School of Pharmacy
Health Sciences Facility II, Room 601
University of Maryland, Baltimore
20 Penn St.
Baltimore, MD 21201

jalem...@outerbanks.umaryland.edu | (410) 706-7441
http://mackerell.umaryland.edu/~jalemkul

==


Re: [gmx-users] Long trajectory split

2014-02-23 Thread Marcelo Depólo
 Justin, as far as I can tell, the next log file starts at 0 ps, which
would mean that it is re-starting for some reason. At first, I imagined
that it was only splitting the data among files due to some kind of size
limit, as you said, but when I tried to concatenate the trajectories, it
gave me nonsensical output, with a lot of 'beginnings'.

I will check with the cluster experts whether there is some kind of size
limit. It seems to be the most logical source of the problem to me.

Mark, the only difference this time is the time scale set from the
beginning. Apart from the protein itself, even the .mdp files were copied
from a successful folder.

But thank you both for the support.
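
(For reference, a sketch of how such pieces can be inspected and
concatenated with the 4.6-era standalone tools; the piece names are
assumed, following GROMACS backup naming:)

    gmxcheck -f prt.trr    # report the time range of each piece first
    trjcat -f '#prt.trr.1#' '#prt.trr.2#' prt.trr -o full.trr -settime
    # -settime lets you re-assign start times interactively, which is
    # needed when every piece (re)starts at t = 0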


2014-02-23 20:20 GMT+01:00 Mark Abraham mark.j.abra...@gmail.com:

 On Sun, Feb 23, 2014 at 6:48 PM, Marcelo Depólo marcelodep...@gmail.com
 wrote:

  Justin, the other runs with the very same binary do not produce the same
  problem.

  Mark, I just omitted the _mpi from the line here, but it was compiled as
  _mpi.
 

 OK, that rules that problem out, but please don't simplify and
 approximate. Computers are exact, and troubleshooting problems with them
 requires all the information. If we all understood perfectly we wouldn't
 be having problems ;-)

 Those files do get closed at checkpoint intervals, so they can be hashed
 for the hash value to be saved in the checkpoint. It is conceivable some
 file system would not close-and-re-open them properly. The .log files would
 comment about at least some such conditions.

 But the real question is what you are doing differently from the times when
 you have observed normal behaviour!

 Mark
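
 (For reference: a deliberate continuation of a 4.6-series run would
 normally be launched from its checkpoint along these lines, assuming the
 checkpoint file is named state.cpt; a fresh start at 0 ps suggests that
 is not what is happening here:

     mpirun -np 24 mdrun_mpi -s prt.tpr -cpi state.cpt -append
 )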


  My log file top:

  Gromacs version:    VERSION 4.6.1
  Precision:          single
  Memory model:       64 bit
  MPI library:        MPI
  OpenMP support:     disabled
  GPU support:        disabled
  invsqrt routine:    gmx_software_invsqrt(x)
  CPU acceleration:   SSE4.1
  FFT library:        fftw-3.3.2-sse2
  Large file support: enabled
  RDTSCP usage:       enabled
  Built on:           Sex Nov 29 16:08:45 BRST 2013
  Built by:           root@jupiter [CMAKE]
  Build OS/arch:      Linux 2.6.32.13-0.4-default x86_64
  Build CPU vendor:   GenuineIntel
  Build CPU brand:    Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
  Build CPU family:   6   Model: 44   Stepping: 2
  Build CPU features: apic clfsh cmov cx8 cx16 htt lahf_lm mmx msr
                      nonstop_tsc pcid pdcm pdpe1gb popcnt pse rdtscp
                      sse2 sse3 sse4.1 sse4.2 ssse3
  (...)

  Initializing Domain Decomposition on 24 nodes
  Dynamic load balancing: auto
  Will sort the charge groups at every domain (re)decomposition
  Initial maximum inter charge-group distances:
      two-body bonded interactions: 0.621 nm, LJ-14, atoms 3801 3812
      multi-body bonded interactions: 0.621 nm, G96Angle, atoms 3802 3812
  Minimum cell size due to bonded interactions: 0.683 nm
  Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.820 nm
  Estimated maximum distance required for P-LINCS: 0.820 nm
  This distance will limit the DD cell size, you can override this with -rcon
  Guess for relative PME load: 0.26
  Will use 18 particle-particle and 6 PME only nodes
  This is a guess, check the performance at the end of the log file
  Using 6 separate PME nodes
  Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
  Optimizing the DD grid for 18 cells with a minimum initial size of 1.025 nm
  The maximum allowed number of cells is: X 8 Y 8 Z 8
  Domain decomposition grid 3 x 2 x 3, separate PME nodes 6
  PME domain decomposition: 3 x 2 x 1
  Interleaving PP and PME nodes
  This is a particle-particle only node
  Domain decomposition nodeid 0, coordinates 0 0 0
 
 
 