Re: [gmx-users] What is the most reliable way to run repeats for reproducibility?
On 1/10/18 9:59 AM, ZHANG Cheng wrote:
> Hi Mark,
>
> Thank you very much.
>
> 1) For the link you provide, I don't think I can control most of the computer resources, as I submit my jobs to our cluster, and the jobs are distributed to different available cores randomly.
>
> 2) For the "random seed" of the velocities, I found this option in the tutorial below and enabled it: gen_vel = yes
> http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin/gmx-tutorials/lysozyme/06_equil.html
>
> So does it mean that it is better to use the same em.tpr and run different NVT, NPT, etc. for different repeats, so as to initialise each repeat with different velocities?

This is common practice. Minimize the system once (there's usually no variation here; the coordinates always move downhill on the potential energy gradient) and then initiate however many simulations you want with different starting velocities (requiring different gen_seed values). This generates independent simulations.

> 3) At which step is the "natural chaotic divergence during equilibration" reflected? The link says: "The Central Limit Theorem tells us that in the case of infinitely long simulation all observables converge to their equilibrium values". But I think this "equilibrium" is not practical for a protein in MD. For example, if I am running a protein at 370K, ultimately it will unfold, like boiling an egg in water, which takes 10 min. But in MD, the time scale is much shorter, i.e. usually a few hundred ns. We could "never" see the protein converge within that short period. So my understanding of "equilibrium" is the equilibration of temperature/pressure/density, but not of the protein itself. Is that correct?

Yes, quantities like temperature and pressure converge relatively quickly, but the dynamics of the system tend to take much longer, by orders of magnitude.

-Justin

--
==
Justin A. Lemkul, Ph.D.
Assistant Professor
Virginia Tech Department of Biochemistry
303 Engel Hall
340 West Campus Dr.
Blacksburg, VA 24061 jalem...@vt.edu | (540) 231-3129 http://www.biochem.vt.edu/people/faculty/JustinLemkul.html == -- Gromacs Users mailing list * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting! * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists * For (un)subscribe requests visit https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-requ...@gromacs.org.
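A minimal sketch of the approach described in the reply above: one minimized structure, several repeats, each with its own gen_seed. The template is a deliberately incomplete, hypothetical NVT fragment (a real nvt.mdp needs thermostat, cutoff and constraint settings too), and the directory layout, the function name write_replica_mdps and the seed values are assumptions made for illustration. Only gen_vel, gen_temp and gen_seed are the options the thread is about.

```python
from pathlib import Path

# Hypothetical, incomplete NVT fragment. Only the gen_* lines matter here;
# the other values are placeholders, not recommendations.
NVT_TEMPLATE = """\
integrator = md
dt = 0.002
nsteps = 50000
gen_vel = yes
gen_temp = 300
gen_seed = {seed}
"""


def write_replica_mdps(out_dir, seeds):
    """Write one nvt.mdp per repeat, each with a different gen_seed.

    Returns the list of written paths (repeat_1/nvt.mdp, repeat_2/nvt.mdp, ...).
    """
    out_dir = Path(out_dir)
    paths = []
    for index, seed in enumerate(seeds, start=1):
        repeat_dir = out_dir / f"repeat_{index}"
        repeat_dir.mkdir(parents=True, exist_ok=True)
        mdp_path = repeat_dir / "nvt.mdp"
        mdp_path.write_text(NVT_TEMPLATE.format(seed=seed))
        paths.append(mdp_path)
    return paths
```

Calling write_replica_mdps("replicas", [101, 202, 303]) would create three directories whose nvt.mdp files differ only in gen_seed. Setting gen_seed = -1 instead lets GROMACS pick a pseudo-random seed; writing explicit seeds simply makes it easier to record which seed each repeat used.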
Re: [gmx-users] What is the most reliable way to run repeats for reproducibility?
Hi Mark,

Thank you very much.

1) For the link you provide, I don't think I can control most of the computer resources, as I submit my jobs to our cluster, and the jobs are distributed to different available cores randomly.

2) For the "random seed" of the velocities, I found this option in the tutorial below and enabled it: gen_vel = yes
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin/gmx-tutorials/lysozyme/06_equil.html

So does it mean that it is better to use the same em.tpr and run different NVT, NPT, etc. for different repeats, so as to initialise each repeat with different velocities?

3) At which step is the "natural chaotic divergence during equilibration" reflected? The link says: "The Central Limit Theorem tells us that in the case of infinitely long simulation all observables converge to their equilibrium values". But I think this "equilibrium" is not practical for a protein in MD. For example, if I am running a protein at 370K, ultimately it will unfold, like boiling an egg in water, which takes 10 min. But in MD, the time scale is much shorter, i.e. usually a few hundred ns. We could "never" see the protein converge within that short period. So my understanding of "equilibrium" is the equilibration of temperature/pressure/density, but not of the protein itself. Is that correct?

http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin/gmx-tutorials/lysozyme/06_equil.html
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin/gmx-tutorials/lysozyme/07_equil2.html

Yours sincerely
Cheng

-- Original --
From: "ZHANG Cheng" <272699...@qq.com>
Date: Wed, Jan 10, 2018 09:11 PM
To: "gromacs.org_gmx-users"
Subject: What is the most reliable way to run repeats for reproducibility?

Dear Gromacs,

I can think of different ways of running repeats, after reading Justin's lysozyme tutorial.

The 1st way: all repeats start from the same em.tpr after energy minimization (EM), and each repeat uses it individually for the subsequent steps (NVT, NPT and production MD):
1) repeat 1: same em.tpr → NVT → NPT → md_0_1.tpr → production MD
2) repeat 2: same em.tpr → NVT → NPT → md_0_1.tpr → production MD
3) repeat 3: same em.tpr → NVT → NPT → md_0_1.tpr → production MD
..

The 2nd way: all repeats start from the same md_0_1.tpr and use it for different production MD runs:
1) repeat 1: same em.tpr → same NVT → same NPT → same md_0_1.tpr → production MD
2) repeat 2: same md_0_1.tpr → production MD
3) repeat 3: same md_0_1.tpr → production MD
..

The 3rd way: all repeats start from the same checkpoint file within the production run and use it for the rest of the production MD:
1) repeat 1: same em.tpr → same NVT → same NPT → same md_0_1.tpr → same production MD for 50 ns → same .cpt file → production MD for another 200 ns
2) repeat 2: same .cpt file → production MD for another 200 ns
3) repeat 3: same .cpt file → production MD for another 200 ns
..

Of course, the 3rd way is easier. But does it mean it may not cover enough conformations, as the repeats tend to resemble each other more than in the 1st approach? Is there a standard way to handle the repeats?

Thank you.

Yours sincerely
Cheng
Re: [gmx-users] What is the most reliable way to run repeats for reproducibility?
Hi,

See http://www.gromacs.org/Documentation/Terminology/Reproducibility. Some people think they want reproducibility of a trajectory, which is generally not needed, and not consistent with highly efficient sampling (particularly with GPUs involved). But what you actually want is not reproducibility. The usual approach is to change the random seed used when you first generate velocities, and rely on the natural chaotic divergence during equilibration to lead to independent sampling in the replicas.

Mark

On Wed, Jan 10, 2018 at 2:12 PM ZHANG Cheng <272699...@qq.com> wrote:

> Dear Gromacs,
>
> I can think of different ways of running repeats, after reading Justin's lysozyme tutorial.
>
> The 1st way: all repeats start from the same em.tpr after energy minimization (EM), and each repeat uses it individually for the subsequent steps (NVT, NPT and production MD):
> 1) repeat 1: same em.tpr → NVT → NPT → md_0_1.tpr → production MD
> 2) repeat 2: same em.tpr → NVT → NPT → md_0_1.tpr → production MD
> 3) repeat 3: same em.tpr → NVT → NPT → md_0_1.tpr → production MD
> ..
>
> The 2nd way: all repeats start from the same md_0_1.tpr and use it for different production MD runs:
> 1) repeat 1: same em.tpr → same NVT → same NPT → same md_0_1.tpr → production MD
> 2) repeat 2: same md_0_1.tpr → production MD
> 3) repeat 3: same md_0_1.tpr → production MD
> ..
>
> The 3rd way: all repeats start from the same checkpoint file within the production run and use it for the rest of the production MD:
> 1) repeat 1: same em.tpr → same NVT → same NPT → same md_0_1.tpr → same production MD for 50 ns → same .cpt file → production MD for another 200 ns
> 2) repeat 2: same .cpt file → production MD for another 200 ns
> 3) repeat 3: same .cpt file → production MD for another 200 ns
> ..
>
> Of course, the 3rd way is easier. But does it mean it may not cover enough conformations, as the repeats tend to resemble each other more than in the 1st approach? Is there a standard way to handle the repeats?
>
> Thank you.
>
> Yours sincerely
> Cheng
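The "natural chaotic divergence" mentioned in the reply above can be illustrated with a toy system. The sketch below is not MD and not GROMACS: it uses the logistic map, a standard one-dimensional chaotic system, purely as an analogy, and the starting values and step counts are arbitrary choices. Two trajectories that start almost identically stay close for a few steps and then become unrelated, which is the behaviour that makes replicas started with different velocities sample independently.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two "replicas" whose starting points differ by only 1e-10.
a = logistic_trajectory(0.2, 60)
b = logistic_trajectory(0.2 + 1e-10, 60)

# Absolute gap between the two trajectories at every step.
gaps = [abs(x - y) for x, y in zip(a, b)]

print(f"gap after 1 step: {gaps[1]:.2e}")
print(f"largest gap in steps 40-60: {max(gaps[40:]):.2f}")
```

In an MD run the perturbation comes from the different initial velocities rather than from a shifted starting coordinate, but the qualitative picture is the same: the small initial difference grows until the replicas no longer track each other.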
[gmx-users] What is the most reliable way to run repeats for reproducibility?
Dear Gromacs,

I can think of different ways of running repeats, after reading Justin's lysozyme tutorial.

The 1st way: all repeats start from the same em.tpr after energy minimization (EM), and each repeat uses it individually for the subsequent steps (NVT, NPT and production MD):
1) repeat 1: same em.tpr → NVT → NPT → md_0_1.tpr → production MD
2) repeat 2: same em.tpr → NVT → NPT → md_0_1.tpr → production MD
3) repeat 3: same em.tpr → NVT → NPT → md_0_1.tpr → production MD
..

The 2nd way: all repeats start from the same md_0_1.tpr and use it for different production MD runs:
1) repeat 1: same em.tpr → same NVT → same NPT → same md_0_1.tpr → production MD
2) repeat 2: same md_0_1.tpr → production MD
3) repeat 3: same md_0_1.tpr → production MD
..

The 3rd way: all repeats start from the same checkpoint file within the production run and use it for the rest of the production MD:
1) repeat 1: same em.tpr → same NVT → same NPT → same md_0_1.tpr → same production MD for 50 ns → same .cpt file → production MD for another 200 ns
2) repeat 2: same .cpt file → production MD for another 200 ns
3) repeat 3: same .cpt file → production MD for another 200 ns
..

Of course, the 3rd way is easier. But does it mean it may not cover enough conformations, as the repeats tend to resemble each other more than in the 1st approach? Is there a standard way to handle the repeats?

Thank you.

Yours sincerely
Cheng
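The "1st way" above could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested protocol: it assumes the minimized coordinates were written as em.gro and the topology as topol.top one directory above each repeat directory, that every repeat directory already holds its own nvt.mdp, npt.mdp and md.mdp, and that the commands are run from inside that directory. The function name replica_commands is hypothetical, and additional grompp options that a real setup may need (for example reference coordinates for position restraints) are left out. The function only builds the command lines; it does not run GROMACS.

```python
import shlex


def replica_commands(start="../em.gro", top="../topol.top"):
    """Build the grompp/mdrun command lines for one repeat (EM -> NVT -> NPT -> MD).

    The returned commands are meant to be run inside that repeat's directory.
    """
    # (mdp file, input coordinates, checkpoint to continue from, output name)
    stages = [
        ("nvt.mdp", start, None, "nvt"),           # velocities generated here
        ("npt.mdp", "nvt.gro", "nvt.cpt", "npt"),  # continues from NVT
        ("md.mdp", "npt.gro", "npt.cpt", "md_0_1"),  # production run
    ]
    commands = []
    for mdp, coords, checkpoint, name in stages:
        grompp = ["gmx", "grompp", "-f", mdp, "-c", coords,
                  "-p", top, "-o", f"{name}.tpr"]
        if checkpoint is not None:
            # -t passes the previous stage's checkpoint so the state is continued
            grompp += ["-t", checkpoint]
        commands.append(grompp)
        commands.append(["gmx", "mdrun", "-deffnm", name])
    return commands


for command in replica_commands():
    print(shlex.join(command))
```

Because only the NVT stage generates velocities, giving each repeat's nvt.mdp a different gen_seed is enough to separate the repeats; the NPT and production stages then continue from that repeat's own checkpoint.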