Re: [gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

2015-01-22 Thread Mark Abraham
On Thu, Jan 22, 2015 at 8:36 PM, Mark Abraham mark.j.abra...@gmail.com
wrote:



 On Thu, Jan 22, 2015 at 8:34 PM, Mark Abraham mark.j.abra...@gmail.com
 wrote:



 On Thu, Jan 22, 2015 at 8:09 PM, Jiaqi Lin jq...@mit.edu wrote:

 Hi Mark

 Thank you for your patient reply.

 I've tried using the exact same setup (rcoulomb = 3.258, fourierspacing = 0.325,
 rlist = 1.4, rlistlong = 3.258, rvdw = 1.2) that the auto-tuning would have
 changed the parameters to, and still it won't reproduce the result obtained by
 auto-tuning, just the result I tried previously using similar parameters.
 Therefore, I believe either there is something going on in the auto-tuning code
 that is not what it appears to be, or the auto-tuning changes the parameters in
 a special way that in theory shouldn't affect the simulation result, but in my
 case the system is very sensitive to these changes. As you pointed out, the
 code is not straightforward, so much can happen at this stage. I'd very much
 like to figure out what's going on, but spending two months on the code is not
 something I'm willing to do.


 One test you can do that is straightforward is to do an auto-tuning run
 and write the final output whenever the tuning completes.


 Actually, get that frame and another one later. I don't remember whether
 the e.g. step-1000 energy output is with the last-tried or
 newly-decided-as-optimal PME regime.


And use nstfout to get the full-precision forces in both frames for both
runs, too. A problem could lie in either part, in principle.
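(Concretely, that just means something like the following in the .mdp of both
runs, as a sketch only; the interval here is a placeholder and should be chosen
so that a frame is written right after the tuning finishes:

   nstxout = 500    ; full-precision coordinates to the .trr
   nstvout = 500    ; velocities
   nstfout = 500    ; forces, so the frames of the two runs can be compared directly
)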

Mark


 That trajectory should have only one frame in it. Then do a
 http://www.gromacs.org/Documentation/How-tos/Single-Point_Energy
 calculation on that trajectory file, using a .tpr that should match the
 auto-tuning result. If that gave different energy components, then I think
 that would be good evidence of a code bug. (Either way, as I've already
 said, you should just go and use the Verlet scheme for running this system
 on that amount of hardware...)


 So I think I will keep changing parameters and try to understand the long-range
 electrostatics in my system. Also, the simulation produced by auto-tuning is
 unlikely to be an outlier. I kept changing -npme, and whenever the auto-tuning
 gives me an rcoulomb larger than 3, I can always observe the interesting result.


 That is suggestive of a bug. If you can also reproduce it from runs that
 have different initial velocities, then that would be quite strong evidence.

 I also tried LJPME, as you suggested, for the long-range vdW; it compresses the
 bilayer to a large extent.


 What force field was that?


 The whole force field would have to be refined if I use LJPME, but that's
 another story. I think electrostatics is the main driving force in my
 simulation system, and I'll keep trying to figure out what really happened.


 Hmm, that sounds like your model physics might be over-parameterized on
 an invalid van der Waals approximation. I'd give serious consideration to
 reviewing its performance on other problems quite carefully. Using it only
 in a regime and/or software package where it is thought to be valid would
 be the standard advice.

 Anyway, there is no way that
 * if the model physics is correct and not over-tuned to the exact regime
 and software that produced it, and
 * if the GROMACS code is correct,
 that the simulation behaviour could depend in a statistically significant
 way on the tuning or -npme choice. Those values do not enter the model
 physics.

 Just changing electrostatics regimes arbitrarily and cherry-picking some
 of the results to analyse is going to get questions raised about your work
 - either the model physics is not robust enough to use on your system, or
 the software is wrong, or you're ignoring valid behaviour because something
 else looks more interesting.

 Mark


 Best
 Jiaqi


 On 1/20/2015 7:18 PM, Mark Abraham wrote:

 On Tue, Jan 20, 2015 at 11:48 PM, Jiaqi Lin jq...@mit.edu wrote:

  Hi Szilard,

 - I've tried 5.0.1 and it gives the same result. So 4.6.7 or 5.0.4 is
 better, but in what way?

 - I've tried the Verlet scheme and it gives only a small change of cutoff and
 grid.
 But what I'm really interested in is manually reproducing the result that
 tune_pme gave me in the first case using the Group scheme.

  If that's the only situation you can observe it, it could be an
 outlier or
 your method could be unsuitable...


 - I've also tried lincs-order=4 and it makes no difference.

 - My Fourier spacing is 25% finer, but that shouldn't affect the
 results,
 right? If it does affect the results, then I want to find out how.

  It affects your results because you could do some more sampling with
 the
 computer time you are spending on PME at the moment. Where to choose
 your
 balance of model quality vs amount of sampling is poorly understood, and
 problem-dependent.

 - I happen to use the PP/PME rank split (100+28) and it gives me

 interesting results (the speed is actually not bad). Then I'm
 very interested in how these cutoff and 

Re: [gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

2015-01-22 Thread Mark Abraham
On Thu, Jan 22, 2015 at 10:08 PM, Jiaqi Lin jq...@mit.edu wrote:

 Hi Mark,

 I've reproduced runs that have different initial velocities and also
 different setups, e.g. different kinds of NPs.

 The force field is the MARTINI coarse-grained force field with polarizable
 water.


OK, that's great - but are you starting from the nonbonded regime that was
used for its parametrization? If not, why not?


 The compression of the lipid bilayer might be because of the particular
 simulation setup in my system, e.g. the ion imbalance. I believe the force field
 itself has no problems with the van der Waals approximation. I can do a
 simple test and see.

 I didn't set the electrostatic regime arbitrarily. The auto-tuning did it for me
 (it is on by default in 4.6.5),


It didn't set it arbitrarily either, it preserved the quality of the
approximation to the full electrostatics treatment (assuming no bug).


 and I didn't know it until I found out that the system behavior depends strongly
 on rcoulomb, and I want to know how and why.


Sure - but it's not just rcoulomb, as you know.


 That's why I'm trying to control these parameters and reproduce the result
 as a starting point. I am not cherry-picking the result; currently I'm
 withholding the results from publication until I find out what's really going on.


Great


 Whether I have to change the model system or use a totally new one,
 I'll do it accordingly. But first of all I have to make sure it is not a
 problem in the code. Moreover, there are other ways (e.g. changing some
 initial conditions of the system) to observe the interesting result in a
 normal simulation setup (-notunepme, Verlet).


OK, that's good to know - that suggests there's not a problem, and that
different initial conditions might just do different things.

That means I have to redo all the simulations, and that is a huge amount of
 work.

 I'll try to do the test as you suggested and see if it is a bug.


Thanks! You'd need to do the force comparisons with gmxcheck, perhaps after
creative use of trjconv so that you can compare the frames that should be
comparable.
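Something along these lines should work (only a sketch; the file names are
placeholders, and in 5.x the tools are gmx trjconv / gmx check):

   trjconv -f run_tuned.trr -b 20 -e 20 -o frame_tuned.trr
   trjconv -f run_manual.trr -b 20 -e 20 -o frame_manual.trr
   gmxcheck -f frame_tuned.trr -f2 frame_manual.trr

i.e. cut out the frames that should correspond to the same point in each run,
then let gmxcheck report the differences in coordinates, velocities and forces
between them.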

Mark


 Thank you
 Jiaqi


 On 1/22/2015 2:34 PM, Mark Abraham wrote:

 On Thu, Jan 22, 2015 at 8:09 PM, Jiaqi Lin jq...@mit.edu wrote:

  Hi Mark

 Thank you for your patient reply.

 I've tried using the exact same setup (rcoulomb = 3.258, fourierspacing = 0.325,
 rlist = 1.4, rlistlong = 3.258, rvdw = 1.2) that the auto-tuning would have
 changed the parameters to, and still it won't reproduce the result obtained by
 auto-tuning, just the result I tried previously using similar parameters.
 Therefore, I believe either there is something going on in the auto-tuning code
 that is not what it appears to be, or the auto-tuning changes the parameters in
 a special way that in theory shouldn't affect the simulation result, but in my
 case the system is very sensitive to these changes. As you pointed out, the
 code is not straightforward, so much can happen at this stage. I'd very much
 like to figure out what's going on, but spending two months on the code is not
 something I'm willing to do.


 One test you can do that is straightforward is to do an auto-tuning run
 and
 write the final output whenever the tuning completes. That trajectory
 should have only one frame in it. Then do a
 http://www.gromacs.org/Documentation/How-tos/Single-Point_Energy
 calculation on that trajectory file, using a .tpr that should match the
 auto-tuning result. If that gave different energy components, then I think
 that would be good evidence of a code bug. (Either way, as I've already
 said, you should just go and use the Verlet scheme for running this system
 on that amount of hardware...)


 So I think I will keep changing parameters and try to understand the long-range
 electrostatics in my system. Also, the simulation produced by auto-tuning is
 unlikely to be an outlier. I kept changing -npme, and whenever the auto-tuning
 gives me an rcoulomb larger than 3, I can always observe the interesting result.

  That is suggestive of a bug. If you can also reproduce it from runs that
 have different initial velocities, then that would be quite strong
 evidence.

 I also tried LJPME, as you suggested, for the long-range vdW; it compresses the
 bilayer to a large extent.


 What force field was that?


 The whole force field would have to be refined if I use LJPME, but that's
 another story. I think electrostatics is the main driving force in my
 simulation system, and I'll keep trying to figure out what really
 happened.

  Hmm, that sounds like your model physics might be over-parameterized on
 an
 invalid van der Waals approximation. I'd give serious consideration to
 reviewing its performance on other problems quite carefully. Using it only
 in a regime and/or software package where it is thought to be valid would
 be the standard advice.

 Anyway, there is no way that
 * if the model physics is correct and not over-tuned to the exact regime
 and software that produced it, and
 * if the GROMACS code is correct,
 

Re: [gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

2015-01-22 Thread Jiaqi Lin

Hi Mark

Thank you for your patient reply.

I've tried using the exact same setup (rcoulomb = 3.258, fourierspacing = 0.325, 
rlist = 1.4, rlistlong = 3.258, rvdw = 1.2) that the auto-tuning would have 
changed the parameters to, and still it won't reproduce the result obtained by 
auto-tuning, just the result I tried previously using similar parameters. 
Therefore, I believe either there is something going on in the auto-tuning code 
that is not what it appears to be, or the auto-tuning changes the parameters in 
a special way that in theory shouldn't affect the simulation result, but in my 
case the system is very sensitive to these changes. As you pointed out, the 
code is not straightforward, so much can happen at this stage. I'd very much 
like to figure out what's going on, but spending two months on the code is 
not something I'm willing to do. So I think I will keep changing 
parameters and try to understand the long-range electrostatics in my system. 
Also, the simulation produced by auto-tuning is unlikely to be an outlier. 
I kept changing -npme, and whenever the auto-tuning gives me an rcoulomb larger 
than 3, I can always observe the interesting result.


I also tried LJPME, as you suggested, for the long-range vdW; it compresses the 
bilayer to a large extent. The whole force field would have to be refined if I 
use LJPME, but that's another story. I think electrostatics is the main 
driving force in my simulation system, and I'll keep trying to figure 
out what really happened.


Best
Jiaqi

On 1/20/2015 7:18 PM, Mark Abraham wrote:

On Tue, Jan 20, 2015 at 11:48 PM, Jiaqi Lin jq...@mit.edu wrote:


Hi Szilard,

- I've tried 5.0.1 and it gives the same result. So 4.6.7 or 5.0.4 is
better, but in what way?

- I've tried the Verlet scheme and it gives only a small change of cutoff and grid.
But what I'm really interested in is manually reproducing the result that
tune_pme gave me in the first case using the Group scheme.


If that's the only situation you can observe it, it could be an outlier or
your method could be unsuitable...



- I've also tried lincs-order=4 and it makes no difference.

- My Fourier spacing is 25% finer, but that shouldn't affect the results,
right? If it does affect the results, then I want to find out how.


It affects your results because you could do some more sampling with the
computer time you are spending on PME at the moment. Where to choose your
balance of model quality vs amount of sampling is poorly understood, and
problem-dependent.

- I happen to use the PP/PME rank split (100+28) and it gives me

interesting results (the speed is actually not bad). Then I'm
very interested in how these cutoff and grid settings can affect my
simulation results.


If the implementation and model is right, then they have no effect. That's
why we auto-tune with them. You're going to need a lot of replicates to
show that -npme 28 gives a statistically different result from a different
value, and you won't yet have established that it's somehow a more valid
observation to pursue.

So I tried to control the parameters manually (turning off tune_PME). But no

matter what I tried, I cannot reproduce the result given by tune_pme. So my
biggest question is, how is tune_PME implemented in the code? What
parameters does it actually tune?


Like it says, it varies rcoulomb and the Fourier grid, keeping rvdw and
beta fixed. Details of how rlist and rlistlong behave are a bit messy, but
nothing you asked for is ever missed out.
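A rough sanity check against the numbers in your own md.log (values taken from
the tuning output you posted):

   initial:  coulomb cutoff 1.400, pme grid 280 280 384
   tuned:    coulomb cutoff 3.253, pme grid 112 112 160
   3.253 / 1.400 = 2.32    280 / 112 = 2.5    384 / 160 = 2.4

so the cut-off and the grid spacing are scaled up by a similar factor while
beta stays fixed, which is what keeps the overall quality of the electrostatics
approximation roughly unchanged.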



- When PME tuning increases the cutoff to such a large value, the speed does not
go down noticeably. So what I suspect is that tune_PME


Please be careful with names. gmx tune_pme is a different thing from the
mdrun auto-tuning.



does the direct space calculation without changing the neighbor list
search distance.


Your auto-tuned run number 1 had rlist = rcoulomb at the start, so mdrun
knows you wanted a PME model with an unbuffered list whose size equals
rcoulomb, and a buffered VDW model with rlist 1.4 and rvdw 1.2. Thus,
rlistlong will stay equal to rcoulomb as it changes. The details and code
are horrible, though, and I am looking forward to nothing so much as
ripping it all out in about 2 months!

And like Szilard suggested, your runs are probably a long way from maximum
throughput. Aim for lots of sampling, don't chase replicating rare events
with brute-force simulation!

Mark

Thank you

Best
Jiaqi


On 1/20/2015 3:54 PM, Szilárd Páll wrote:


Not (all) directly related, but a few comments/questions:

- Have you tried 4.6.7 or 5.0.4?
- Have you considered using the Verlet scheme instead of doing manual
buffering?
- lincs-order=8 is very large for 2fs production runs - typically 4 is
used.
- Your fourier spacing is a lot (~25%) finer than it needs to be.

- The PP/PME rank split of 100+28 is _very_ inconvenient and it is the
main cause of the horrible PME performance together with the overly
coarse grid. That's why you get such a huge cut-off after the PP-PME
load balancing. Even if you want to stick to these 

Re: [gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

2015-01-22 Thread Mark Abraham
On Thu, Jan 22, 2015 at 8:34 PM, Mark Abraham mark.j.abra...@gmail.com
wrote:



 On Thu, Jan 22, 2015 at 8:09 PM, Jiaqi Lin jq...@mit.edu wrote:

 Hi Mark

 Thank you for your patient reply.

 I've tried using the exact same setup (rcoulomb = 3.258, fourierspacing = 0.325,
 rlist = 1.4, rlistlong = 3.258, rvdw = 1.2) that the auto-tuning would have
 changed the parameters to, and still it won't reproduce the result obtained by
 auto-tuning, just the result I tried previously using similar parameters.
 Therefore, I believe either there is something going on in the auto-tuning code
 that is not what it appears to be, or the auto-tuning changes the parameters in
 a special way that in theory shouldn't affect the simulation result, but in my
 case the system is very sensitive to these changes. As you pointed out, the
 code is not straightforward, so much can happen at this stage. I'd very much
 like to figure out what's going on, but spending two months on the code is not
 something I'm willing to do.


 One test you can do that is straightforward is to do an auto-tuning run
 and write the final output whenever the tuning completes.


Actually, get that frame and another one later. I don't remember whether
the e.g. step-1000 energy output is with the last-tried or
newly-decided-as-optimal PME regime.


 That trajectory should have only one frame in it. Then do a
 http://www.gromacs.org/Documentation/How-tos/Single-Point_Energy
 calculation on that trajectory file, using a .tpr that should match the
 auto-tuning result. If that gave different energy components, then I think
 that would be good evidence of a code bug. (Either way, as I've already
 said, you should just go and use the Verlet scheme for running this system
 on that amount of hardware...)


 So I think I will keep changing parameters and try to understand the long-range
 electrostatics in my system. Also, the simulation produced by auto-tuning is
 unlikely to be an outlier. I kept changing -npme, and whenever the auto-tuning
 gives me an rcoulomb larger than 3, I can always observe the interesting result.


 That is suggestive of a bug. If you can also reproduce it from runs that
 have different initial velocities, then that would be quite strong evidence.

 I also tried LJPME, as you suggested, for the long-range vdW; it compresses the
 bilayer to a large extent.


 What force field was that?


 The whole force field would have to be refined if I use LJPME, but that's
 another story. I think electrostatics is the main driving force in my
 simulation system, and I'll keep trying to figure out what really happened.


 Hmm, that sounds like your model physics might be over-parameterized on an
 invalid van der Waals approximation. I'd give serious consideration to
 reviewing its performance on other problems quite carefully. Using it only
 in a regime and/or software package where it is thought to be valid would
 be the standard advice.

 Anyway, there is no way that
 * if the model physics is correct and not over-tuned to the exact regime
 and software that produced it, and
 * if the GROMACS code is correct,
 that the simulation behaviour could depend in a statistically significant
 way on the tuning or -npme choice. Those values do not enter the model
 physics.

 Just changing electrostatics regimes arbitrarily and cherry-picking some
 of the results to analyse is going to get questions raised about your work
 - either the model physics is not robust enough to use on your system, or
 the software is wrong, or you're ignoring valid behaviour because something
 else looks more interesting.

 Mark


 Best
 Jiaqi


 On 1/20/2015 7:18 PM, Mark Abraham wrote:

 On Tue, Jan 20, 2015 at 11:48 PM, Jiaqi Lin jq...@mit.edu wrote:

  Hi Szilard,

 - I've tried 5.0.1 and it gives the same result. So 4.6.7 or 5.0.4 is
 better, but in what way?

 - I've tried the Verlet scheme and it gives only a small change of cutoff and grid.
 But what I'm really interested in is manually reproducing the result that
 tune_pme gave me in the first case using the Group scheme.

  If that's the only situation you can observe it, it could be an
 outlier or
 your method could be unsuitable...


 - I've also tried lincs-order=4 and it makes no difference.

 - My Fourier spacing is 25% finer, but that shouldn't affect the results,
 right? If it does affect the results, then I want to find out how.

  It affects your results because you could do some more sampling with
 the
 computer time you are spending on PME at the moment. Where to choose your
 balance of model quality vs amount of sampling is poorly understood, and
 problem-dependent.

 - I happen to use the PP/PME rank split (100+28) and it gives me

 interesting results (the speed is actually not bad). Then I'm
 very interested in how these cutoff and grid settings can affect my
 simulation results.


 If the implementation and model is right, then they have no effect.
 That's
 why we auto-tune with them. You're going to need a lot of replicates to
 show that -npme 28 gives a 

Re: [gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

2015-01-22 Thread Mark Abraham
On Thu, Jan 22, 2015 at 8:09 PM, Jiaqi Lin jq...@mit.edu wrote:

 Hi Mark

 Thank you for your patient reply.

 I've tried using the exact same setup (rcoulomb = 3.258, fourierspacing = 0.325,
 rlist = 1.4, rlistlong = 3.258, rvdw = 1.2) that the auto-tuning would have
 changed the parameters to, and still it won't reproduce the result obtained by
 auto-tuning, just the result I tried previously using similar parameters.
 Therefore, I believe either there is something going on in the auto-tuning code
 that is not what it appears to be, or the auto-tuning changes the parameters in
 a special way that in theory shouldn't affect the simulation result, but in my
 case the system is very sensitive to these changes. As you pointed out, the
 code is not straightforward, so much can happen at this stage. I'd very much
 like to figure out what's going on, but spending two months on the code is not
 something I'm willing to do.


One test you can do that is straightforward is to do an auto-tuning run and
write the final output whenever the tuning completes. That trajectory
should have only one frame in it. Then do a
http://www.gromacs.org/Documentation/How-tos/Single-Point_Energy
calculation on that trajectory file, using a .tpr that should match the
auto-tuning result. If that gave different energy components, then I think
that would be good evidence of a code bug. (Either way, as I've already
said, you should just go and use the Verlet scheme for running this system
on that amount of hardware...)
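If it helps, the rerun could look roughly like this (a sketch only; the file
names are placeholders, and in 5.0 the commands are gmx grompp / gmx mdrun /
gmx energy):

   # build a .tpr with the post-tuning values (rcoulomb, fourierspacing, etc.)
   grompp -f match_tuned.mdp -c conf.gro -p topol.top -o spe.tpr
   # re-evaluate energies and forces on the single frame written when tuning finished
   mdrun -s spe.tpr -rerun frame_after_tuning.trr -deffnm spe
   g_energy -f spe.edr

and then the components in spe.edr can be compared with what the auto-tuned run
reported at that step.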


 So I think I will keep changing parameters and try to understand the long-range
 electrostatics in my system. Also, the simulation produced by auto-tuning is
 unlikely to be an outlier. I kept changing -npme, and whenever the auto-tuning
 gives me an rcoulomb larger than 3, I can always observe the interesting result.


That is suggestive of a bug. If you can also reproduce it from runs that
have different initial velocities, then that would be quite strong evidence.

I also tried LJPME, as you suggested, for the long-range vdW; it compresses the
 bilayer to a large extent.


What force field was that?


 The whole force field would have to be refined if I use LJPME, but that's
 another story. I think electrostatics is the main driving force in my
 simulation system, and I'll keep trying to figure out what really happened.


Hmm, that sounds like your model physics might be over-parameterized on an
invalid van der Waals approximation. I'd give serious consideration to
reviewing its performance on other problems quite carefully. Using it only
in a regime and/or software package where it is thought to be valid would
be the standard advice.

Anyway, there is no way that
* if the model physics is correct and not over-tuned to the exact regime
and software that produced it, and
* if the GROMACS code is correct,
that the simulation behaviour could depend in a statistically significant
way on the tuning or -npme choice. Those values do not enter the model
physics.

Just changing electrostatics regimes arbitrarily and cherry-picking some of
the results to analyse is going to get questions raised about your work -
either the model physics is not robust enough to use on your system, or the
software is wrong, or you're ignoring valid behaviour because something
else looks more interesting.

Mark


 Best
 Jiaqi


 On 1/20/2015 7:18 PM, Mark Abraham wrote:

 On Tue, Jan 20, 2015 at 11:48 PM, Jiaqi Lin jq...@mit.edu wrote:

  Hi Szilard,

 - I've tried 5.0.1 and it gives the same result. So 4.6.7 or 5.0.4 is
 better, but in what way?

 - I've tried the Verlet scheme and it gives only a small change of cutoff and grid.
 But what I'm really interested in is manually reproducing the result that
 tune_pme gave me in the first case using the Group scheme.

  If that's the only situation you can observe it, it could be an outlier
 or
 your method could be unsuitable...


 - I've also tried lincs-order=4 and it makes no difference.

 - My Fourier spacing is 25% finer, but that shouldn't affect the results,
 right? If it does affect the results, then I want to find out how.

  It affects your results because you could do some more sampling with the
 computer time you are spending on PME at the moment. Where to choose your
 balance of model quality vs amount of sampling is poorly understood, and
 problem-dependent.

 - I happen to use the PP/PME rank split (100+28) and it gives me

 interesting results (the speed is actually not bad). Then I'm
 very interested in how these cutoff and grid settings can affect my
 simulation results.


 If the implementation and model is right, then they have no effect. That's
 why we auto-tune with them. You're going to need a lot of replicates to
 show that -npme 28 gives a statistically different result from a different
 value, and you won't yet have established that it's somehow a more valid
 observation to pursue.

 So I tried to control the parameters manually (turning off tune_PME). But
 no

 matter what I tried, I cannot reproduce the result given by 

Re: [gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

2015-01-22 Thread Jiaqi Lin

Hi Mark,

I've reproduced runs that have different initial velocities and also 
different setups, e.g. different kinds of NPs.


The force field is the MARTINI coarse-grained force field with polarizable 
water. The compression of the lipid bilayer might be because of the particular 
simulation setup in my system, e.g. the ion imbalance. I believe the force 
field itself has no problems with the van der Waals approximation. I can 
do a simple test and see.


I didn't set the electrostatic regime arbitrarily. The auto-tuning did it for 
me (it is on by default in 4.6.5), and I didn't know it until I found 
out that the system behavior depends strongly on rcoulomb, and I want to know 
how and why. That's why I'm trying to control these parameters and 
reproduce the result as a starting point. I am not cherry-picking the result; 
currently I'm withholding the results from publication until I find out 
what's really going on. Whether I have to change the model system or 
have to use a totally new one, I'll do it accordingly. But first of all I 
have to make sure it is not a problem in the code. Moreover, there are 
other ways (e.g. changing some initial conditions of the system) to 
observe the interesting result in a normal simulation setup 
(-notunepme, Verlet). That means I would have to redo all the simulations, and 
that is a huge amount of work.


I'll try to do the test as you suggested and see if it is a bug.

Thank you
Jiaqi

On 1/22/2015 2:34 PM, Mark Abraham wrote:

On Thu, Jan 22, 2015 at 8:09 PM, Jiaqi Lin jq...@mit.edu wrote:


Hi Mark

Thank you for your patient reply.

I've tried using the exact same setup (rcoulomb = 3.258, fourierspacing = 0.325,
rlist = 1.4, rlistlong = 3.258, rvdw = 1.2) that the auto-tuning would have
changed the parameters to, and still it won't reproduce the result obtained by
auto-tuning, just the result I tried previously using similar parameters.
Therefore, I believe either there is something going on in the auto-tuning code
that is not what it appears to be, or the auto-tuning changes the parameters in
a special way that in theory shouldn't affect the simulation result, but in my
case the system is very sensitive to these changes. As you pointed out, the
code is not straightforward, so much can happen at this stage. I'd very much
like to figure out what's going on, but spending two months on the code is not
something I'm willing to do.


One test you can do that is straightforward is to do an auto-tuning run and
write the final output whenever the tuning completes. That trajectory
should have only one frame in it. Then do a
http://www.gromacs.org/Documentation/How-tos/Single-Point_Energy
calculation on that trajectory file, using a .tpr that should match the
auto-tuning result. If that gave different energy components, then I think
that would be good evidence of a code bug. (Either way, as I've already
said, you should just go and use the Verlet scheme for running this system
on that amount of hardware...)



So I think I will keep changing parameters and try to understand the long-range
electrostatics in my system. Also, the simulation produced by auto-tuning is
unlikely to be an outlier. I kept changing -npme, and whenever the auto-tuning
gives me an rcoulomb larger than 3, I can always observe the interesting result.


That is suggestive of a bug. If you can also reproduce it from runs that
have different initial velocities, then that would be quite strong evidence.

I also tried LJPME, as you suggested, for the long-range vdW; it compresses the
bilayer to a large extent.


What force field was that?



The whole force field would have to be refined if I use LJPME, but that's
another story. I think electrostatics is the main driving force in my
simulation system, and I'll keep trying to figure out what really happened.


Hmm, that sounds like your model physics might be over-parameterized on an
invalid van der Waals approximation. I'd give serious consideration to
reviewing its performance on other problems quite carefully. Using it only
in a regime and/or software package where it is thought to be valid would
be the standard advice.

Anyway, there is no way that
* if the model physics is correct and not over-tuned to the exact regime
and software that produced it, and
* if the GROMACS code is correct,
that the simulation behaviour could depend in a statistically significant
way on the tuning or -npme choice. Those values do not enter the model
physics.

Just changing electrostatics regimes arbitrarily and cherry-picking some of
the results to analyse is going to get questions raised about your work -
either the model physics is not robust enough to use on your system, or the
software is wrong, or you're ignoring valid behaviour because something
else looks more interesting.

Mark



Best
Jiaqi


On 1/20/2015 7:18 PM, Mark Abraham wrote:


On Tue, Jan 20, 2015 at 11:48 PM, Jiaqi Lin jq...@mit.edu wrote:

  Hi Szilard,

- I've tried 5.0.1 and it gives the same result. So 4.6.7 or 5.0.4 is
better, but in what way?

- I've tried the Verlet scheme and it gives 

Re: [gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

2015-01-20 Thread Jiaqi Lin

Hi Mark,

Thanks for the reply. I put the md.log files in the following link:

https://www.dropbox.com/sh/d1d2fbwreizr974/AABYhSRU03nmijbTIXKKr-rra?dl=0

There are four log files
 1.GMX 4.6.5 -tunepme (the coulombic cutoff is tuned to 3.253)
 2.GMX 4.6.5 -notunepme  rcoulomb= 3.3 , fourierspace = 0.33
 3.GMX 4.6.5 -notunepme  rcoulomb= 3.3 , fourierspace = 0.14
 4.GMX 4.6.5 -notunepme  rcoulomb= 1.4 , fourierspace = 0.14

Note that the LR Coulombic energy in the first one is almost twice the 
value of that in the second one, whereas the grid spacing in both cases 
is nearly the same.


Only the first one gives a strong electrostatic interaction of a 
nanoparticle with a lipid bilayer under ionic imbalance. In other cases 
I do not observe such a strong interaction.


GMX 5.0.1 gives the same results as GMX 4.6.5 using the Group cutoff scheme. Thanks


Regards
Jiaqi



On 1/19/2015 3:22 PM, Mark Abraham wrote:

On Thu, Jan 15, 2015 at 3:21 AM, Jiaqi Lin jq...@mit.edu wrote:


Dear GMX developers,

I've encountered a problem in GROMACS concerning the auto-tuning feature of
PME that has bugged me for months. As stated in the title, the auto-tuning
feature of mdrun changed my Coulomb cutoff from 1.4 nm to ~3.3 nm (stated in
md.log) when I set -npme to 28 (128 total CPU cores), and this gives me
interesting simulation results. When I use -notunepme, I find Coulomb (SR) and
recip. giving me the same energy, but the actual simulation result is
different. This I can understand: scaling between the Coulomb cut-off and grid
size theoretically gives the same accuracy for the electrostatics (according to
the GMX manual and PME papers), but there is actually some numerical error due
to grid mapping, and even if the energy is the same that does not mean the
system configuration has to be the same (NVE ensemble: constant energy,
different configuration).


Total electrostatic energy should be approximately the same with different
PME partitions.



However, the thing I don't understand is the following. I am interested in the
result under a large Coulomb cut-off, so I tried to manually set the cut-off
and grid spacing with -notunepme, using the values tuned by mdrun previously.
This gives me a completely different simulation result, and the energy is also
different. I've tried setting rlist, rlistlong, or both to equal rcoulomb
(~3.3); it still does not give me the result produced by the auto-tuning of PME.


In what sense is the result different?



In addition, the simulation speed drops dramatically when I set rcoulomb to
~3.3 (using -tunepme the speed remains nearly the same no matter how large the
cutoff is tuned to). I've tested this in both GMX 4.6.5 and 5.0.1, and the same
thing happens, so clearly it's not a matter of versions. Thus the question is:
what exactly happens in the PME calculation when using the auto-tuning feature
in mdrun, and why does it give different results when I manually set the
Coulomb cutoff and grid spacing to the values tuned by mdrun without the
auto-tuning feature (using -notunepme)? Thank you for your help.


For the group scheme, these should all lead to essentially the same result
and (if tuned) performance. If you can share your various log files on a
file-sharing service (rc 1.4, rc 3.3, various -tunepme settings, 4.6.5 and
5.0.1) then we can be in a position to comment further.

Mark



additional info: I use Group cutoff-scheme , rvdw is  1.2.


  md.log file:
DD  step 9 load imb.: force 29.4%  pme mesh/force 3.627

step   30: timed with pme grid 280 280 384, coulomb cutoff 1.400: 1026.4
M-cycles
step   50: timed with pme grid 256 256 324, coulomb cutoff 1.464: 850.3
M-cycles
step   70: timed with pme grid 224 224 300, coulomb cutoff 1.626: 603.6
M-cycles
step   90: timed with pme grid 200 200 280, coulomb cutoff 1.822: 555.2
M-cycles
step  110: timed with pme grid 160 160 208, coulomb cutoff 2.280: 397.0
M-cycles
step  130: timed with pme grid 144 144 192, coulomb cutoff 2.530: 376.0
M-cycles
step  150: timed with pme grid 128 128 160, coulomb cutoff 2.964: 343.7
M-cycles
step  170: timed with pme grid 112 112 144, coulomb cutoff 3.294: 334.8
M-cycles
Grid: 12 x 14 x 14 cells
step  190: timed with pme grid 84 84 108, coulomb cutoff 4.392: 346.2
M-cycles
step  190: the PME grid restriction limits the PME load balancing to a
coulomb cut-off of 4.392
step  210: timed with pme grid 128 128 192, coulomb cutoff 2.846: 360.6
M-cycles
step  230: timed with pme grid 128 128 160, coulomb cutoff 2.964: 343.6
M-cycles
step  250: timed with pme grid 120 120 160, coulomb cutoff 3.036: 340.4
M-cycles
step  270: timed with pme grid 112 112 160, coulomb cutoff 3.253: 334.3
M-cycles
step  290: timed with pme grid 112 112 144, coulomb cutoff 3.294: 334.7
M-cycles
step  310: timed with pme grid 84 84 108, coulomb cutoff 4.392: 348.0
M-cycles
   optimal pme grid 112 112 160, coulomb cutoff 3.253
DD  step 999 load imb.: force 18.4%  pme mesh/force 0.918

At step 1000 the performance loss due to force load imbalance is 6.3 %

NOTE: Turning on dynamic load balancing

Step   

Re: [gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

2015-01-20 Thread Jiaqi Lin

Hi Szilard,

- I've tried 5.0.1 and it gives the same result. So 4.6.7 or 5.0.4 is 
better, but in what way?


- I've tried the Verlet scheme and it gives only a small change of cutoff and grid. 
But what I'm really interested in is manually reproducing the result that 
tune_pme gave me in the first case using the Group scheme.


- I've also tried lincs-order=4 and it makes no difference.

- My Fourier spacing is 25% finer, but that shouldn't affect the results, 
right? If it does affect the results, then I want to find out how.


- I happen to use the PP/PME rank split (100+28) and it gives me 
interesting results (the speed is actually not bad). Then I'm 
very interested in how these cutoff and grid settings can affect my 
simulation results. So I tried to control the parameters manually 
(turning off tune_PME). But no matter what I tried, I cannot reproduce 
the result given by tune_pme. So my biggest question is, how is 
tune_PME implemented in the code? What parameters does it actually tune?


- When PME tuning increases the cutoff to such a large value, the speed does not 
go down noticeably. So what I suspect is that tune_PME does the direct-space 
calculation without changing the neighbor-list search distance.


Thank you

Best
Jiaqi

On 1/20/2015 3:54 PM, Szilárd Páll wrote:

Not (all) directly related, but a few comments/questions:

- Have you tried 4.6.7 or 5.0.4?
- Have you considered using the Verlet scheme instead of doing manual buffering?
- lincs-order=8 is very large for 2fs production runs - typically 4 is used.
- Your fourier spacing is a lot (~25%) finer than it needs to be.

- The PP/PME rank split of 100+28 is _very_ inconvenient and it is the
main cause of the horrible PME performance together with the overly
coarse grid. That's why you get such a huge cut-off after the PP-PME
load balancing. Even if you want to stick to these parameters, you
should tune the rank split (manually or with tune_pme).
- The above contributes to the high neighbor search cost too.

--
Szilárd


On Tue, Jan 20, 2015 at 9:18 PM, Jiaqi Lin jq...@mit.edu wrote:

Hi Mark,

Thanks for the reply. I put the md.log files in the following link:

https://www.dropbox.com/sh/d1d2fbwreizr974/AABYhSRU03nmijbTIXKKr-rra?dl=0

There are four log files
  1.GMX 4.6.5 -tunepme (the coulombic cutoff is tuned to 3.253)
  2.GMX 4.6.5 -notunepme  rcoulomb= 3.3 , fourierspace = 0.33
  3.GMX 4.6.5 -notunepme  rcoulomb= 3.3 , fourierspace = 0.14
  4.GMX 4.6.5 -notunepme  rcoulomb= 1.4 , fourierspace = 0.14

Note that the LR Coulombic energy in the first one is almost twice the value
of that in the second one, whereas the grid spacing in both cases is nearly
the same.

Only the first one gives a strong electrostatic interaction of a
nanoparticle with a lipid bilayer under ionic imbalance. In other cases I do
not observe such a strong interaction.

GMX 5.0.1 gives the same results as GMX 4.6.5 using the Group cutoff scheme. Thanks


Regards
Jiaqi




On 1/19/2015 3:22 PM, Mark Abraham wrote:

On Thu, Jan 15, 2015 at 3:21 AM, Jiaqi Lin jq...@mit.edu wrote:


Dear GMX developers,

I've encountered a problem in GROMACS concerning the auto-tuning feature of
PME that has bugged me for months. As stated in the title, the auto-tuning
feature of mdrun changed my Coulomb cutoff from 1.4 nm to ~3.3 nm (stated in
md.log) when I set -npme to 28 (128 total CPU cores), and this gives me
interesting simulation results. When I use -notunepme, I find Coulomb (SR) and
recip. giving me the same energy, but the actual simulation result is
different. This I can understand: scaling between the Coulomb cut-off and grid
size theoretically gives the same accuracy for the electrostatics (according to
the GMX manual and PME papers), but there is actually some numerical error due
to grid mapping, and even if the energy is the same that does not mean the
system configuration has to be the same (NVE ensemble: constant energy,
different configuration).


Total electrostatic energy should be approximately the same with different
PME partitions.



However, the thing I don't understand is the following. I am interested in the
result under a large Coulomb cut-off, so I tried to manually set the cut-off
and grid spacing with -notunepme, using the values tuned by mdrun
previously.
This gives me a completely different simulation result, and the energy is also
different. I've tried setting rlist, rlistlong, or both to equal rcoulomb
(~3.3); it still does not give me the result produced by the auto-tuning of PME.


In what sense is the result different?



In addition, the simulation speed drops dramatically when I set rcoulomb to
~3.3 (using -tunepme the speed remains nearly the same no matter how large the
cutoff is tuned to). I've tested this in both GMX 4.6.5 and 5.0.1, and the same
thing happens, so clearly it's not a matter of versions. Thus the question is:
what exactly happens in the PME calculation when using the auto-tuning feature
in mdrun, and why does it give different results when I manually set the
Coulomb cutoff and grid spacing to the values tuned by
mdrun
without the 

Re: [gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

2015-01-20 Thread Szilárd Páll
Not (all) directly related, but a few comments/questions:

- Have you tried 4.6.7 or 5.0.4?
- Have you considered using the Verlet scheme instead of doing manual buffering?
- lincs-order=8 is very large for 2fs production runs - typically 4 is used.
- Your fourier spacing is a lot (~25%) finer than it needs to be.

- The PP/PME rank split of 100+28 is _very_ inconvenient and it is the
main cause of the horrible PME performance together with the overly
coarse grid. That's why you get such a huge cut-off after the PP-PME
load balancing. Even if you want to stick to these parameters, you
should tune the rank split (manually or with tune_pme).
- The above contributes to the high neighbor search cost too.
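Regarding the rank-split point above, the scan can be automated roughly like
this (a sketch; in 4.6.x the tool is g_tune_pme, in 5.x gmx tune_pme, and the
file name is a placeholder):

   g_tune_pme -np 128 -s topol.tpr -steps 1000 -launch

which benchmarks a series of PP/PME splits around the estimated optimum and
then launches the best-performing one.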

--
Szilárd


On Tue, Jan 20, 2015 at 9:18 PM, Jiaqi Lin jq...@mit.edu wrote:
 Hi Mark,

 Thanks for the reply. I put the md.log files in the following link:

 https://www.dropbox.com/sh/d1d2fbwreizr974/AABYhSRU03nmijbTIXKKr-rra?dl=0

 There are four log files
  1.GMX 4.6.5 -tunepme (the coulombic cutoff is tuned to 3.253)
  2.GMX 4.6.5 -notunepme  rcoulomb= 3.3 , fourierspace = 0.33
  3.GMX 4.6.5 -notunepme  rcoulomb= 3.3 , fourierspace = 0.14
  4.GMX 4.6.5 -notunepme  rcoulomb= 1.4 , fourierspace = 0.14

 Note that the LR Coulombic energy in the first one is almost twice the value
 of that in the second one, whereas the grid spacing in both cases is nearly
 the same.

 Only the first one gives a strong electrostatic interaction of a
 nanoparticle with a lipid bilayer under ionic imbalance. In other cases I do
 not observe such a strong interaction.

 GMX 5.0.1 gives the same results as GMX 4.6.5 using the Group cutoff scheme. Thanks


 Regards
 Jiaqi




 On 1/19/2015 3:22 PM, Mark Abraham wrote:

 On Thu, Jan 15, 2015 at 3:21 AM, Jiaqi Lin jq...@mit.edu wrote:

 Dear GMX developers,

 I've encountered a problem in GROMACS concerning the auto-tuning feature of
 PME that has bugged me for months. As stated in the title, the auto-tuning
 feature of mdrun changed my Coulomb cutoff from 1.4 nm to ~3.3 nm (stated in
 md.log) when I set -npme to 28 (128 total CPU cores), and this gives me
 interesting simulation results. When I use -notunepme, I find Coulomb (SR) and
 recip. giving me the same energy, but the actual simulation result is
 different. This I can understand: scaling between the Coulomb cut-off and grid
 size theoretically gives the same accuracy for the electrostatics (according to
 the GMX manual and PME papers), but there is actually some numerical error due
 to grid mapping, and even if the energy is the same that does not mean the
 system configuration has to be the same (NVE ensemble: constant energy,
 different configuration).

 Total electrostatic energy should be approximately the same with different
 PME partitions.


 However, the thing I don't understand is the following. I am interested in the
 result under a large Coulomb cut-off, so I tried to manually set the cut-off
 and grid spacing with -notunepme, using the values tuned by mdrun
 previously.
 This gives me a completely different simulation result, and the energy is also
 different. I've tried setting rlist, rlistlong, or both to equal rcoulomb
 (~3.3); it still does not give me the result produced by the auto-tuning of PME.


 In what sense is the result different?


 In addition, the simulation speed drops dramatically when I set rcoulomb to
 ~3.3 (using -tunepme the speed remains nearly the same no matter how large the
 cutoff is tuned to). I've tested this in both GMX 4.6.5 and 5.0.1, and the same
 thing happens, so clearly it's not a matter of versions. Thus the question is:
 what exactly happens in the PME calculation when using the auto-tuning feature
 in mdrun, and why does it give different results when I manually set the
 Coulomb cutoff and grid spacing to the values tuned by
 mdrun
 without the auto-tuning feature (using -notunepme)? Thank you for your help.

 For the group scheme, these should all lead to essentially the same result
 and (if tuned) performance. If you can share your various log files on a
 file-sharing service (rc 1.4, rc 3.3, various -tunepme settings, 4.6.5 and
 5.0.1) then we can be in a position to comment further.

 Mark


 additional info: I use Group cutoff-scheme , rvdw is  1.2.


   md.log file:
 DD  step 9 load imb.: force 29.4%  pme mesh/force 3.627

 step   30: timed with pme grid 280 280 384, coulomb cutoff 1.400: 1026.4
 M-cycles
 step   50: timed with pme grid 256 256 324, coulomb cutoff 1.464: 850.3
 M-cycles
 step   70: timed with pme grid 224 224 300, coulomb cutoff 1.626: 603.6
 M-cycles
 step   90: timed with pme grid 200 200 280, coulomb cutoff 1.822: 555.2
 M-cycles
 step  110: timed with pme grid 160 160 208, coulomb cutoff 2.280: 397.0
 M-cycles
 step  130: timed with pme grid 144 144 192, coulomb cutoff 2.530: 376.0
 M-cycles
 step  150: timed with pme grid 128 128 160, coulomb cutoff 2.964: 343.7
 M-cycles
 step  170: timed with pme grid 112 112 144, coulomb cutoff 3.294: 334.8
 M-cycles
 Grid: 12 x 14 x 14 cells
 step  190: timed with pme grid 84 84 108, coulomb cutoff 4.392: 

Re: [gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

2015-01-20 Thread Mark Abraham
On Tue, Jan 20, 2015 at 9:18 PM, Jiaqi Lin jq...@mit.edu wrote:

 Hi Mark,

 Thanks for the reply. I put the md.log files in the following link:

 https://www.dropbox.com/sh/d1d2fbwreizr974/AABYhSRU03nmijbTIXKKr-rra?dl=0

 There are four log files
  1.GMX 4.6.5 -tunepme (the coulombic cutoff is tuned to 3.253)
  2.GMX 4.6.5 -notunepme  rcoulomb= 3.3 , fourierspace = 0.33
  3.GMX 4.6.5 -notunepme  rcoulomb= 3.3 , fourierspace = 0.14
  4.GMX 4.6.5 -notunepme  rcoulomb= 1.4 , fourierspace = 0.14

 Note that the LR Coulombic energy in the first one is almost twice the
 value of that in the second one, whereas the grid spacing in both cases is
 nearly the same.


Run 1 doesn't have a Coulomb (LR) section, so I don't know what you
compared (runs 2 and 3 have it, because you selected rcoulomb > rlist =
rvdw). The quantities that might be comparable between 1 and 2 are Coul
(recip) and the total Coulomb energy, and they agree pretty well, given
that you didn't actually use the same PME grid nor rcoulomb in 2 that 1
used after tuning. The different rcoulomb gives you a different value of
the Ewald splitting parameter beta, so there's just nothing comparable
here at all. If you want to make direct comparisons... do the same things!
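(For reference, and as far as I recall, grompp picks beta from the requested
tolerance via roughly

   erfc(beta * rcoulomb) = ewald-rtol

so a .tpr made with rcoulomb = 3.3 carries a different beta than one made with
rcoulomb = 1.4, whereas the run-time tuning enlarges rcoulomb while keeping the
beta from the original .tpr.)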

Only the first one gives a strong electrostatic interaction of a
 nanoparticle with a lipid bilayer under ionic imbalance. In other cases I
 do not observe such a strong interaction.


If that's the different result to which you referred previously, then I'm
skeptical whether there's any reason beyond normal variability of chaotic
simulations. Note also that run 2 had no buffering on the LJ, which has an
effect on the ability to compare the overall simulation with that of 1. I
forget whether rcoulomb > rlist > rvdw is permitted in the .mdp file for
the group scheme, though.

Moreover, the use of dispersion correction would be clear cut here, so as a
reviewer, I'd be tempted to throw out any conclusions based on these
simulations ;-) Nanoparticle in lipid bilayer = long-range heterogeneity,
which strongly suggests using LJPME in 5.0, and since you're not dominated
by water, there's no reason at all to use the group scheme.
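In .mdp terms that would be something like (a sketch; check the 5.0 manual for
the exact option names and combination-rule choices before relying on it):

   cutoff-scheme    = Verlet
   coulombtype      = PME
   vdwtype          = PME
   lj-pme-comb-rule = Geometric

rather than the group scheme with a plain vdW cut-off.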

Mark

GMX 5.0.1 gives the same results as GMX 4.6.5 using the Group cutoff scheme. Thanks


 Regards
 Jiaqi




 On 1/19/2015 3:22 PM, Mark Abraham wrote:

 On Thu, Jan 15, 2015 at 3:21 AM, Jiaqi Lin jq...@mit.edu wrote:

  Dear GMX developers,

 I've encountered a problem in GROMACS concerning the auto-tuning feature of
 PME that has bugged me for months. As stated in the title, the auto-tuning
 feature of mdrun changed my Coulomb cutoff from 1.4 nm to ~3.3 nm (stated in
 md.log) when I set -npme to 28 (128 total CPU cores), and this gives me
 interesting simulation results. When I use -notunepme, I find Coulomb (SR) and
 recip. giving me the same energy, but the actual simulation result is
 different. This I can understand: scaling between the Coulomb cut-off and grid
 size theoretically gives the same accuracy for the electrostatics (according to
 the GMX manual and PME papers), but there is actually some numerical error due
 to grid mapping, and even if the energy is the same that does not mean the
 system configuration has to be the same (NVE ensemble: constant energy,
 different configuration).

  Total electrostatic energy should be approximately the same with
 different
 PME partitions.


 However, the thing I don't understand is the following. I am interested in the
 result under a large Coulomb cut-off, so I tried to manually set the cut-off
 and grid spacing with -notunepme, using the values tuned by mdrun
 previously.
 This gives me a completely different simulation result, and the energy is also
 different. I've tried setting rlist, rlistlong, or both to equal rcoulomb
 (~3.3); it still does not give me the result produced by the auto-tuning of PME.


 In what sense is the result different?


 In addition, the simulation speed drops dramatically when I set rcoulomb to
 ~3.3 (using -tunepme the speed remains nearly the same no matter how large the
 cutoff is tuned to). I've tested this in both GMX 4.6.5 and 5.0.1, and the same
 thing happens, so clearly it's not a matter of versions. Thus the question is:
 what exactly happens in the PME calculation when using the auto-tuning feature
 in mdrun, and why does it give different results when I manually set the
 Coulomb cutoff and grid spacing to the values tuned by
 mdrun
 without the auto-tuning feature (using -notunepme)? Thank you for your help.

  For the group scheme, these should all lead to essentially the same
 result
 and (if tuned) performance. If you can share your various log files on a
 file-sharing service (rc 1.4, rc 3.3, various -tunepme settings, 4.6.5 and
 5.0.1) then we can be in a position to comment further.

 Mark


  additional info: I use Group cutoff-scheme , rvdw is  1.2.


   md.log file:
 DD  step 9 load imb.: force 29.4%  pme mesh/force 3.627

 step   30: timed with pme grid 280 280 384, coulomb cutoff 1.400: 1026.4
 M-cycles
 step   50: timed with pme grid 256 256 324, coulomb cutoff 

Re: [gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

2015-01-19 Thread Mark Abraham
On Thu, Jan 15, 2015 at 3:21 AM, Jiaqi Lin jq...@mit.edu wrote:

 Dear GMX developers,

 I've encountered a problem in GROMACS concerning the auto-tuning feature of
 PME that has bugged me for months. As stated in the title, the auto-tuning
 feature of mdrun changed my Coulomb cutoff from 1.4 nm to ~3.3 nm (stated in
 md.log) when I set -npme to 28 (128 total CPU cores), and this gives me
 interesting simulation results. When I use -notunepme, I find Coulomb (SR) and
 recip. giving me the same energy, but the actual simulation result is
 different. This I can understand: scaling between the Coulomb cut-off and grid
 size theoretically gives the same accuracy for the electrostatics (according to
 the GMX manual and PME papers), but there is actually some numerical error due
 to grid mapping, and even if the energy is the same that does not mean the
 system configuration has to be the same (NVE ensemble: constant energy,
 different configuration).


Total electrostatic energy should be approximately the same with different
PME partitions.


 However, the thing I don't understand is the following. I am interested in the
 result under a large Coulomb cut-off, so I tried to manually set the cut-off
 and grid spacing with -notunepme, using the values tuned by mdrun previously.
 This gives me a completely different simulation result, and the energy is also
 different. I've tried setting rlist, rlistlong, or both to equal rcoulomb
 (~3.3); it still does not give me the result produced by the auto-tuning of PME.


In what sense is the result different?


 In addition, the simulation speed drops dramatically when I set rcoulomb to
 ~3.3 (using -tunepme the speed remains nearly the same no matter how large the
 cutoff is tuned to). I've tested this in both GMX 4.6.5 and 5.0.1, and the same
 thing happens, so clearly it's not a matter of versions. Thus the question is:
 what exactly happens in the PME calculation when using the auto-tuning feature
 in mdrun, and why does it give different results when I manually set the
 Coulomb cutoff and grid spacing to the values tuned by mdrun
 without the auto-tuning feature (using -notunepme)? Thank you for your help.


For the group scheme, these should all lead to essentially the same result
and (if tuned) performance. If you can share your various log files on a
file-sharing service (rc 1.4, rc 3.3, various -tunepme settings, 4.6.5 and
5.0.1) then we can be in a position to comment further.

Mark



 additional info: I use Group cutoff-scheme , rvdw is  1.2.


  md.log file:
 DD  step 9 load imb.: force 29.4%  pme mesh/force 3.627

 step   30: timed with pme grid 280 280 384, coulomb cutoff 1.400: 1026.4
 M-cycles
 step   50: timed with pme grid 256 256 324, coulomb cutoff 1.464: 850.3
 M-cycles
 step   70: timed with pme grid 224 224 300, coulomb cutoff 1.626: 603.6
 M-cycles
 step   90: timed with pme grid 200 200 280, coulomb cutoff 1.822: 555.2
 M-cycles
 step  110: timed with pme grid 160 160 208, coulomb cutoff 2.280: 397.0
 M-cycles
 step  130: timed with pme grid 144 144 192, coulomb cutoff 2.530: 376.0
 M-cycles
 step  150: timed with pme grid 128 128 160, coulomb cutoff 2.964: 343.7
 M-cycles
 step  170: timed with pme grid 112 112 144, coulomb cutoff 3.294: 334.8
 M-cycles
 Grid: 12 x 14 x 14 cells
 step  190: timed with pme grid 84 84 108, coulomb cutoff 4.392: 346.2
 M-cycles
 step  190: the PME grid restriction limits the PME load balancing to a
 coulomb cut-off of 4.392
 step  210: timed with pme grid 128 128 192, coulomb cutoff 2.846: 360.6
 M-cycles
 step  230: timed with pme grid 128 128 160, coulomb cutoff 2.964: 343.6
 M-cycles
 step  250: timed with pme grid 120 120 160, coulomb cutoff 3.036: 340.4
 M-cycles
 step  270: timed with pme grid 112 112 160, coulomb cutoff 3.253: 334.3
 M-cycles
 step  290: timed with pme grid 112 112 144, coulomb cutoff 3.294: 334.7
 M-cycles
 step  310: timed with pme grid 84 84 108, coulomb cutoff 4.392: 348.0
 M-cycles
   optimal pme grid 112 112 160, coulomb cutoff 3.253
 DD  step 999 load imb.: force 18.4%  pme mesh/force 0.918

 At step 1000 the performance loss due to force load imbalance is 6.3 %

 NOTE: Turning on dynamic load balancing

           Step           Time         Lambda
           1000           20.0            0.0

   Energies (kJ/mol)
           Bond       G96Angle        LJ (SR)   Coulomb (SR)   Coul. recip.
    1.98359e+05    1.79181e+06   -1.08927e+07   -7.04736e+06   -2.32682e+05
 Position Rest.      Potential    Kinetic En.   Total Energy    Temperature
    6.20627e+04   -1.61205e+07    4.34624e+06   -1.17743e+07    3.00659e+02
 Pressure (bar)   Constr. rmsd
    2.13582e+00    1.74243e-04


 Best
 Jiaqi



 --
 Jiaqi Lin
 postdoc fellow
 The Langer Lab


[gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

2015-01-14 Thread Jiaqi Lin

Dear GMX developers,

I've encountered a problem in GROMACS concerning the auto-tuning feature 
of PME that has bugged me for months. As stated in the title, the 
auto-tuning feature of mdrun changed my Coulomb cutoff from 1.4 nm to 
~3.3 nm (stated in md.log) when I set -npme to 28 (128 total CPU 
cores), and this gives me interesting simulation results. When I use 
-notunepme, I find Coulomb (SR) and recip. giving me the same energy, but 
the actual simulation result is different. This I can understand: 
scaling between the Coulomb cut-off and grid size theoretically gives the 
same accuracy for the electrostatics (according to the GMX manual and PME 
papers), but there is actually some numerical error due to grid mapping, 
and even if the energy is the same that does not mean the system 
configuration has to be the same (NVE ensemble: constant energy, 
different configuration).
However, the thing I don't understand is the following. I am interested 
in the result under a large Coulomb cut-off, so I tried to manually set 
the cut-off and grid spacing with -notunepme, using the values tuned by 
mdrun previously. This gives me a completely different simulation result, 
and the energy is also different. I've tried setting rlist, rlistlong, or 
both to equal rcoulomb (~3.3); it still does not give me the result 
produced by the auto-tuning of PME. In addition, the simulation speed 
drops dramatically when I set rcoulomb to ~3.3 (using -tunepme the speed 
remains nearly the same no matter how large the cutoff is tuned to). 
I've tested this in both GMX 4.6.5 and 5.0.1, and the same thing happens, 
so clearly it's not a matter of versions. Thus the question is: what 
exactly happens in the PME calculation when using the auto-tuning feature 
in mdrun, and why does it give different results when I manually set the 
Coulomb cutoff and grid spacing to the values tuned by mdrun without the 
auto-tuning feature (using -notunepme)? Thank you for your help.


additional info: I use Group cutoff-scheme , rvdw is  1.2.


 md.log file:
DD  step 9 load imb.: force 29.4%  pme mesh/force 3.627

step   30: timed with pme grid 280 280 384, coulomb cutoff 1.400: 1026.4 
M-cycles
step   50: timed with pme grid 256 256 324, coulomb cutoff 1.464: 850.3 
M-cycles
step   70: timed with pme grid 224 224 300, coulomb cutoff 1.626: 603.6 
M-cycles
step   90: timed with pme grid 200 200 280, coulomb cutoff 1.822: 555.2 
M-cycles
step  110: timed with pme grid 160 160 208, coulomb cutoff 2.280: 397.0 
M-cycles
step  130: timed with pme grid 144 144 192, coulomb cutoff 2.530: 376.0 
M-cycles
step  150: timed with pme grid 128 128 160, coulomb cutoff 2.964: 343.7 
M-cycles
step  170: timed with pme grid 112 112 144, coulomb cutoff 3.294: 334.8 
M-cycles

Grid: 12 x 14 x 14 cells
step  190: timed with pme grid 84 84 108, coulomb cutoff 4.392: 346.2 
M-cycles
step  190: the PME grid restriction limits the PME load balancing to a 
coulomb cut-off of 4.392
step  210: timed with pme grid 128 128 192, coulomb cutoff 2.846: 360.6 
M-cycles
step  230: timed with pme grid 128 128 160, coulomb cutoff 2.964: 343.6 
M-cycles
step  250: timed with pme grid 120 120 160, coulomb cutoff 3.036: 340.4 
M-cycles
step  270: timed with pme grid 112 112 160, coulomb cutoff 3.253: 334.3 
M-cycles
step  290: timed with pme grid 112 112 144, coulomb cutoff 3.294: 334.7 
M-cycles
step  310: timed with pme grid 84 84 108, coulomb cutoff 4.392: 348.0 
M-cycles

  optimal pme grid 112 112 160, coulomb cutoff 3.253
DD  step 999 load imb.: force 18.4%  pme mesh/force 0.918

At step 1000 the performance loss due to force load imbalance is 6.3 %

NOTE: Turning on dynamic load balancing

           Step           Time         Lambda
           1000           20.0            0.0

   Energies (kJ/mol)
           Bond       G96Angle        LJ (SR)   Coulomb (SR)   Coul. recip.
    1.98359e+05    1.79181e+06   -1.08927e+07   -7.04736e+06   -2.32682e+05
 Position Rest.      Potential    Kinetic En.   Total Energy    Temperature
    6.20627e+04   -1.61205e+07    4.34624e+06   -1.17743e+07    3.00659e+02
 Pressure (bar)   Constr. rmsd
    2.13582e+00    1.74243e-04


Best
Jiaqi



--
Jiaqi Lin
postdoc fellow
The Langer Lab
