On Thu, Jun 1, 2017 at 9:39 PM, Elizabeth Ploetz <plo...@ksu.edu> wrote:

>>> However, if most runs are group scheme, a quick check could show whether
>>> jumps are present in runs that i) do PP-PME tuning ii) if logs got truncated
>>> during continuation at least whether they do use separate PME ranks
>>> (because otherwise CPU-only runs don't tune).
>> i) If grepping "timed" from the LOG file does not give any output, does
>> that mean there was no PP-PME tuning? (Sorry for the stupid question. I'm
>> not sure which piece of information from the LOG file is going to answer
>> whether or not there was PP-PME tuning.)

> Do you run with -append? If so, the log file too gets truncated, but I do
> not recall exactly where and whether the PP-PME balancing messages are
> removed or not, but it's not hard to try -- just run with separate PME and
> too few of them (e.g. 1 out of 12) and that will trigger load balancing.

I run with -noappend, so I think the LOG files are intact. I get load balancing 

> On a second thought, instead of testing with Verlet, you might want to just
> do the above and try to directly observe the anomalies after the balancer.

>> If so, perhaps there is a correlation between having PP-PME tuning and
>> having a jump. Please see this link
>> <http://i1243.photobucket.com/albums/gg545/ploetz/volumeJumps_zps8hmlghtn.png>.
>> *If* the volume for
>> 40-60ns of row 3 is the correct system volume, then all the data in this
>> figure is consistent with there being a jump when there is PP-PME tuning.
>> (Please note that while the data at 1 bar looks okay in this case, and
>> elevated pressures do not, this is not always true. We get jumps at 1 bar
>> as well sometimes.)
>> ii) These are all CPU-only runs. The simulations always use separate PME
>> ranks.
>> Please let me know if any particular data from the LOG file would be
>> helpful.

> It would be easier if you provided logs that we can look through.

Please see three log files here: 
https://drive.google.com/open?id=0BznaVquT5XVyVkNjMHh2eXJ5amc . These 
correspond to the third row (6 kbar) data I just linked to in my previous post 
(directly above, the volumeJumps.png image). The 40-60ns log doesn't have the 
"PP-PME Load Balancing" section, but the other two do.
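
For checking many logs at once, a rough sketch along these lines might help. It only looks for the two strings mentioned in this thread ("timed" and "PP-PME Load Balancing"). The function names are made up for illustration, and the assumption that section headers may be printed letter-spaced in the log (hence the whitespace stripping) should be checked against your own files. Because "timed" is a loose pattern, any hits are worth confirming by eye.

```python
import sys
from pathlib import Path

# Marker strings taken from this thread. Whether they match a given
# GROMACS version's log wording is an assumption; adjust as needed.
MARKERS = ("timed", "PP-PME Load Balancing")


def _squash(text):
    """Lower-case and drop all whitespace so letter-spaced headers still match."""
    return "".join(text.split()).lower()


def find_markers(log_text, markers=MARKERS):
    """Return {marker: [1-based line numbers where the marker appears]}."""
    hits = {marker: [] for marker in markers}
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        squashed = _squash(line)
        for marker in markers:
            if _squash(marker) in squashed:
                hits[marker].append(lineno)
    return hits


def had_pme_tuning(log_text, markers=MARKERS):
    """True if any marker appears anywhere in the log text."""
    return any(find_markers(log_text, markers).values())


if __name__ == "__main__":
    # Usage: python check_tuning.py run1.log run2.log ...
    for name in sys.argv[1:]:
        text = Path(name).read_text(errors="replace")
        status = "tuning markers found" if had_pme_tuning(text) else "no tuning markers"
        print(f"{name}: {status}")
```

Pairing the per-log result with whether the matching trajectory segment shows a volume jump would make the suspected correlation easy to tabulate.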

>>> If I understood correctly, it's only group scheme runs where this has been
>>> observed, so it could be some newer feature/change that interacts badly
>>> with the group scheme.
>> You are correct, so far we have not seen any jumps with Verlet.
>>> BTW, do you have any data with 4.5?
>> I have a few old simulations with version 4.5.3 (none with 4.5, sorry).
>> They were all run with inexact continuations (i.e., I did not provide
>> checkpoint files when running multiple short runs to create one long
>> simulation) or single trajectories that I had killed at various points and
>> then continued using checkpoint files and -append. I don't have a huge data
>> set with 4.5.3, but none of them exhibited jumps!
>>> I'd suggest that (especially if investigation of current data does not
>>> reveal the reasons) pick a setup where you seemed to get the anomaly and
>>> run with the same settings using the Verlet scheme lots of short runs with
>>> restarts in a loop.
>> Thanks, we are doing this test.
Gromacs Users mailing list
