Hi Andreas,

Sorry for the delay.
I can confirm the regression. This affects the energy-computation steps, where the GPU bonded computation got significantly slower (as a side-effect of optimizations that mainly targeted the force-only kernels). Can you please file an issue on redmine.gromacs.org and upload the data you shared with me?

As a workaround, you should consider using nstcalcenergy > 1; bumping it to just ~10 would eliminate most of the regression and would improve the performance of other computations too (the nonbonded F-only kernels are also at least 1.5x faster than the force+energy kernels). Alternatively, I recall you had a decent CPU, so you could run the bonded interactions on the CPU.

Side-note: you are using an overly fine PME grid that you did not scale along with the rather long (and overly accurate) cut-offs (see http://manual.gromacs.org/documentation/current/user-guide/mdp-options.html#mdp-fourierspacing ). A minimal sketch of these settings follows below, after the quoted thread.

Cheers,
--
Szilárd

On Fri, Feb 28, 2020 at 11:10 AM Andreas Baer <[email protected]> wrote:

> Hi,
>
> sorry for it!
>
> https://faubox.rrze.uni-erlangen.de/getlink/fiUpELsXokQr3a7vyeDSKdY3/benchmarks_2019-2020_all
>
> Cheers,
> Andreas
>
> On 27.02.20 17:59, Szilárd Páll wrote:
>
> On Thu, Feb 27, 2020 at 1:08 PM Andreas Baer <[email protected]> wrote:
>
>> Hi,
>>
>> On 27.02.20 12:34, Szilárd Páll wrote:
>> > Hi
>> >
>> > On Thu, Feb 27, 2020 at 11:31 AM Andreas Baer <[email protected]> wrote:
>> >
>> >> Hi,
>> >>
>> >> with the link below, additional log files for runs with 1 GPU should be
>> >> accessible now.
>> >>
>> > I meant to ask you to run single-rank GPU runs, i.e. gmx mdrun -ntmpi 1.
>> >
>> > It would also help if you could share some input files in case further
>> > testing is needed.
>> Ok, there is now also an additional benchmark with `-ntmpi 1 -ntomp 4
>> -bonded gpu -update gpu` as parameters. However, it is run on the same
>> machine with SMT disabled.
>> With the following link, I provide all the tests on this machine that I
>> have done so far, along with a summary of the performance for the several
>> input parameters (both in `logfiles`), as well as input files (`C60xh.7z`)
>> and the scripts to run these.
>
> Links seem to be missing.
> --
> Szilárd
>
>> I hope this helps. If there is anything else I can do to help, please
>> let me know!
>>
>> >> Thank you for the comment with the rlist; I did not know that this
>> >> would affect the performance negatively.
>> > It does in multiple ways. First, you are using a rather long list buffer,
>> > which will make the nonbonded pair-interaction calculation more
>> > computationally expensive than it could be if you just used a tolerance
>> > and let the buffer be calculated. Secondly, as setting a manual rlist
>> > disables the automated Verlet buffer calculation, it prevents mdrun from
>> > using a dual pair-list setup (see
>> > http://manual.gromacs.org/documentation/2018.1/release-notes/2018/major/features.html#dual-pair-list-buffer-with-dynamic-pruning )
>> > which has additional performance benefits.
>> Ok, thank you for the explanation!
>> >
>> > Cheers,
>> > --
>> > Szilárd
>> Cheers,
>> Andreas
>> >
>> >> I know about the nstcalcenergy, but
>> >> I need it for several of my simulations.
>> >> Cheers,
>> >> Andreas
>> >>
>> >> On 26.02.20 16:50, Szilárd Páll wrote:
>> >>> Hi,
>> >>>
>> >>> Can you please check the performance when running on a single GPU
>> >>> 2019 vs 2020 with your inputs?
>> >>>
>> >>> Also note that you are using some peculiar settings that will have an
>> >>> adverse effect on performance (like a manually set rlist disallowing the
>> >>> dual pair-list setup, and nstcalcenergy=1).
>> >>>
>> >>> Cheers,
>> >>>
>> >>> --
>> >>> Szilárd
>> >>>
>> >>> On Wed, Feb 26, 2020 at 4:11 PM Andreas Baer <[email protected]> wrote:
>> >>>> Hello,
>> >>>>
>> >>>> here is a link to the logfiles.
>> >>>>
>> >>>> https://faubox.rrze.uni-erlangen.de/getlink/fiX8wP1LwSBkHRoykw6ksjqY/benchmarks_2019-2020
>> >>>>
>> >>>> If necessary, I can also provide some more log or tpr/gro/... files.
>> >>>>
>> >>>> Cheers,
>> >>>> Andreas
>> >>>>
>> >>>> On 26.02.20 16:09, Paul Bauer wrote:
>> >>>>> Hello,
>> >>>>>
>> >>>>> you can't add attachments to the list, please upload the files
>> >>>>> somewhere to share them.
>> >>>>> This might be quite important to us, because the performance
>> >>>>> regression is not expected by us.
>> >>>>>
>> >>>>> Cheers
>> >>>>>
>> >>>>> Paul
>> >>>>>
>> >>>>> On 26/02/2020 15:54, Andreas Baer wrote:
>> >>>>>> Hello,
>> >>>>>>
>> >>>>>> from a set of benchmark tests with large systems using Gromacs
>> >>>>>> versions 2019.5 and 2020, I obtained some unexpected results:
>> >>>>>> with the same set of parameters and the 2020 version, I obtain a
>> >>>>>> performance that is about 2/3 of the 2019.5 version. Interestingly,
>> >>>>>> according to nvidia-smi, the GPU usage is about 20% higher for the
>> >>>>>> 2020 version.
>> >>>>>> Also from the log files it seems that the 2020 version does the
>> >>>>>> computations more efficiently, but spends so much more time waiting
>> >>>>>> that the overall performance drops.
>> >>>>>>
>> >>>>>> Some background info on the benchmarks:
>> >>>>>> - System contains about 2.1 million atoms.
>> >>>>>> - Hardware: 2x Intel Xeon Gold 6134 („Skylake“) @ 3.2 GHz = 16 cores
>> >>>>>>   + SMT; 4x NVIDIA Tesla V100
>> >>>>>>   (similar results, with a less significant performance drop (~15%),
>> >>>>>>   on a different machine: 2 or 4 nodes, each with [2x Intel Xeon
>> >>>>>>   2660v2 („Ivy Bridge“) @ 2.2 GHz = 20 cores + SMT; 2x NVIDIA Kepler K20])
>> >>>>>> - Several options for -ntmpi, -ntomp, -bonded, -pme are used to find
>> >>>>>>   the optimal set. However, the performance drop seems to be
>> >>>>>>   persistent for all such options.
>> >>>>>>
>> >>>>>> Two representative log files are attached.
>> >>>>>> Does anyone have an idea where this drop comes from, and how to
>> >>>>>> choose the parameters for the 2020 version to circumvent this?
>> >>>>>>
>> >>>>>> Regards,
>> >>>>>> Andreas
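
To make the suggestions at the top concrete, here is a minimal sketch of what I mean. The .mdp option names and mdrun flags are the standard ones; the numerical values are only placeholders picked for illustration and need to be adapted to your cut-offs and accuracy requirements, they are not tuned recommendations:

    ; .mdp sketch
    nstcalcenergy           = 10      ; even ~10 removes most of the regression; the default is 100
    verlet-buffer-tolerance = 0.005   ; default; use this instead of a manual rlist so the dual pair-list setup is enabled
    fourier-spacing         = 0.16    ; illustrative only: scale up from the 0.12 nm default roughly in proportion to the longer cut-offs

and, for a single-rank run with the bonded work moved back to the CPU (assuming -update gpu remains compatible with the rest of your input):

    gmx mdrun -ntmpi 1 -nb gpu -pme gpu -bonded cpu -update gpu
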
--
Gromacs Users mailing list

* Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to [email protected].
