Re: [gmx-users] GROMACS performance issues on POWER9/V100 node

2020-04-24 Thread Szilárd Páll
Hi,

Affinity setting on the Talos II with Ubuntu 18.04 (kernel 5.0) works fine.
I get threads pinned where they should be (hwloc confirmed) and consistent
results. I also get reasonable thread placement even without pinning (i.e.
the kernel scatters first until #threads <= #hwthreads). I see only a minor
penalty to not pinning -- not too surprising given that I have a single
NUMA node and the kernel is doing its job.

Here are my quick test results, run on an 8-core Talos II Power9 + a
GPU, using the adh_cubic input:

$ grep Perf *.log
test_1x1_rep1.log:Performance:   16.617
test_1x1_rep2.log:Performance:   16.479
test_1x1_rep3.log:Performance:   16.520
test_1x2_rep1.log:Performance:   32.034
test_1x2_rep2.log:Performance:   32.389
test_1x2_rep3.log:Performance:   32.340
test_1x4_rep1.log:Performance:   62.341
test_1x4_rep2.log:Performance:   62.569
test_1x4_rep3.log:Performance:   62.476
test_1x8_rep1.log:Performance:   97.049
test_1x8_rep2.log:Performance:   96.653
test_1x8_rep3.log:Performance:   96.889


This seems to point towards some issue with the OS or setup on the IBM
machines you have -- and the unit test error may be one of the symptoms
(as it suggests something is off with the hardware topology and a NUMA
node is missing from it). I'd still suggest checking whether a full node
allocation, with all threads, memory, etc. passed to the job, results in
successful affinity setting i) in mdrun and ii) in some other tool.
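For example, a minimal sketch of such a check (assuming Slurm and hwloc are
available; the scheduler options and file names are placeholders and depend
on your site configuration):

    # grab a full node with all cores, all memory and a GPU
    salloc --nodes=1 --exclusive --mem=0 --gres=gpu:1
    # i) let mdrun pin its threads and inspect the pinning report in md.log
    gmx mdrun -ntmpi 1 -ntomp 8 -pin on -deffnm test
    # ii) check that an external tool can set and report a binding per task
    srun --cpu-bind=cores hwloc-bind --get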

Please update this thread if you have further findings.

Cheers,
--
Szilárd


On Fri, Apr 24, 2020 at 10:52 PM Szilárd Páll 
wrote:

>
> The following lines are found in md.log for the POWER9/V100 run:
>>
>> Overriding thread affinity set outside gmx mdrun
>> Pinning threads with an auto-selected logical core stride of 128
>> NOTE: Thread affinity was not set.
>>
>> The full md.log is available here:
>> https://github.com/jdh4/running_gromacs/blob/master/03_benchmarks/md.log
>
>
> I glanced over that at first, will see if I can reproduce it, though I
> only have access to a Raptor Talos, not an IBM machine with Ubuntu.
>
> What OS are you using?
>

Re: [gmx-users] GROMACS performance issues on POWER9/V100 node

2020-04-24 Thread Szilárd Páll
> The following lines are found in md.log for the POWER9/V100 run:
>
> Overriding thread affinity set outside gmx mdrun
> Pinning threads with an auto-selected logical core stride of 128
> NOTE: Thread affinity was not set.
>
> The full md.log is available here:
> https://github.com/jdh4/running_gromacs/blob/master/03_benchmarks/md.log


I glanced over that at first, will see if I can reproduce it, though I only
have access to a Raptor Talos, not an IBM machine with Ubuntu.

What OS are you using?




Re: [gmx-users] GROMACS performance issues on POWER9/V100 node

2020-04-24 Thread Szilárd Páll
On Fri, Apr 24, 2020 at 5:55 AM Alex  wrote:

> Hi Kevin,
>
> We've been having issues with Power9/V100 very similar to what Jon
> described and basically settled on what I believe is sub-par
> performance. We tested it on systems with ~30-50K particles and threads
> simply cannot be pinned.


What does that mean, and how did you verify it?
The Linux kernel can in general set affinities on ppc64el, whether that is
requested by mdrun or some other tool, so if you have observed that the
affinity mask is not respected (or does not change), that is more likely an
OS / setup issue, I'd think.

What is different compared to x86 is that the hardware thread layout is
different on Power9 (with default Linux kernel configs) and hardware
threads are exposed as consecutive "CPUs" by the OS rather than strided by
#cores.
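To illustrate with a sketch (assuming the default consecutive SMT4
enumeration and an 8-core node; adjust the counts to your hardware): CPUs
0-3 belong to core 0, CPUs 4-7 to core 1, and so on, so placing one thread
per physical core means pinning with a stride of 4:

    gmx mdrun -ntmpi 1 -ntomp 8 -pin on -pinoffset 0 -pinstride 4 -deffnm test

whereas on a typical x86 node the second hardware thread of each core is
numbered after all the cores, so the same one-thread-per-core placement
corresponds to consecutive CPU indices (stride 1).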

I could try to sum up some details on how to set affinities (with mdrun or
external tools), if that is of interest. However, it really should be
possible to do even via the job scheduler (along with a reasonable system
configuration).


> As far as Gromacs is concerned, our brand-new
> Power9 nodes operate as if they were based on Intel CPUs (two threads
> per core)


Unless the hardware thread layout has been changed, that's perhaps not the
case, see above.


> and zero advantage of IBM parallelization is being taken.
>

You mean the SMT4?


> Other users of the same nodes reported similar issues with other
> software, which to me suggests that our sysadmins don't really know how
> to set these nodes up.
>
> At this point, if someone could figure out a clear set of build
> instructions in combination with slurm/mdrun inputs, it would be very
> much appreciated.
>

Have you checked the public documentation on ORNL's sites? GROMACS has been
used successfully on Summit. What about IBM support?

--
Szilárd


>
> Alex
>
> On 4/23/2020 9:37 PM, Kevin Boyd wrote:
> > I'm not entirely sure how thread-pinning plays with slurm allocations on
> > partial nodes. I always reserve the entire node when I use thread
> pinning,
> > and run a bunch of simulations by pinning to different cores manually,
> > rather than relying on slurm to divvy up resources for multiple jobs.
> >
> > Looking at both logs now, a few more points
> >
> > * Your benchmarks are short enough that little things like cores spinning
> > up frequencies can matter. I suggest running longer (increase nsteps in
> the
> > mdp or at the command line), and throwing away your initial benchmark
> data
> > (see -resetstep and -resethway) to avoid artifacts
> > * Your benchmark system is quite small for such a powerful GPU. I might
> > expect better performance running multiple simulations per-GPU if the
> > workflows being run can rely on replicates, and a larger system would
> > probably scale better to the V100.
> > * The P100/intel system appears to have pinned cores properly, it's
> > unclear whether it had a real impact on these benchmarks
> > * It looks like the CPU-based computations were the primary contributors
> to
> > the observed difference in performance. That should decrease or go away
> > with increased core counts and shifting the update phase to the GPU. It
> may
> > be (I have no prior experience to indicate either way) that the intel
> cores
> > are simply better on a 1-1 basis than the Power cores. If you have 4-8
> > cores per simulation (try -ntomp 4 and increasing the allocation of your
> > slurm job), the individual core performance shouldn't matter too
> > much, you're just certainly bottlenecked on one CPU core per GPU, which
> can
> > emphasize performance differences..
> >
> > Kevin
> >
> > On Thu, Apr 23, 2020 at 6:43 PM Jonathan D. Halverson <
> > halver...@princeton.edu> wrote:
> >
> >> Hi Kevin,
> >>
> >> md.log for the Intel run is here:
> >>
> >>
> https://github.com/jdh4/running_gromacs/blob/master/03_benchmarks/md.log.intel-broadwell-P100
> >>
> >> Thanks for the info on constraints with 2020. I'll try some runs with
> >> different values of -pinoffset for 2019.6.
> >>
> >> I know a group at NIST is having the same or similar problems with
> >> POWER9/V100.
> >>
> >> Jon
> >> 
> >> From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se <
> >> gromacs.org_gmx-users-boun...@maillist.sys.kth.se> on behalf of Kevin
> >> Boyd 
> >> Sent: Thursday, April 23, 2020 9:08 PM
> >> To: gmx-us...@gromacs.org 
> >> Subject: Re: [gmx-users] GROMACS performance issues on POWER9/V100 node
> >>
> >> Hi,
> >>
> >> Can you post the full log for the Intel system? I typically find the
> real
> >> cycle and time accounting section a better place to start debugging
> >> performance issues.
> >>
> >> A couple quick notes, but need a side-by-side comparison for more useful
> >> analysis, and these points may apply to both systems so may not be your
> >> root cause:
> >> * At first glance, your Power system spends 1/3 of its time in
> constraint

Re: [gmx-users] GROMACS performance issues on POWER9/V100 node

2020-04-24 Thread Szilárd Páll
Using a single thread per GPU, as the linked log files show, is not
sufficient for GROMACS (and any modern machine should have more than that
anyway), but I infer from your mail that this was only meant to debug the
performance instability?

Your performance variations with Power9 may be related to the fact that you
are either not setting affinities or the affinity settings are not correct.
However, you also have a job scheduler in the way (which I suspect is either
not configured well or is not passed the required options to correctly
assign resources to jobs); it obfuscates the machine layout and makes things
look weird to mdrun [1].

I suggest simplifying the problem and debugging it step by step. Start
with allocating full nodes and test that you can pin (either with mdrun
-pin on or hwloc) while avoiding [1]; then work out what you should
expect from the node sharing that seems not to work correctly. Building
GROMACS with hwloc may help, as you get better reporting in the log.
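For instance, a sketch of the hwloc part (assuming the hwloc development
files are installed; add the flag to whatever configuration you otherwise
use):

    cmake .. -DGMX_HWLOC=ON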

[1]
https://github.com/jdh4/running_gromacs/blob/master/03_benchmarks/md.log.intel-broadwell-P100#L58

--
Szilárd


On Fri, Apr 24, 2020 at 3:43 AM Jonathan D. Halverson <
halver...@princeton.edu> wrote:

> Hi Kevin,
>
> md.log for the Intel run is here:
>
> https://github.com/jdh4/running_gromacs/blob/master/03_benchmarks/md.log.intel-broadwell-P100
>
> Thanks for the info on constraints with 2020. I'll try some runs with
> different values of -pinoffset for 2019.6.
>
> I know a group at NIST is having the same or similar problems with
> POWER9/V100.
>
> Jon
> 
> From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se <
> gromacs.org_gmx-users-boun...@maillist.sys.kth.se> on behalf of Kevin
> Boyd 
> Sent: Thursday, April 23, 2020 9:08 PM
> To: gmx-us...@gromacs.org 
> Subject: Re: [gmx-users] GROMACS performance issues on POWER9/V100 node
>
> Hi,
>
> Can you post the full log for the Intel system? I typically find the real
> cycle and time accounting section a better place to start debugging
> performance issues.
>
> A couple quick notes, but need a side-by-side comparison for more useful
> analysis, and these points may apply to both systems so may not be your
> root cause:
> * At first glance, your Power system spends 1/3 of its time in constraint
> calculation, which is unusual. This can be reduced 2 ways - first, by
> adding more CPU cores. It doesn't make a ton of sense to benchmark on one
> core if your applications will use more. Second, if you upgrade to Gromacs
> 2020 you can probably put the constraint calculation on the GPU with
> -update GPU.
> * The Power system log has this line:
>
> https://github.com/jdh4/running_gromacs/blob/master/03_benchmarks/md.log#L304
> indicating
> that threads perhaps were not actually pinned. Try adding -pinoffset 0 (or
> some other core) to specify where you want the process pinned.
>
> Kevin
>
> On Thu, Apr 23, 2020 at 9:40 AM Jonathan D. Halverson <
> halver...@princeton.edu> wrote:
>
> > We are finding that GROMACS (2018.x, 2019.x, 2020.x) performs worse on an
> > IBM POWER9/V100 node versus an Intel Broadwell/P100. Both are running
> RHEL
> > 7.7 and Slurm 19.05.5. We have no concerns about GROMACS on our Intel
> > nodes. Everything below is about the POWER9/V100 node.
> >
> > We ran the RNASE benchmark with 2019.6 with PME and cubic box using 1
> > CPU-core and 1 GPU (
> > ftp://ftp.gromacs.org/pub/benchmarks/rnase_bench_systems.tar.gz) and
> > found that the Broadwell/P100 gives 144 ns/day while POWER9/V100 gives
> 102
> > ns/day. The difference in performance is roughly the same for the larger
> > ADH benchmark and when different numbers of CPU-cores are used. GROMACS
> is
> > always underperforming on our POWER9/V100 nodes. We have pinning turned
> on
> > (see Slurm script at bottom).
> >
> > Below is our build procedure on the POWER9/V100 node:
> >
> > version_gmx=2019.6
> > wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-${version_gmx}.tar.gz
> > tar zxvf gromacs-${version_gmx}.tar.gz
> > cd gromacs-${version_gmx}
> > mkdir build && cd build
> >
> > module purge
> > module load rh/devtoolset/7
> > module load cudatoolkit/10.2
> >
> > OPTFLAGS="-Ofast -mcpu=power9 -mtune=power9 -mvsx -DNDEBUG"
> >
> > cmake3 .. -DCMAKE_BUILD_TYPE=Release \
> > -DCMAKE_C_COMPILER=gcc -DCMAKE_C_FLAGS_RELEASE="$OPTFLAGS" \
> > -DCMAKE_CXX_COMPILER=g++ -DCMAKE_CXX_FLAGS_RELEASE="$OPTFLAGS" \
> > -DGMX_BUILD_MDRUN_ONLY=OFF -DGMX_MPI=OFF -DGMX_OPENMP=ON \
> > -DGMX_SIMD=IBM_VSX -DGMX_DOUBLE=OFF \
> > -DGMX_BUILD_OWN_FFTW=ON \
> > -DGMX_GPU=ON -DGMX_CUDA_TARGET_SM=70 \
> > -DGMX_OPENMP_MAX_THREADS=128 \
> > -DCMAKE_INSTALL_PREFIX=$HOME/.local \
> > -DGMX_COOL_QUOTES=OFF -DREGRESSIONTEST_DOWNLOAD=ON
> >
> > make -j 10
> > make check
> > make install
> >
> > 45 of the 46 tests pass with the exception being HardwareUnitTests. There
> > are several posts about this and apparently it is not a concern. The full
> > build log is 

Re: [gmx-users] Spec'ing for new machines (again!)

2020-04-21 Thread Szilárd Páll
Hi,

Note that the new-generation Ryzen2-based CPUs perform even better than
those we benchmarked for that paper. The 3900-series Threadrippers are great
for workstations; unless you need the workstation form factor, though, you
are better off with servers like the TYAN GA88B8021, in which case an
EPYC 1P is what you should be looking at, e.g. the EPYC 7402P.

GPU-wise, depending on your timeline you can expect NVIDIA to release
something this year, so you may want to time the purchase to either get the
current-gen GPUs at a discount or go for the next gen.

Cheers,
--
Szilárd


On Sat, Apr 18, 2020 at 7:56 AM Alex  wrote:

>  From what I am seeing in this paper, we should just go with something
> along the lines of Ryzen 1950x + 4 x 2080 or 2080Ti. There are
> indications that Ryzen also rips (pun intended) Xeons in DFT/CC
> calculations, so overall this combination sounds quite reasonable for
> our purposes. This also outlines a path towards upgrading our existing
> nodes.
>
> Thanks Kevin, this is very informative.
>
> Alex
>
> On 4/17/2020 8:52 PM, Kevin Boyd wrote:
> > Yes, that's it. I think consumer-class Nvidia cards are still the best
> > value, unless you have other applications that need the benefits of
> > industrial cards.
> >
> > Cheers,
> >
> > Kevin
> >
> > On Fri, Apr 17, 2020 at 5:58 PM Alex  wrote:
> >
> >> Hi Kevin,
> >>
> >> Thanks for responding! RAM is mentioned for completeness. This amount is
> >> not for gmx -- we do other, much more RAM-intensive calculations, in
> >> fact 256 gigs is very modest for that. I have no doubt that AMD CPUs
> >> would work with Nvidia GPUs, was mainly wondering if there would be any
> >> additional speed boost if we also used AMD GPUs.
> >>
> >> Nope, haven't seen the paper, but quite interested in checking it out.
> >> Is this the latest version?
> >> https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.26011
> >>
> >> Thank you,
> >>
> >> Alex
> >>
> >> On 4/17/2020 6:29 PM, Kevin Boyd wrote:
> >>> Hi,
> >>>
> >>> AMD CPUs work fine with Nvidia GPUs, so feel free to use AMD as a base
> >>> regardless of the GPUs you end up choosing. In my experience AMD CPUs
> >> have
> >>> had great value.
> >>>
> >>> A ratio of ~4 cores/ GPU shouldn't be a problem. 256 GB of RAM is very
> >> much
> >>> overkill, but perhaps you have other uses for the machine that need it.
> >>>
> >>> Have you seen the updated "More bang for your buck" paper that goes
> into
> >>> optimizing compute nodes?
> >>>
> >>>
> >>> See
> >>>
> >>>
> >>>
> >>> On Fri, Apr 17, 2020 at 4:17 PM FAISAL NABI  wrote:
> >>>
>  Gromacs is totally compatible with nvidia based gpu. You need to
> install
>  cuda drivers and you can build easily with cmake. For amd gpu you
> would
> >> be
>  needing openCL alongwith the sdk for amd gpu. I would suggest you to
> use
>  nvidia acceleration for better performance.
> 
>  On Sat, Apr 18, 2020, 4:42 AM Alex  wrote:
> 
> > Hello all,
> >
> > Hope you're staying safe & healthy! We are starting to spec new
> >> machines
> > and our end goal is two machines, each featuring:
> >
> > 1. ~16-18 CPU cores w/hyperthreading
> >
> > 2. Four GPUs
> >
> > 3. ~256 gigs of RAM.
> >
> > A very approximate allocation of money is ~$15-20K per unit, but we
> > could of course buy more if each machine turns out to be
> significantly
> > cheaper. All suggestions for CPUs and GPUs (esp. from Szilard) are
> > welcome. We are somewhat open to AMD-based solutions, but wonder what
> > the situation is with GPU acceleration, as so far we've been entirely
> > Intel-based. Will it work with NVIDIA cards? Will we have to install
> >> AMD
> > GPUs? Does current Gromacs perform well on AMD-based rigs?
> >
> > Thank you!
> >
> > Alex
> >

Re: [gmx-users] Disabling MKL

2020-04-21 Thread Szilárd Páll
Configure with -DGMX_EXTERNAL_BLAS=OFF -DGMX_EXTERNAL_LAPACK=OFF
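Applied to the configure command you posted, that would look something like
this (a sketch; you may also need to start from a clean build directory so
that the cached MKL detection results are discarded):

    cmake .. -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=on -DGMX_FFT_LIBRARY=fftw3 \
          -DGMX_EXTERNAL_BLAS=OFF -DGMX_EXTERNAL_LAPACK=OFF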

Cheers,
--
Szilárd


On Fri, Apr 17, 2020 at 2:07 PM Mahmood Naderan 
wrote:

> Hi
> How can I disable MKL while building gromacs? With this configure command
>
> cmake .. -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=on  -DGMX_FFT_LIBRARY=fftw3
>
>
>
> I see
>
> -- The GROMACS-managed build of FFTW 3 will configure with the following
> optimizations: --enable-sse2;--enable-avx;--enable-avx2
> -- Using external FFT library - FFTW3 build managed by GROMACS
> -- Looking for sgemm_
> -- Looking for sgemm_ - not found
> -- Looking for sgemm_
> -- Looking for sgemm_ - found
> -- Found BLAS:
> /share/binary/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so;/share/binary/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_thread.so;/share/binary/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_core.so;/opt/intel/lib/intel64/libguide.so;-lpthread;-lm;-ldl
>
>
>
>
>
> Then I get these errors
>
> [100%] Linking CXX executable ../../bin/gmx
> [100%] Linking CXX executable ../../bin/template
> /bin/ld: warning: libmkl_intel_lp64.so, needed by
> ../../lib/libgromacs.so.4.0.0, not found (try using -rpath or -rpath-link)
> /bin/ld: warning: libmkl_intel_thread.so, needed by
> ../../lib/libgromacs.so.4.0.0, not found (try using -rpath or -rpath-link)
> /bin/ld: warning: libmkl_core.so, needed by ../../lib/libgromacs.so.4.0.0,
> not found (try using -rpath or -rpath-link)
> /bin/ld: warning: libguide.so, needed by ../../lib/libgromacs.so.4.0.0,
> not found (try using -rpath or -rpath-link)
> ../../lib/libgromacs.so.4.0.0: undefined reference to `ssteqr_'
> ../../lib/libgromacs.so.4.0.0: undefined reference to `dsteqr_'
> ../../lib/libgromacs.so.4.0.0: undefined reference to `sger_'
>
>
>
>
>
>
> Regards,
> Mahmood

Re: [gmx-users] Multi-GPU optimization, "DD without halo exchange is not supported"

2020-04-06 Thread Szilárd Páll
On Fri, Mar 27, 2020 at 8:30 PM Leandro Bortot 
wrote:

> Dear users,
>
>  I'm trying to optimize the execution of a system composed by 10
> million atoms on a multi-GPU machine with GROMACS 2020.1.
>  I've followed the instructions given at
>
> https://devblogs.nvidia.com/creating-faster-molecular-dynamics-simulations-with-gromacs-2020/
> . However, when I run my simulation, mdrun tells me this:
>
> "
>
> Update task on the GPU was required, by the GMX_FORCE_UPDATE_DEFAULT_GPU
> environment variable, but the following condition(s) were not satisfied:
> Domain decomposition without GPU halo exchange is not supported.
> "
>
>  My understanding was that exporting *GMX_GPU_DD_COMMS=true* would
> enable such halo communications.
>  My simulation runs, however the performance is not scaling well with
> the number of GPUs.
>
>  I've done the following:
> "
>
> *export GMX_GPU_DD_COMMS=trueexport GMX_GPU_PME_PP_COMMS=trueexport
> GMX_FORCE_UPDATE_DEFAULT_GPU=true*"
>

You are getting the error below because you did not export all three
variables there. Those exports are separate commands and need to be issued
with some separator, e.g. semicolon or newline.
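That is, something like:

    export GMX_GPU_DD_COMMS=true
    export GMX_GPU_PME_PP_COMMS=true
    export GMX_FORCE_UPDATE_DEFAULT_GPU=true

(or the three exports on one line, separated by ';').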


>  And my execution line is:
> "*mpirun -np 4 gmx_mpi mdrun -s eq.1.tpr -v -deffnm eq.1 -pin on -ntomp 6
> -npme 1 -nb gpu -bonded gpu -pme gpu -nstlist 400*"
>

Beware that this command line is specific for one use-case presented in
your source (i.e. that very hardware and input system) and may not be fully
transferable (e.g. "-ntomp 6" or "-nstlist 400").

Cheers,
--
Szilárd


>
>  If I add "-update gpu" to this same line I have the following error:
> "
>
>
>
>
> Inconsistency in user input:
> Update task on the GPU was required,
> but the following condition(s) were not satisfied:
> Domain decomposition without GPU halo exchange is not supported.
> With separate PME rank(s), PME must use direct communication."
>
>  Also, I'm using constraints = h-bonds in my .mdp file.
>
>  Am I doing something wrong?
>
> Thank you for your attention,
> Leandro
> ---
> Leandro Oliveira Bortot
> Postdoctoral Fellow
>  https://www.linkedin.com/in/leandro-obt/
> Laboratory of Computational Biology
> Brazilian Biosciences National Laboratory (LNBio)
> Brazilian Center for Research in Energy and Materials (CNPEM)
> Zip Code 13083-970, Campinas, São Paulo, Brazil.

Re: [gmx-users] Unable to compile GROMACS 2020.1 using GNU 7.5.0

2020-04-06 Thread Szilárd Páll
On Sat, Apr 4, 2020 at 10:41 PM Wei-Tse Hsu  wrote:

> Dear gmx users,
> Recently I've been trying to install GROMACS 2020.1. However, I encounter a
> compilation error while using the make command. The error is as follows:
>
>
>
>
>
> /usr/bin/ld: cannot find /lib/libpthread.so.0
> /usr/bin/ld: cannot find /usr/lib/libpthread_nonshared.a
> collect2: error: ld returned 1 exit status
> make[2]: *** [lib/libgromacs.so.5.0.0] Error 1
> make[1]: *** [src/gromacs/CMakeFiles/libgromacs.dir/all] Error 2
> make: *** [all] Error 2
> From the error above, it seemed that GROMACS was unable to find library
> like libpthread.so.0 and libpthread_nonshared.a. After checking I found
> that instead of in /lib/ and /usr/lib/, the files libpthread.so.0 and
> libpthread_nonshared.a are in /lib/x86_64-linux-gnu and /usr/lib/
> x86_64-linux-gnu. Therefore, I used the option DCMAKE_PREFIX_PATH to add
> the paths of the libraries for GROMACS to search for. Specifically, the
> command I was using is:
> *tar xfz gromacs-2020.1.tar.gz && cd gromacs-2020.1 && mkdir build && cd
> build && rm -rf * && cmake .. -DREGRESSIONTEST_DOWNLOAD=ON
> -DCMAKE_CXX_COMPILER=g++-7 -DCMAKE_C_COMPILER=gcc-7
> -DCMAKE_PREFIX_PATH=/usr/lib/x86_64-linux-gnu:/lib/x86_64-linux-gnu && make
>

As Kevin noted, the separator in the prefix path should be ";".

However, it seems unnecessary to pass system libraries as custom paths to
cmake. Try first without those.
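That is, something like this (a sketch based on your command; the prefix
path is only a fallback in case the plain configure still cannot find
libpthread):

    cmake .. -DREGRESSIONTEST_DOWNLOAD=ON \
          -DCMAKE_CXX_COMPILER=g++-7 -DCMAKE_C_COMPILER=gcc-7
    # only if really needed, note the ";" separator:
    #   "-DCMAKE_PREFIX_PATH=/usr/lib/x86_64-linux-gnu;/lib/x86_64-linux-gnu"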

--
Szilárd



> > gmx_compile.log*
> However, it turned out that I still got the same error. I'm confused right
> now since I thought that gromacs should be able to find the files. I'm
> wondering if I missed something. Could someone please tell me what I can do
> or share some relevant experience? Any help is much appreciated!
>
> Best,
> Wei-Tse

Re: [gmx-users] replica exchange simulations performance issues.

2020-03-31 Thread Szilárd Páll
On Tue, Mar 31, 2020 at 1:45 AM Miro Astore  wrote:

> I got up to 25-26 ns/day with my 4 replica system  (same logic scaled
> up to 73 replicas) which I think is reasonable. Could I do better?
>

Hard to say without complete log file. Please share single run and multi
run log files.


>
> mpirun -np 48 gmx_mpi mdrun  -ntomp 1 -v -deffnm memb_prod1 -multidir
> 1 2 3 4 -replex 1000
>
>  I have tried following the manual but I don't think i'm going it
> right I keep getting errors. If you have a minute to suggest how I
> could do this I would appreciate that.
>

Again, the exact error messages and associated command line/log are
necessary to be able to give further suggestions.

--
Szilárd


>
> log file accounting:
> R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
> On 12 MPI ranks
>
>  Computing:              Num   Num      Call    Wall time    Giga-Cycles
>                          Ranks Threads  Count      (s)        total sum    %
> -----------------------------------------------------------------------------
>  Domain decomp.            12    1      26702     251.490      8731.137   1.5
>  DD comm. load             12    1      25740       1.210        42.003   0.0
>  DD comm. bounds           12    1      26396       9.627       334.238   0.1
>  Neighbor search           12    1      25862     283.564      9844.652   1.7
>  Launch GPU ops.           12    1    5004002     343.309     11918.867   2.0
>  Comm. coord.              12    1    2476139     508.526     17654.811   3.0
>  Force                     12    1    2502001     419.341     14558.495   2.5
>  Wait + Comm. F            12    1    2502001     347.752     12073.100   2.1
>  PME mesh                  12    1    2502001   11721.893    406955.915  69.2
>  Wait Bonded GPU           12    1       2503       0.008         0.285   0.0
>  Wait GPU NB nonloc.       12    1    2502001      48.918      1698.317   0.3
>  Wait GPU NB local         12    1    2502001      19.475       676.141   0.1
>  NB X/F buffer ops.        12    1    9956280     753.489     26159.337   4.5
>  Write traj.               12    1        519       1.078        37.427   0.0
>  Update                    12    1    2502001     434.272     15076.886   2.6
>  Constraints               12    1    2502001     701.800     24364.800   4.1
>  Comm. energies            12    1     125942      36.574      1269.776   0.2
>  Rest                                            1047.855     36378.988   6.2
> -----------------------------------------------------------------------------
>  Total                                          16930.182    587775.176 100.0
> -----------------------------------------------------------------------------
>
>  Breakdown of PME mesh computation
> -----------------------------------------------------------------------------
>  PME redist. X/F           12    1    5004002    1650.247     57292.604   9.7
>  PME spread                12    1    2502001    4133.126    143492.183  24.4
>  PME gather                12    1    2502001    2303.327     79965.968  13.6
>  PME 3D-FFT                12    1    5004002    2119.410     73580.828  12.5
>  PME 3D-FFT Comm.          12    1    5004002     918.318     31881.804   5.4
>  PME solve Elec            12    1    2502001     584.446     20290.548   3.5
> -----------------------------------------------------------------------------
>
> Best, Miro
>
> On Tue, Mar 31, 2020 at 9:58 AM Szilárd Páll 
> wrote:
> >
> > On Sun, Mar 29, 2020 at 3:56 AM Miro Astore 
> wrote:
> >
> > > Hi everybody. I've been experimenting with REMD for my system running
> > > on 48 cores with 4 gpus (I will need to scale up to 73 replicas
> > > because this is a complicated system with many DOF I'm open to being
> > > told this is all a silly idea).
> > >
> >
> > It is a bad idea, you should have at least 1 physical core per replica
> and
> > with a large system ideally more.
> > However, if you are going for high efficiency (aggregate ns/day per
> phyical
> > node), always put at least 2 replicas per GPU.
> >
> >
> > >
> > > My run configuration is
> > > mpirun -np 4 --map-by numa gmx_mpi mdrun -cpi memb_prod1.cpt -ntomp 11
> > > -v -deffnm memb_prod1 -multidir 1 2 3 4 -replex 1000
> > >
> > > the best I can squeeze out of this is 9ns/day. In a non-replica
> > > simulation I can hit 50ns/day with a single GPU and 12 cores.
> > >
> >
> > That is abnormal and indicates that:
> > - either something is wrong with the hardware mapping / assignment in
> your
> > run or; do use simply "-pin on" and let mdrun manage threads pinning
> (that
> > map-by-numa is certainly not optimal); also I advise against tweaking the
> > thread count and using weird numbers like 11 (just use quarter);
> > - your exchange overhead is very high (check the communication cost in
> the
> > log)
> >
> > If you share some log files of a standalone and a replex run, we can
> advise
> > where the performance loss comes from.
> >
> > Cheers,
> > --
> > Szilárd
> >
> > Looking at my accounting, for a single replica 52% of time is being
> > > spent on the "Force" category with 92% of my Mflops going into NxN
> > > Ewald Elec. + LJ [F]
> > >
> >
> > > I'm wondering what I could do to reduce this bottle neck if anything.
> > >
> > > Thank you.
> > > --
> &g

Re: [gmx-users] replica exchange simulations performance issues.

2020-03-30 Thread Szilárd Páll
On Sun, Mar 29, 2020 at 3:56 AM Miro Astore  wrote:

> Hi everybody. I've been experimenting with REMD for my system running
> on 48 cores with 4 gpus (I will need to scale up to 73 replicas
> because this is a complicated system with many DOF I'm open to being
> told this is all a silly idea).
>

It is a bad idea: you should have at least 1 physical core per replica, and
with a large system ideally more.
However, if you are going for high efficiency (aggregate ns/day per physical
node), always put at least 2 replicas per GPU.


>
> My run configuration is
> mpirun -np 4 --map-by numa gmx_mpi mdrun -cpi memb_prod1.cpt -ntomp 11
> -v -deffnm memb_prod1 -multidir 1 2 3 4 -replex 1000
>
> the best I can squeeze out of this is 9ns/day. In a non-replica
> simulation I can hit 50ns/day with a single GPU and 12 cores.
>

That is abnormal and indicates that:
- either something is wrong with the hardware mapping / assignment in your
run; simply use "-pin on" and let mdrun manage thread pinning (that
map-by-numa is certainly not optimal), and I also advise against tweaking
the thread count to odd numbers like 11 (just use a quarter of the cores;
see the sketch below);
- or your exchange overhead is very high (check the communication cost in
the log).
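As a sketch of what I mean (4 replicas on your 48 cores / 4 GPUs, i.e. 12
cores per replica; names taken from your command line):

    mpirun -np 4 gmx_mpi mdrun -cpi memb_prod1.cpt -ntomp 12 -pin on \
           -v -deffnm memb_prod1 -multidir 1 2 3 4 -replex 1000

With "-pin on" mdrun itself places and pins the threads of the four ranks,
so the "--map-by numa" option should be dropped.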

If you share some log files of a standalone and a replex run, we can advise
where the performance loss comes from.

Cheers,
--
Szilárd

Looking at my accounting, for a single replica 52% of time is being
> spent on the "Force" category with 92% of my Mflops going into NxN
> Ewald Elec. + LJ [F]
>

> I'm wondering what I could do to reduce this bottle neck if anything.
>
> Thank you.
> --
> Miro A. Astore   (he/him)
> PhD Candidate | Computational Biophysics
> Office 434 A28 School of Physics
> University of Sydney

Re: [gmx-users] [gmx-developers] The setup of gpu_id has a bug?

2020-03-10 Thread Szilárd Páll
Hi,

Please use the users' mailing list for questions not related to GROMACS
development.

By default, the "-gpu_id" option takes a sequence of digits corresponding
to the numeric identifiers of GPUs. In cases where there are >10 GPUs in a
system, a comma-separated string should be used, see
http://manual.gromacs.org/documentation/current/user-guide/mdrun-performance.html#running-mdrun-within-a-single-node
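For example (a sketch reusing your command line): without commas,
"-gpu_id 10" is parsed as the two single-digit IDs 1 and 0, so either list
the IDs comma-separated or hide the other devices from GROMACS:

    # comma-separated IDs (needed once IDs reach 10), e.g. two ranks on GPUs 10 and 11:
    gmx mdrun -deffnm test -ntmpi 2 -ntomp 8 -gpu_id 10,11
    # or expose only the desired device and let the default mapping use it:
    CUDA_VISIBLE_DEVICES=10 gmx mdrun -deffnm test -ntmpi 1 -ntomp 8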

Cheers,
--
Szilárd


On Tue, Mar 10, 2020 at 5:51 PM dxli75  wrote:

> everyone,
>When I set the gpu_id as 10-15, the GMX always give the message of 
> PP:1,PME:1.
> And the job run on GPU No.1 instead of No.10-15.
>Is there a bug?
>
> GROMACS:  gmx mdrun, version 2020.1
>
> Executable:
> /raid/data/dxli/projects/ncovs-1rpb/../../gromacs-2020.1/bin/gmx
> Data prefix:  /raid/data/dxli/projects/ncovs-1rpb/../../gromacs-2020.1
> Working dir:  /raid/data/dxli/projects/ncovs-1rpb
> Command line:
>   gmx mdrun -deffnm test -gpu_id 10 -ntmpi 1 -ntomp 8
>
> Reading file test.tpr, VERSION 2020 (single precision)
> 1 GPU selected for this run.
> Mapping of GPU IDs to the 2 GPU tasks in the 1 rank on this node:
>   PP:1,PME:1
> PP tasks will do (non-perturbed) short-ranged interactions on the GPU
> PP task will update and constrain coordinates on the CPUPME tasks will do
> all aspects on the GPU
> Using 1 MPI thread
>
> Non-default thread affinity set, disabling internal thread affinity
>
> Using 8 OpenMP threads
>
>
>
>
>
>
> --
> *Daixi Li  *
> *Ph.D., Associated Professor**P.I.*, Laboratory of Computational Biology
>
>
>

Re: [gmx-users] Performance issues with Gromacs 2020 on GPUs - slower than 2019.5

2020-03-09 Thread Szilárd Páll
Hi Andreas,

Sorry for the delay.

I can confirm the regression. This affects the energy calculation steps,
where the GPU bonded computation did get significantly slower (as a
side-effect of optimizations that mainly targeted the force-only kernels).

Can you please file an issue on redmine.gromacs.org and upload the data you
shared with me?

As a workaround you should consider using nstcalcenergy > 1; bumping it to
just ~10 would eliminate most of the regression and would improve the
performance of other computation too (the nonbonded F-only kernels are also
at least 1.5x faster than the force+energy kernels).
Alternatively, I recall you had a decent CPU, so you could run the bonded
interactions on the CPU.
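Concretely, the two workarounds would look something like this (a sketch;
the exact nstcalcenergy value is up to you):

    ; in the .mdp file:
    nstcalcenergy = 10

or, if you need to keep nstcalcenergy=1:

    gmx mdrun ... -bonded cpu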

Side-note: you are using an overly fine PME grid that you did not scale
along with the (overly accurate and rather long) cut-offs (see
http://manual.gromacs.org/documentation/current/user-guide/mdp-options.html#mdp-fourierspacing
).
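As a sketch of what I mean (illustrative values only, not your actual
settings): if the cut-offs are scaled up from the common 1.0 nm defaults by
some factor, the PME grid spacing should be scaled up by the same factor to
keep the accuracy balanced, e.g.

    rcoulomb       = 1.4
    fourierspacing = 0.168   ; 0.12 nm default scaled by 1.4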

Cheers,
--
Szilárd


On Fri, Feb 28, 2020 at 11:10 AM Andreas Baer  wrote:

> Hi,
>
> sorry for it!
>
> https://faubox.rrze.uni-erlangen.de/getlink/fiUpELsXokQr3a7vyeDSKdY3/benchmarks_2019-2020_all
>
> Cheers,
> Andreas
>
> On 27.02.20 17:59, Szilárd Páll wrote:
>
> On Thu, Feb 27, 2020 at 1:08 PM Andreas Baer  wrote:
>
>> Hi,
>>
>> On 27.02.20 12:34, Szilárd Páll wrote:
>> > Hi
>> >
>> > On Thu, Feb 27, 2020 at 11:31 AM Andreas Baer 
>> wrote:
>> >
>> >> Hi,
>> >>
>> >> with the link below, additional log files for runs with 1 GPU should be
>> >> accessible now.
>> >>
>> > I meant to ask you to run single-rank GPU runs, i.e. gmx mdrun -ntmpi 1.
>> >
>> > It would also help if you could share some input files in case if
>> further
>> > testing is needed.
>> Ok, there is now also an additional benchmark with `-ntmpi 1 -ntomp 4
>> -bonded gpu -update gpu` as parameters. However, it is run on the same
>> machine with smt disabled.
>> With the following link, I provide all the tests on this machine, I did
>> by now, along with a summary of the performance for the several input
>> parameters (both in `logfiles`), as well as input files (`C60xh.7z`) and
>> the scripts to run these.
>>
>
> Links seems to be missing.
> --
> Szilárd
>
>
>> I hope, this helps. If there is anything else, I can do to help, please
>> let me know!
>> >
>> >
>> >> Thank you for the comment with the rlist, I did not know, that this
>> will
>> >> affect the performance negatively.
>> >
>> > It does in multiple ways. First, you are using a rather long list buffer
>> > which will make the nonbonded pair-interaction calculation more
>> > computational expensive than it could be if you just used a tolerance
>> and
>> > let the buffer be calculated. Secondly, as setting a manual rlist
>> disables
>> > the automated verlet buffer calculation, it prevents mdrun from using a
>> > dual pairl-list setup (see
>> >
>> http://manual.gromacs.org/documentation/2018.1/release-notes/2018/major/features.html#dual-pair-list-buffer-with-dynamic-pruning
>> )
>> > which has additional performance benefits.
>> Ok, thank you for the explanation!
>> >
>> > Cheers,
>> > --
>> > Szilárd
>> Cheers,
>> Andreas
>> >
>> >
>> >
>> >> I know, about the nstcalcenergy, but
>> >> I need it for several of my simulations.
>> > Cheers,
>> >> Andreas
>> >>
>> >> On 26.02.20 16:50, Szilárd Páll wrote:
>> >>> Hi,
>> >>>
>> >>> Can you please check the performance when running on a single GPU
>> 2019 vs
>> >>> 2020 with your inputs?
>> >>>
>> >>> Also note that you are using some peculiar settings that will have an
>> >>> adverse effect on performance (like manually set rlist disallowing the
>> >> dual
>> >>> pair-list setup, and nstcalcenergy=1).
>> >>>
>> >>> Cheers,
>> >>>
>> >>> --
>> >>> Szilárd
>> >>>
>> >>>
>> >>> On Wed, Feb 26, 2020 at 4:11 PM Andreas Baer 
>> >> wrote:
>> >>>> Hello,
>> >>>>
>> >>>> here is a link to the logfiles.
>> >>>>
>> >>>>
>> >>
>> https://faubox.rrze.uni-erlangen.de/getlink/fiX8wP1LwSBkHRoykw6ksjqY/benchmarks_2019-2020
>> >>>> If necessary, I can als

Re: [gmx-users] Performance issues with Gromacs 2020 on GPUs - slower than 2019.5

2020-02-27 Thread Szilárd Páll
On Thu, Feb 27, 2020 at 1:08 PM Andreas Baer  wrote:

> Hi,
>
> On 27.02.20 12:34, Szilárd Páll wrote:
> > Hi
> >
> > On Thu, Feb 27, 2020 at 11:31 AM Andreas Baer 
> wrote:
> >
> >> Hi,
> >>
> >> with the link below, additional log files for runs with 1 GPU should be
> >> accessible now.
> >>
> > I meant to ask you to run single-rank GPU runs, i.e. gmx mdrun -ntmpi 1.
> >
> > It would also help if you could share some input files in case if further
> > testing is needed.
> Ok, there is now also an additional benchmark with `-ntmpi 1 -ntomp 4
> -bonded gpu -update gpu` as parameters. However, it is run on the same
> machine with smt disabled.
> With the following link, I provide all the tests on this machine, I did
> by now, along with a summary of the performance for the several input
> parameters (both in `logfiles`), as well as input files (`C60xh.7z`) and
> the scripts to run these.
>

Links seem to be missing.
--
Szilárd


> I hope, this helps. If there is anything else, I can do to help, please
> let me know!
> >
> >
> >> Thank you for the comment with the rlist, I did not know, that this will
> >> affect the performance negatively.
> >
> > It does in multiple ways. First, you are using a rather long list buffer
> > which will make the nonbonded pair-interaction calculation more
> > computational expensive than it could be if you just used a tolerance and
> > let the buffer be calculated. Secondly, as setting a manual rlist
> disables
> > the automated verlet buffer calculation, it prevents mdrun from using a
> > dual pairl-list setup (see
> >
> http://manual.gromacs.org/documentation/2018.1/release-notes/2018/major/features.html#dual-pair-list-buffer-with-dynamic-pruning
> )
> > which has additional performance benefits.
> Ok, thank you for the explanation!
> >
> > Cheers,
> > --
> > Szilárd
> Cheers,
> Andreas
> >
> >
> >
> >> I know, about the nstcalcenergy, but
> >> I need it for several of my simulations.
> > Cheers,
> >> Andreas
> >>
> >> On 26.02.20 16:50, Szilárd Páll wrote:
> >>> Hi,
> >>>
> >>> Can you please check the performance when running on a single GPU 2019
> vs
> >>> 2020 with your inputs?
> >>>
> >>> Also note that you are using some peculiar settings that will have an
> >>> adverse effect on performance (like manually set rlist disallowing the
> >> dual
> >>> pair-list setup, and nstcalcenergy=1).
> >>>
> >>> Cheers,
> >>>
> >>> --
> >>> Szilárd
> >>>
> >>>
> >>> On Wed, Feb 26, 2020 at 4:11 PM Andreas Baer 
> >> wrote:
> >>>> Hello,
> >>>>
> >>>> here is a link to the logfiles.
> >>>>
> >>>>
> >>
> https://faubox.rrze.uni-erlangen.de/getlink/fiX8wP1LwSBkHRoykw6ksjqY/benchmarks_2019-2020
> >>>> If necessary, I can also provide some more log or tpr/gro/... files.
> >>>>
> >>>> Cheers,
> >>>> Andreas
> >>>>
> >>>>
> >>>> On 26.02.20 16:09, Paul bauer wrote:
> >>>>> Hello,
> >>>>>
> >>>>> you can't add attachments to the list, please upload the files
> >>>>> somewhere to share them.
> >>>>> This might be quite important to us, because the performance
> >>>>> regression is not expected by us.
> >>>>>
> >>>>> Cheers
> >>>>>
> >>>>> Paul
> >>>>>
> >>>>> On 26/02/2020 15:54, Andreas Baer wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> from a set of benchmark tests with large systems using Gromacs
> >>>>>> versions 2019.5 and 2020, I obtained some unexpected results:
> >>>>>> With the same set of parameters and the 2020 version, I obtain a
> >>>>>> performance that is about 2/3 of the 2019.5 version. Interestingly,
> >>>>>> according to nvidia-smi, the GPU usage is about 20% higher for the
> >>>>>> 2020 version.
> >>>>>> Also from the log files it seems, that the 2020 version does the
> >>>>>> computations more efficiently, but spends so much more time waiting,
> >>>>>> that the overall performance drops.
> >>>>>>

Re: [gmx-users] Performance issues with Gromacs 2020 on GPUs - slower than 2019.5

2020-02-27 Thread Szilárd Páll
Hi

On Thu, Feb 27, 2020 at 11:31 AM Andreas Baer  wrote:

> Hi,
>
> with the link below, additional log files for runs with 1 GPU should be
> accessible now.
>

I meant to ask you to run single-rank GPU runs, i.e. gmx mdrun -ntmpi 1.

It would also help if you could share some input files in case if further
testing is needed.


> Thank you for the comment with the rlist, I did not know, that this will
> affect the performance negatively.


It does in multiple ways. First, you are using a rather long list buffer
which will make the nonbonded pair-interaction calculation more
computationally expensive than it could be if you just used a tolerance and
let the buffer be calculated. Secondly, as setting a manual rlist disables
the automated Verlet buffer calculation, it prevents mdrun from using a
dual pair-list setup (see
http://manual.gromacs.org/documentation/2018.1/release-notes/2018/major/features.html#dual-pair-list-buffer-with-dynamic-pruning)
which has additional performance benefits.
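A sketch of the tolerance-based setup (the value shown is just the GROMACS
default, listed for illustration):

    ; remove any manual "rlist = ..." line and let grompp size the buffer:
    verlet-buffer-tolerance = 0.005   ; kJ/mol/ps per atom (the default)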

Cheers,
--
Szilárd



> I know, about the nstcalcenergy, but
> I need it for several of my simulations.

Cheers,
> Andreas
>
> On 26.02.20 16:50, Szilárd Páll wrote:
> > Hi,
> >
> > Can you please check the performance when running on a single GPU 2019 vs
> > 2020 with your inputs?
> >
> > Also note that you are using some peculiar settings that will have an
> > adverse effect on performance (like manually set rlist disallowing the
> dual
> > pair-list setup, and nstcalcenergy=1).
> >
> > Cheers,
> >
> > --
> > Szilárd
> >
> >
> > On Wed, Feb 26, 2020 at 4:11 PM Andreas Baer 
> wrote:
> >
> >> Hello,
> >>
> >> here is a link to the logfiles.
> >>
> >>
> https://faubox.rrze.uni-erlangen.de/getlink/fiX8wP1LwSBkHRoykw6ksjqY/benchmarks_2019-2020
> >>
> >> If necessary, I can also provide some more log or tpr/gro/... files.
> >>
> >> Cheers,
> >> Andreas
> >>
> >>
> >> On 26.02.20 16:09, Paul bauer wrote:
> >>> Hello,
> >>>
> >>> you can't add attachments to the list, please upload the files
> >>> somewhere to share them.
> >>> This might be quite important to us, because the performance
> >>> regression is not expected by us.
> >>>
> >>> Cheers
> >>>
> >>> Paul
> >>>
> >>> On 26/02/2020 15:54, Andreas Baer wrote:
> >>>> Hello,
> >>>>
> >>>> from a set of benchmark tests with large systems using Gromacs
> >>>> versions 2019.5 and 2020, I obtained some unexpected results:
> >>>> With the same set of parameters and the 2020 version, I obtain a
> >>>> performance that is about 2/3 of the 2019.5 version. Interestingly,
> >>>> according to nvidia-smi, the GPU usage is about 20% higher for the
> >>>> 2020 version.
> >>>> Also from the log files it seems, that the 2020 version does the
> >>>> computations more efficiently, but spends so much more time waiting,
> >>>> that the overall performance drops.
> >>>>
> >>>> Some background info on the benchmarks:
> >>>> - System contains about 2.1 million atoms.
> >>>> - Hardware: 2x Intel Xeon Gold 6134 („Skylake“) @3.2 GHz = 16 cores +
> >>>> SMT; 4x NVIDIA Tesla V100
> >>>>(similar results with less significant performance drop (~15%) on a
> >>>> different machine: 2 or 4 nodes with each [2x Intel Xeon 2660v2 („Ivy
> >>>> Bridge“) @ 2.2GHz = 20 cores + SMT; 2x NVIDIA Kepler K20])
> >>>> - Several options for -ntmpi, -ntomp, -bonded, -pme are used to find
> >>>> the optimal set. However the performance drop seems to be persistent
> >>>> for all such options.
> >>>>
> >>>> Two representative log files are attached.
> >>>> Does anyone have an idea, where this drop comes from, and how to
> >>>> choose the parameters for the 2020 version to circumvent this?
> >>>>
> >>>> Regards,
> >>>> Andreas
> >>>>

Re: [gmx-users] Performance issues with Gromacs 2020 on GPUs - slower than 2019.5

2020-02-26 Thread Szilárd Páll
Hi,

Can you please check the performance when running on a single GPU with 2019
vs 2020, using your inputs?

Also note that you are using some peculiar settings that will have an
adverse effect on performance (like manually set rlist disallowing the dual
pair-list setup, and nstcalcenergy=1).

Cheers,

--
Szilárd


On Wed, Feb 26, 2020 at 4:11 PM Andreas Baer  wrote:

> Hello,
>
> here is a link to the logfiles.
>
> https://faubox.rrze.uni-erlangen.de/getlink/fiX8wP1LwSBkHRoykw6ksjqY/benchmarks_2019-2020
>
> If necessary, I can also provide some more log or tpr/gro/... files.
>
> Cheers,
> Andreas
>
>
> On 26.02.20 16:09, Paul bauer wrote:
> > Hello,
> >
> > you can't add attachments to the list, please upload the files
> > somewhere to share them.
> > This might be quite important to us, because the performance
> > regression is not expected by us.
> >
> > Cheers
> >
> > Paul
> >
> > On 26/02/2020 15:54, Andreas Baer wrote:
> >> Hello,
> >>
> >> from a set of benchmark tests with large systems using Gromacs
> >> versions 2019.5 and 2020, I obtained some unexpected results:
> >> With the same set of parameters and the 2020 version, I obtain a
> >> performance that is about 2/3 of the 2019.5 version. Interestingly,
> >> according to nvidia-smi, the GPU usage is about 20% higher for the
> >> 2020 version.
> >> Also from the log files it seems, that the 2020 version does the
> >> computations more efficiently, but spends so much more time waiting,
> >> that the overall performance drops.
> >>
> >> Some background info on the benchmarks:
> >> - System contains about 2.1 million atoms.
> >> - Hardware: 2x Intel Xeon Gold 6134 („Skylake“) @3.2 GHz = 16 cores +
> >> SMT; 4x NVIDIA Tesla V100
> >>   (similar results with less significant performance drop (~15%) on a
> >> different machine: 2 or 4 nodes with each [2x Intel Xeon 2660v2 („Ivy
> >> Bridge“) @ 2.2GHz = 20 cores + SMT; 2x NVIDIA Kepler K20])
> >> - Several options for -ntmpi, -ntomp, -bonded, -pme are used to find
> >> the optimal set. However the performance drop seems to be persistent
> >> for all such options.
> >>
> >> Two representative log files are attached.
> >> Does anyone have an idea, where this drop comes from, and how to
> >> choose the parameters for the 2020 version to circumvent this?
> >>
> >> Regards,
> >> Andreas
> >>
> >
>

Re: [gmx-users] Fw: cudaFuncGetAttributes failed: out of memory

2020-02-26 Thread Szilárd Páll
Hi,

Indeed, there is an issue with the GPU detection code's consistency checks
that trip and abort the run if any of the detected GPUs behaves in
unexpected ways (e.g. runs out of memory during checks).

This should be fixed in an upcoming release, but until then as you have
observed, you can always restrict the set of GPUs exposed to GROMACS using
the CUDA_VISIBLE_DEVICES environment variable.
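For example, with the nvidia-smi output you posted (the first and third
Tesla nearly full), something like this should work as a stopgap (a sketch
reusing your command line):

    CUDA_VISIBLE_DEVICES=1,3 gmx mdrun -deffnm pull -ntmpi 1 -nb gpu -pme gpu

Note that the devices left visible get renumbered, so they appear to
GROMACS as IDs 0 and 1.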

Cheers,


--
Szilárd


On Sun, Feb 23, 2020 at 7:51 AM bonjour899  wrote:

> I think I've temporarily solved this problem. Only when I use
> CUDA_VISIBLE_DEVICES to block the memory-almost-fully-occupied GPUs, I can
> run GROMACS smoothly (using -gpu_id only is useless). I think there may be
> some bug in GROMACS's GPU usage model in a multi-GPU environment (It seems
> like as long as one of the GPUs is fully occupied, GROMACS cannot submit to
> any GPUs and return an error with "cudaFuncGetAttributes failed: out of
> memory").
>
>
>
> Best regards,
> W
>
>
>
>
>  Forwarding messages 
> From: "bonjour899" 
> Date: 2020-02-23 11:32:53
> To:  gromacs.org_gmx-users@maillist.sys.kth.se
> Subject: [gmx-users] cudaFuncGetAttributes failed: out of memory
> I also tried to restricting to different GPU using -gpu_id, but still with
> the same error. I've also posting my question on
> https://devtalk.nvidia.com/default/topic/1072038/cuda-programming-and-performance/cudafuncgetattributes-failed-out-of-memory/
> Following is the output of nvidia-smi:
>
>
> +-+
>
> | NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2 |
>
>
> |---+--+--+
>
> | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
>
> | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
>
>
> |===+==+==|
>
> | 0 Tesla P100-PCIE... On | :04:00.0 Off | 0 |
>
> | N/A 35C P0 34W / 250W | 16008MiB / 16280MiB | 0% Default |
>
>
> +---+--+--+
>
> | 1 Tesla P100-PCIE... On | :06:00.0 Off | 0 |
>
> | N/A 35C P0 28W / 250W | 10MiB / 16280MiB | 0% Default |
>
>
> +---+--+--+
>
> | 2 Tesla P100-PCIE... On | :07:00.0 Off | 0 |
>
> | N/A 35C P0 33W / 250W | 16063MiB / 16280MiB | 0% Default |
>
>
> +---+--+--+
>
> | 3 Tesla P100-PCIE... On | :08:00.0 Off | 0 |
>
> | N/A 36C P0 29W / 250W | 10MiB / 16280MiB | 0% Default |
>
>
> +---+--+--+
>
> | 4 Quadro P4000 On | :0B:00.0 Off | N/A |
>
> | 46% 27C P8 8W / 105W | 12MiB / 8119MiB | 0% Default |
>
>
> +---+--+--+
>
>
>
>
> +-+
>
> | Processes: GPU Memory |
>
> | GPU PID Type Process name Usage |
>
>
> |=|
>
> | 0 20497 C /usr/bin/python3 5861MiB |
>
> | 0 24503 C /usr/bin/python3 10137MiB |
>
> | 2 23162 C /home/appuser/Miniconda3/bin/python 16049MiB |
>
>
> +-+
>
>
>
>
>
>
>
>  Forwarding messages 
> From: "bonjour899" 
> Date: 2020-02-20 10:30:36
> To: "gromacs.org_gmx-users@maillist.sys.kth.se" <
> gromacs.org_gmx-users@maillist.sys.kth.se>
> Subject: cudaFuncGetAttributes failed: out of memory
>
> Hello,
>
>
> I have encountered a weird problem. I've been using GROMACS with GPU on a
> server and always performance good. However when I just reran a job today
> and suddenly got this error:
>
>
>
> Command line:
>
> gmx mdrun -deffnm pull -ntmpi 1 -nb gpu -pme gpu -gpu_id 3
>
> Back Off! I just backed up pull.log to ./#pull.log.1#
>
> ---
>
> Program: gmx mdrun, version 2019.4
>
> Source file: src/gromacs/gpu_utils/gpu_utils.cu (line 100)
>
>
>
> Fatal error:
>
> cudaFuncGetAttributes failed: out of memory
>
>
>
> For more information and tips for troubleshooting, please check the GROMACS
>
> website at http://www.gromacs.org/Documentation/Errors
>
> ---
>
>
>
>
> It seems the GPU is 0 occupied and I can run other apps with GPU, but I
> cannot run GROMACS mdrun anymore, even if doing energy minimization.
>
>
>
>
>
>
>
>
>
>
>
>

Re: [gmx-users] GPU considerations for GROMACS

2020-02-24 Thread Szilárd Páll
Hi,

Whether to invest in one of the fastest GPUs or two medium-to-high-end GPUs
depends on your workload: system size, type of run, single or multiple
simulations, etc. If you have multiple simulations that you can run
independently, or coupled only weakly in ensemble runs (e.g. using
-multidir), multiple mid-tier GPUs will be the better investment. On the
other hand, if single-simulation performance is what you want to maximize
and you have a relatively small simulation system (e.g. 50k atoms), you will
be better off with a single fast GPU.

Regarding your CPU choice, I suggest you consider alternatives: e.g. a
Ryzen 3800X will cost a lot less and will be faster. Xeon generally does
not have much benefit for the use-case in question.

Cheers,
--
Szilárd


On Tue, Feb 18, 2020 at 3:21 PM hairul.ik...@gmail.com <
hairul.ik...@gmail.com> wrote:

> Hello,
>
> Previously, I have helped building a workstation for my fellow
> researcher who heavily uses GROMACS for his MD simulations, with the
> following base specs:
>
> -CPU: 8 cores (Xeon E2278G)
> -RAM: 32GB
> -GPU: 1x RTX2080Ti
>
>
> With this setup, he managed to shrink each simulation's runtime down to
> approximately 12 hours, compared to the previous system (purely CPU, no
> GPU support), which took days to complete.
>
>
> 1) Based on the current progress, we plan to build another system
> (which will also run GROMACS most of the time) using the existing
> workstation as a reference. But currently we are unsure which setup
> (Option 1 vs Option 2) will GENERALLY give the shortest runtime
> when running the same set of GROMACS simulations:
>
>
> Option 1:
> Retain same CPU, RAM and GPU specs (1x RTX2080Ti)
>
>
> Option 2:
> Retain same CPU and RAM specs, but GPU wise, use 2x RTX 2070S instead
> of 1x RTX2080Ti
>
>
> 2) Besides building another system, we are considering upgrading the
> existing system, too. For example, assuming the system has the
> expansion capability (enough PCIe x16 slots, power supply), will
> adding another card (making it 2x RTX2080Ti instead of 1x RTX2080Ti)
> to the existing setup significantly cut down the current runtime?  If
> yes, how much of a time reduction can we generally expect from this
> upgrade?
>
>
> Appreciate if someone can share their thoughts and experience.
> Thank you!
>
>
> -Hairul
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Regarding to pme on gpu

2020-02-21 Thread Szilárd Páll
Hi.
On Tue, Feb 18, 2020 at 5:11 PM Jimmy Chen  wrote:
>
> Hi,
>
> When -pme gpu is set in mdrun, only one rank can be used for PME (-npme 1).
> What is the reason that only one rank can be used for PME when offloading to
> the GPU? Is it a limitation or something else?

This is a limitation of the implementation, currently PME
decomposition is not supported with PME offload. Hence, PME work needs
to be assigned to a single GPU, be it the same as the one that does
the PP computation or a separate one (assigned to the PME rank).
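
For illustration, a setup along these lines runs PME on a single separate GPU
(just a sketch, assuming two GPUs in the node; adjust the rank and thread
counts to your hardware):

gmx mdrun -ntmpi 4 -npme 1 -nb gpu -pme gpu -gputasks 0001

Here the three PP ranks use GPU 0 and the single PME rank uses GPU 1.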

> I am interested in whether any performance improvement is still doable and
> in any improvement plans for the GPU kernels of PME and PP. There is not
> much difference in PME and PP between 2019.3 and 2020.

Can you please clarify what you mean?

The 2020 release has made some improvements to both PP (bonded
kernels) and PME performance, but these can indeed be minor.

Cheers,
--
Szilárd

>
> Thanks,
> Jimmy
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Fwd: Compiling with OpenCL for Macbook AMD Radeon Pro 560 GPU

2020-02-18 Thread Szilárd Páll
Hi Oliver,

Does this affect an installation of GROMACS? In previous reports we have
observed that the issue is only present when running "make check" in the
build tree, but not in the case of an installed version.

Cheers,
--
Szilárd


On Mon, Feb 17, 2020 at 7:58 PM Oliver Dutton  wrote:

> Hello,
>
> I am trying to do the exact same as Michael in
> https://mailman-1.sys.kth.se/pipermail/gromacs.org_gmx-users/2019-February/124394.html
>  but hit the exact same error of it not finding a simple header file. I’ve
> tried building 2019.5 and 2020 Gromacs on a MacBook Pro with AMD Radeon Pro
> 560 GPU.
>
> I’m using the Apple built-in compiler, with the same flags and cmake options
> as Michael. Was this ever made to work?
>
> Kind regards,
> Oliver
>
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] REMD stall out

2020-02-17 Thread Szilárd Páll
Hi Dan,

What you describe is not expected behavior and it is something we should
look into.

What GROMACS version were you using? One thing that may help diagnose the
issue: try disabling replica exchange and running -multidir that way. Does
the simulation proceed?

Can you please open an issue on redmine.gromacs.org and upload the input
files necessary to reproduce it, along with the logs of the runs that showed
the issue?

Cheers,
--
Szilárd


On Mon, Feb 17, 2020 at 3:56 PM Daniel Burns  wrote:

> HI Szilard,
>
> I've deleted all my output but all the writing to the log and console stops
> around the step noting the domain decomposition (or other preliminary
> task).  It is the same with or without Plumed - the TREMD with Gromacs only
> was the first thing to present this issue.
>
> I've discovered that if each replica is assigned its own node, the
> simulations proceed.  If I try to run several replicas on each node
> (divided evenly), the simulations stall out before any trajectories get
> written.
>
> I have tried many different -np and -ntomp options as well as several slurm
> job submission scripts with node/ thread configurations but multiple
> simulations per node will not work.  I need to be able to run several
> replicas on the same node to get enough data since it's hard to get more
> than 8 nodes (and as a result, replicas).
>
> Thanks for your reply.
>
> -Dan
>
> On Tue, Feb 11, 2020 at 12:56 PM Daniel Burns  wrote:
>
> > Hi,
> >
> > I continue to have trouble getting an REMD job to run.  It never makes it
> > to the point that it generates trajectory files but it never gives any
> > error either.
> >
> > I have switched from a large TREMD with 72 replicas to the Plumed
> > Hamiltonian method with only 6 replicas.  Everything is now on one node
> and
> > each replica has 6 cores.  I've turned off the dynamic load balancing on
> > this attempt per the recommendation from the Plumed site.
> >
> > Any ideas on how to troubleshoot?
> >
> > Thank you,
> >
> > Dan
> >
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] REMD stall out

2020-02-17 Thread Szilárd Páll
Hi,

If I understand correctly your jobs stall, what is in the log output? What
about the console? Does this happen without PLUMED?

--
Szilárd


On Tue, Feb 11, 2020 at 7:56 PM Daniel Burns  wrote:

> Hi,
>
> I continue to have trouble getting an REMD job to run.  It never makes it
> to the point that it generates trajectory files but it never gives any
> error either.
>
> I have switched from a large TREMD with 72 replicas to the Plumed
> Hamiltonian method with only 6 replicas.  Everything is now on one node and
> each replica has 6 cores.  I've turned off the dynamic load balancing on
> this attempt per the recommendation from the Plumed site.
>
> Any ideas on how to troubleshoot?
>
> Thank you,
>
> Dan
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] gromacs-2020 build gcc/nvcc error

2020-01-31 Thread Szilárd Páll
uot;CMAKE" but
> fails in "make". Based on your suggestions I ran the commands at the top of
> the email to which then worked. Would this have worked if I had just
> installed gcc-8 g++-8 from the beginning and ran CMAKE with no version
> specification?
>
>
> On Thu, Jan 30, 2020 at 5:50 AM Szilárd Páll 
> wrote:
>
> > Dear Ryan,
> >
> > On Wed, Jan 29, 2020 at 10:35 PM Ryan Woltz  wrote:
> >
> > > Dear Szilárd,
> > >
> > >  Thank you for your quick response. You are correct, after
> > > issuing sudo apt-get install gcc-9 g++-9 CMake was run with:
> > >
> >
> > gcc 9 is not supported with CUDA, as far as I know version 8 is the
> latest
> > supported gcc in CUDA 10.2 (officially "native support" whatever they
> mean
> > by that is for 7.3 on Ubuntu 18.04.3, see
> > https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html)
> >
> > CMAKE_PREFIX_PATH=/usr/:/usr/local/cuda/ cmake ../
> -DGMX_BUILD_OWN_FFTW=ON
> > > -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_GPU=ON
> > > -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda/ -DCMAKE_BUILD_TYPE=Debug
> > >
> >
> > Don't use a Debug build unless you want to debug the GROMACS tools (it's
> > slow).
> >
> > Make sure that your cmake configuration does actually use the gcc version
> > you intend to use. The default invocation as above will pick up the default
> > compiler toolchain (e.g. /usr/bin/gcc in your case; you can verify that by
> > opening the CMakeCache.txt file or using ccmake) -- and I think the lack of
> > proper AVX512 support in your default gcc 5 (which you are still using) is
> > the source of the issues you report below.
> >
> > You can explicitly set the compiler by passing CMAKE_CXX_COMPILER at the
> > configure step; for details see
> >
> >
> http://manual.gromacs.org/current/install-guide/index.html?highlight=cxx%20compiler#typical-installation
> >
> > Cheers,
> > --
> > Szilárd
> >
> >
> > > However now I'm getting an error in make
> > >
> > > make VERBOSE=1
> > >
> > > error:
> > >
> > > [ 25%] Building CXX object
> > >
> > >
> >
> src/gromacs/CMakeFiles/libgromacs.dir/nbnxm/kernels_simd_2xmm/kernel_ElecEwTwinCut_VdwLJEwCombGeom_F.cpp.o
> > > In file included from
> > >
> > >
> >
> /home/rlwoltz/protein_modeling/gromacs-2020/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512.h:46:0,
> > >  from
> > >
> /home/rlwoltz/protein_modeling/gromacs-2020/src/gromacs/simd/simd.h:146,
> > >  from
> > >
> > >
> >
> /home/rlwoltz/protein_modeling/gromacs-2020/src/gromacs/nbnxm/nbnxm_simd.h:40,
> > >  from
> > >
> > >
> >
> /home/rlwoltz/protein_modeling/gromacs-2020/src/gromacs/nbnxm/kernels_simd_2xmm/kernel_ElecEwTwinCut_VdwLJEwCombGeom_F.cpp:49:
> > >
> > >
> >
> /home/rlwoltz/protein_modeling/gromacs-2020/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h:
> > > In function ‘void gmx::gatherLoadTransposeHsimd(const float*, const
> > float*,
> > > const int32_t*, gmx::SimdFloat*, gmx::SimdFloat*) [with int align = 2;
> > > int32_t = int]’:
> > >
> > >
> >
> /home/rlwoltz/protein_modeling/gromacs-2020/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h:422:28:
> > > error: the last argument must be scale 1, 2, 4, 8
> > >  tmp1 = _mm512_castpd_ps(
> > > ^
> > >
> > >
> >
> /home/rlwoltz/protein_modeling/gromacs-2020/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h:424:28:
> > > error: the last argument must be scale 1, 2, 4, 8
> > >  tmp2 = _mm512_castpd_ps(
> > > ^
> > > src/gromacs/CMakeFiles/libgromacs.dir/build.make:13881: recipe for
> target
> > >
> > >
> >
> 'src/gromacs/CMakeFiles/libgromacs.dir/nbnxm/kernels_simd_2xmm/kernel_ElecEwTwinCut_VdwLJEwCombGeom_F.cpp.o'
> > > failed
> > > make[2]: ***
> > >
> > >
> >
> [src/gromacs/CMakeFiles/libgromacs.dir/nbnxm/kernels_simd_2xmm/kernel_ElecEwTwinCut_VdwLJEwCombGeom_F.cpp.o]
> > > Error 1
> > > CMakeFiles/Makefile2:2910: recipe for target
> > > 'src/gromacs/CMakeFiles/libgromacs.dir/all' failed
> > > make[1]: *** [src/gromacs/CMakeFiles/libgromacs.dir/all] Error 2
> > > Makefile:162: recipe for target 'all' failed

Re: [gmx-users] gromacs-2020 build gcc/nvcc error

2020-01-30 Thread Szilárd Páll
> -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda/ -DCMAKE_BUILD_TYPE=Debug
> -D_FORCE_INLINES=OFF
>
> Received different error described in previous email and solved with your
> suggested solution. The key might be to specifically install latest version
> number i.e.
>
> sudo apt-get install gcc-X g++-X (with X being the largest number
> available).
>
>
>
>
> On Wed, Jan 29, 2020 at 2:05 AM Szilárd Páll 
> wrote:
>
> > Hi Ryan,
> >
> > The issue you linked has been worked around in the build system, so my
> > guess is that the issue you are seeing is not related.
> >
> > I would recommend that you update your software stack to the latest
> version
> > (both CUDA 9.1 and gcc 5 are a few years old). On Ubuntu 18.04 you should
> > be able to get gcc 8 through the package manager. Together with
> > upgrading to the latest CUDA might well solve your issues.
> >
> > Let us know if that worked!
> >
> > Cheers,
> > --
> > Szilárd
> >
> >
> > On Wed, Jan 29, 2020 at 12:14 AM Ryan Woltz  wrote:
> >
> > > Hello Gromacs experts,
> > >
> > >   First things first, I apologize for any double post but I
> just
> > > joined the community so I'm very new and only found 1-2 posts related
> to
> > my
> > > problem but the solutions did not work. I have been doing MD for about
> > > 6-months using NAMD but want to also try out Gromacs. That being said I
> > am
> > > slightly familiar with CPU modeling programs like Rosetta, but I am
> > totally
> > > lost when it comes to fixing errors using GPU accelerated code for
> CUDA.
> > I
> > > did find that at one point my error was fixed for an earlier version of
> > > Gromacs but Gromacs-2020 may have resurfaced the same error again, here
> > is
> > > what I think my error is:
> > >
> > > https://redmine.gromacs.org/issues/1982
> > >
> > > I am running Ubuntu 18.04.03 LTS, and gromacs-2020 I did initially
> > have
> > > the gcc/nvcc incompatible but I think installing and using gcc-5/g++-5
> > > version command in cmake has fixed that issue. I have a NVIDIA card
> with
> > > CUDA-9.1 driver when I type nvcc --version.
> > >
> > > my cmake command is as follows:
> > >
> > > CMAKE_PREFIX_PATH=/usr/:/usr/local/cuda/ cmake ../
> > > -DGMX_GPLUSPLUS_PATH=/usr/bin/g++-5 -DCUDA_HOST_COMPILER=gcc-5
> > > -DCMAKE_CXX_COMPILER=g++-5 -DCMAKE_C_COMPILER=/usr/bin/gcc-5
> > > -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_GPU=ON
> > > -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda/ -DCMAKE_BUILD_TYPE=Debug (I
> did
> > > try adding -D_FORCE_INLINES= based on the link above in my running
> > command
> > > but it did not work). I did look at the error log but it is way over my
> > > head. I have in addition deleted the CMakeCache.txt file or the
> unpacked
> > > Gromacs and re-unzipped it to restart the cmake process to make sure it
> > was
> > > starting "clean". Is there any additional information I could provide?
> > Does
> > > anyone have a suggestion? Again I'm sorry if this is a duplicate,
> > > everything I found on other sites was way over my head and I generally
> > > understand what is going on but the forums I read on possible solutions
> > > seem way over my head and I'm afraid I will break the driver if I
> attempt
> > > them (which has happened to me already and the computer required a full
> > > reinstall).
> > >
> > > here is last lines from the build:
> > >
> > > -- Found HWLOC: /usr/lib/x86_64-linux-gnu/libhwloc.so (found suitable
> > > version "1.11.6", minimum required is "1.5")
> > > -- Looking for C++ include pthread.h
> > > -- Looking for C++ include pthread.h - found
> > > -- Atomic operations found
> > > -- Performing Test PTHREAD_SETAFFINITY
> > > -- Performing Test PTHREAD_SETAFFINITY - Success
> > > -- Adding work-around for issue compiling CUDA code with glibc 2.23
> > > string.h
> > > -- Check for working NVCC/C++ compiler combination with nvcc
> > > '/usr/local/cuda/bin/nvcc'
> > > -- Check for working NVCC/C compiler combination - broken
> > > -- /usr/local/cuda/bin/nvcc standard output: ''
> > > -- /usr/local/cuda/bin/nvcc standard error:
> > >  '/home/rlwoltz/protein_modeling/gromacs-2020/build/gcc-5: No such file
> > or
> > > directory
> > > '
> > > CMake Error at cmake/gmxManageNvccConfig.

Re: [gmx-users] gromacs-2020 build gcc/nvcc error

2020-01-29 Thread Szilárd Páll
Hi Ryan,

The issue you linked has been worked around in the build system, so my
guess is that the issue you are seeing is not related.

I would recommend that you update your software stack to the latest version
(both CUDA 9.1 and gcc 5 are a few years old). On Ubuntu 18.04 you should
be able to get gcc 8 through the package manager. Together with
upgrading to the latest CUDA might well solve your issues.

Let us know if that worked!

Cheers,
--
Szilárd


On Wed, Jan 29, 2020 at 12:14 AM Ryan Woltz  wrote:

> Hello Gromacs experts,
>
>   First things first, I apologize for any double post but I just
> joined the community so I'm very new and only found 1-2 posts related to my
> problem but the solutions did not work. I have been doing MD for about
> 6-months using NAMD but want to also try out Gromacs. That being said I am
> slightly familiar with CPU modeling programs like Rosetta, but I am totally
> lost when it comes to fixing errors using GPU accelerated code for CUDA. I
> did find that at one point my error was fixed for an earlier version of
> Gromacs but Gromacs-2020 may have resurfaced the same error again, here is
> what I think my error is:
>
> https://redmine.gromacs.org/issues/1982
>
> I am running Ubuntu 18.04.03 LTS and gromacs-2020. I did initially have the
> gcc/nvcc incompatibility, but I think installing gcc-5/g++-5 and specifying
> that version in cmake has fixed that issue. I have an NVIDIA card, and nvcc
> --version reports CUDA 9.1.
>
> my cmake command is as follows:
>
> CMAKE_PREFIX_PATH=/usr/:/usr/local/cuda/ cmake ../
> -DGMX_GPLUSPLUS_PATH=/usr/bin/g++-5 -DCUDA_HOST_COMPILER=gcc-5
> -DCMAKE_CXX_COMPILER=g++-5 -DCMAKE_C_COMPILER=/usr/bin/gcc-5
> -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_GPU=ON
> -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda/ -DCMAKE_BUILD_TYPE=Debug (I did
> try adding -D_FORCE_INLINES= based on the link above in my running command
> but it did not work). I did look at the error log but it is way over my
> head. I have in addition deleted the CMakeCache.txt file or the unpacked
> Gromacs and re-unzipped it to restart the cmake process to make sure it was
> starting "clean". Is there any additional information I could provide? Does
> anyone have a suggestion? Again I'm sorry if this is a duplicate,
> everything I found on other sites was way over my head and I generally
> understand what is going on but the forums I read on possible solutions
> seem way over my head and I'm afraid I will break the driver if I attempt
> them (which has happened to me already and the computer required a full
> reinstall).
>
> here is last lines from the build:
>
> -- Found HWLOC: /usr/lib/x86_64-linux-gnu/libhwloc.so (found suitable
> version "1.11.6", minimum required is "1.5")
> -- Looking for C++ include pthread.h
> -- Looking for C++ include pthread.h - found
> -- Atomic operations found
> -- Performing Test PTHREAD_SETAFFINITY
> -- Performing Test PTHREAD_SETAFFINITY - Success
> -- Adding work-around for issue compiling CUDA code with glibc 2.23
> string.h
> -- Check for working NVCC/C++ compiler combination with nvcc
> '/usr/local/cuda/bin/nvcc'
> -- Check for working NVCC/C compiler combination - broken
> -- /usr/local/cuda/bin/nvcc standard output: ''
> -- /usr/local/cuda/bin/nvcc standard error:
>  '/home/rlwoltz/protein_modeling/gromacs-2020/build/gcc-5: No such file or
> directory
> '
> CMake Error at cmake/gmxManageNvccConfig.cmake:189 (message):
>   CUDA compiler does not seem to be functional.
> Call Stack (most recent call first):
>   cmake/gmxManageGPU.cmake:207 (include)
>   CMakeLists.txt:577 (gmx_gpu_setup)
>
>
> -- Configuring incomplete, errors occurred!
> See also
>
> "/home/rlwoltz/protein_modeling/gromacs-2020/build/CMakeFiles/CMakeOutput.log".
> See also
>
> "/home/rlwoltz/protein_modeling/gromacs-2020/build/CMakeFiles/CMakeError.log".
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Error: Cannot find AVX 512F compiler flag

2020-01-15 Thread Szilárd Páll
Hi,

What hardware are you targeting? Unless you need AVX512 support, you could
just manually specify the appropriate setting in GMX_SIMD, e.g.
-DGMX_SIMD=AVX2_256 would be appropriate for most cases where AVX512 is not
supported.
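
For example, reusing your configure line with the SIMD level set explicitly
(just a sketch):

cmake .. -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_SIMD=AVX2_256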

Cheers,
--
Szilárd


On Wed, Jan 15, 2020 at 9:51 AM Shlomit Afgin 
wrote:

>
> Hi,
> I tried to install GROMACS 2019.5 on CentOS7,
> cmake .. -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON
>
> I have already installed devtoolset-6 and still get this error:
>
> -- Performing Test CXX_COMPILE_WORKS_WITHOUT_SPECIAL_FLAGS
> -- Performing Test CXX_COMPILE_WORKS_WITHOUT_SPECIAL_FLAGS - Failed
> -- Could not find any flag to build test source (this could be due to
> either the compiler or binutils)
> -- Could not identify number of AVX-512 units - detection program missing
> compilation prerequisites
> -- Could not run code to detect number of AVX-512 FMA units - assuming 2.
> -- Detected best SIMD instructions for this CPU - AVX_512
> CMake Error at cmake/gmxManageSimd.cmake:51 (message):
>   Cannot find AVX 512F compiler flag.  Use a newer compiler, or choose a
>   lower level of SIMD (slower).
> Call Stack (most recent call first):
>   cmake/gmxManageSimd.cmake:186
> (gmx_give_fatal_error_when_simd_support_not_found)
>   CMakeLists.txt:719 (gmx_manage_simd)
>
>
> Thanks,
> Shlomit
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Gromacs 2019 - Ryzen Architecture

2020-01-09 Thread Szilárd Páll
Good catch Kevin, that is likely an issue -- at least part of it.

Note that you can also use the mdrun -multidir functionality to avoid
having to manually manage mdrun process placement and pinning.
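
For example, with an MPI-enabled build and four member simulations (the
directory names below are just placeholders, each containing its own tpr):

mpirun -np 4 gmx_mpi mdrun -multidir sim1 sim2 sim3 sim4 -pin on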

Another aspect is that if you leave half of the CPU cores unused, the cores
in use can boost to a higher clock rate and therefore complete the CPU work
quicker; since part of this work does not overlap with the GPU, this affects
the fraction of time the GPU is idle (and hence also the time the GPU is
busy). For a fair comparison, run something on those otherwise idle cores
(at least a "stress -c 8", or possibly a CPU-only mdrun); generally this is
how we evaluate performance as a function of CPU cores per GPU.
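
As a rough sketch (assuming a 16-core/32-thread CPU; adjust thread counts and
offsets to your machine), that could look like:

gmx mdrun -deffnm rna0 -nt 16 -pin on -pinoffset 0 -gpu_id 0 -nb gpu -pme gpu &
stress -c 16   # keep the otherwise idle hardware threads busy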

Cheers,
--
Szilárd


On Sat, Jan 4, 2020 at 9:11 PM Kevin Boyd  wrote:

> Hi,
>
> A few things besides any Ryzen-specific issues. First, your pinoffset for
> the second one should be 16, not 17. The way yours is set up, you're
> running on cores 0-15, then Gromacs will detect that your second
> simulation parameters are invalid (because from cores 17-32, core 32 does
> not exist) and turn off core pinning. You can verify that in the log file.
>
> Second, 16 threads per simulation is overkill, and you can get gains from
> stealing from GPU down-time by running 2 simulations per GPU. So I would
> suggest something like
>
> mdrun -nt 8 -pin on -pinoffset 0 -gpu_id 0 &
> mdrun -nt 8 -pin on -pinoffset 8 -gpu_id 0 &
> mdrun -nt 8 -pin on -pinoffset 16 -gpu_id 1 &
> mdrun -nt 8 -pin on -pinoffset 24 -gpu_id 1
>
> might give you close to optimal performance.
>
> On Thu, Jan 2, 2020 at 5:32 AM Paul bauer  wrote:
>
> > Hello,
> >
> > we only added full detection and support for the newer Rizen chip-sets
> > with GROMACS 2019.5, so please try if the update to this version solves
> > your issue.
> > If not, please open an issue on redmine.gromacs.org so we can track the
> > problem and try to solve it.
> >
> > Cheers
> >
> > Paul
> >
> > On 02/01/2020 13:26, Sandro Wrzalek wrote:
> > > Hi,
> > >
> > > happy new year!
> > >
> > > Now to my problem:
> > >
> > > I use Gromacs 2019.3 and to try to run some simulations (roughly 30k
> > > atoms per system) on my PC which has the following configuration:
> > >
> > > CPU: Ryzen 3950X (overclocked to 4.1 GHz)
> > >
> > > GPU #1: Nvidia RTX 2080 Ti
> > >
> > > GPU #2: Nvidia RTX 2080 Ti
> > >
> > > RAM: 64 GB
> > >
> > > PSU: 1600 Watts
> > >
> > >
> > > Each run uses one GPU and 16 of 32 logical cores. Doing only one run
> > > at a time (gmx mdrun -deffnm rna0 -gpu_id 0 -nb gpu -pme gpu), the GPU
> > > utilization is roughly around 84% but if I add a second run, the
> > > utilization of both GPUs drops to roughly 20%, while leaving logical
> > > cores 17-32 idle (I changed parameter gpu_id, accordingly).
> > >
> > > Adding additional parameters for each run:
> > >
> > > gmx mdrun -deffnm rna0 -nt 16 -pin on -pinoffset 0 -gpu_id 0 -nb gpu
> > > -pme gpu
> > >
> > > gmx mdrun -deffnm rna0 -nt 16 -pin on -pinoffset 17 -gpu_id 1 -nb gpu
> > > -pme gpu
> > >
> > > I get a utilization of 78% per GPU, which is nice but not near the 84%
> > > I got with only one run. In theory, however, it should come at least
> > > close to that utilization.
> > >
> > > I suspect the Ryzen Chiplet design is the culprit, since Gromacs seems
> > > to prefer the first Chiplet, even if two simultaneous simulations
> > > have the resources to occupy both. The reason for the 78% utilization
> > > could be because of overhead between the two Chiplets via the infinity
> > > band. However, I have no proof, nor am I able to explain why gmx mdrun
> > > -deffnm rna0 -nt 16 -gpu_id 0 & 1 -nb gpu -pme gpu works as well -
> > > seems to occupy free logical cores then.
> > >
> > > Long story short:
> > >
> > > Are there any workarounds to squeeze the last bit out of my setup? Is
> > > it possible to choose the logical cores manually (I did not found
> > > anything in the docs so far)?
> > >
> > >
> > > Thank you for your help!
> > >
> > >
> > > Best,
> > >
> > > Sandro
> > >
> >
> > --
> > Paul Bauer, PhD
> > GROMACS Development Manager
> > KTH Stockholm, SciLifeLab
> > 0046737308594
> >
> > --
> > Gromacs Users mailing list
> >
> > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > posting!
> >
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >
> > * For (un)subscribe requests visit
> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > send a mail to gmx-users-requ...@gromacs.org.
> >
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

Re: [gmx-users] is GPU peer access(RDMA) supported with inter-node and gmx2020 mpi version?

2020-01-09 Thread Szilárd Páll
On Wed, Jan 8, 2020 at 5:00 PM Jimmy Chen  wrote:

> Hi,
>
> is GPU peer access(RDMA) supported with inter-node and gmx2020 mpi version
> on NVidia GPU?
>

No, that is currently not implemented.

Cheers,
--
Szilárd

or just work only in single-node with threadMPI via Nvidia GPU direct?
>
> Thanks,
> Jimmy
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Gromacs 2019.4 - cudaStreamSynchronize failed issue

2019-12-05 Thread Szilárd Páll
Can you please file an issue on redmine.gromacs.org and attach the inputs
that reproduce the behavior described?

--
Szilárd

On Wed, Dec 4, 2019, 21:35 Chenou Zhang  wrote:

> We did test that.
> Our cluster has total 11 GPU nodes and I ran 20 tests over all of them. 7
> out of the 20 tests did have the potential energy jump issue and they were
> running on 5 different nodes.
> So I tend to believe this issue happens on any of those nodes.
>
> On Wed, Dec 4, 2019 at 1:14 PM Szilárd Páll 
> wrote:
>
> > The fact that you are observing errors alo the energies to be off by so
> > much and that it reproduces with multiple inputs suggest that this may
> not
> > a code issue. Did you do all runs that failed on the same hardware? Have
> > you excluded the option that one of those GeForce cards may be flaky?
> >
> > --
> > Szilárd
> >
> >
> > On Wed, Dec 4, 2019 at 7:47 PM Chenou Zhang  wrote:
> >
> > > We tried the same gmx settings in 2019.4 with different protein
> systems.
> > > And we got the same weird potential energy jump  within 1000 steps.
> > >
> > > ```
> > >
> > > Step   Time
> > >   00.0
> > >  Energies (kJ/mol)
> > >BondU-BProper Dih.  Improper Dih.  CMAP
> > Dih.
> > > 2.08204e+049.92358e+046.53063e+041.06706e+03
> >  -2.75672e+02
> > >   LJ-14 Coulomb-14LJ (SR)   Coulomb (SR)   Coul.
> > recip.
> > > 1.50031e+04   -4.86857e+043.10386e+04   -1.09745e+06
> > 4.81832e+03
> > >   PotentialKinetic En.   Total Energy  Conserved En.
> > Temperature
> > >-9.09123e+052.80635e+05   -6.28487e+05   -6.28428e+05
> > 3.04667e+02
> > >  Pressure (bar)   Constr. rmsd
> > >-1.56013e+003.60634e-06
> > >
> > > DD  step 999 load imb.: force 14.6%  pme mesh/force 0.581
> > >Step   Time
> > >10002.0
> > >
> > > Energies (kJ/mol)
> > >BondU-BProper Dih.  Improper Dih.  CMAP
> > Dih.
> > > 2.04425e+049.92768e+046.52873e+041.02016e+03
> >  -2.45851e+02
> > >   LJ-14 Coulomb-14LJ (SR)   Coulomb (SR)   Coul.
> > recip.
> > > 1.49863e+04   -4.91092e+043.10572e+04   -1.09508e+06
> > 4.97942e+03
> > >   PotentialKinetic En.   Total Energy  Conserved En.
> > Temperature
> > > 1.35726e+352.77598e+051.35726e+351.35726e+35
> > 3.01370e+02
> > >  Pressure (bar)   Constr. rmsd
> > >-7.55250e+013.63239e-06
> > >
> > >  DD  step 1999 load imb.: force 16.1%  pme mesh/force 0.598
> > >Step   Time
> > >20004.0
> > >
> > > Energies (kJ/mol)
> > >BondU-BProper Dih.  Improper Dih.  CMAP
> > Dih.
> > > 1.99521e+049.97482e+046.49595e+041.00798e+03
> >  -2.42567e+02
> > >   LJ-14 Coulomb-14LJ (SR)   Coulomb (SR)   Coul.
> > recip.
> > > 1.50156e+04   -4.85324e+043.01944e+04   -1.09620e+06
> > 4.82958e+03
> > >   PotentialKinetic En.   Total Energy  Conserved En.
> > Temperature
> > > 1.35726e+352.79206e+051.35726e+351.35726e+35
> > 3.03115e+02
> > >  Pressure (bar)   Constr. rmsd
> > >-5.50508e+013.64353e-06
> > >
> > > DD  step 2999 load imb.: force 16.6%  pme mesh/force 0.602
> > >Step   Time
> > >30006.0
> > >
> > >
> > > Energies (kJ/mol)
> > >BondU-BProper Dih.  Improper Dih.  CMAP
> > Dih.
> > > 1.98590e+049.88100e+046.50934e+041.07048e+03
> >  -2.38831e+02
> > >   LJ-14 Coulomb-14LJ (SR)   Coulomb (SR)   Coul.
> > recip.
> > > 1.49609e+04   -4.93079e+043.12273e+04   -1.09582e+06
> > 4.83209e+03
> > >   PotentialKinetic En.   Total Energy  Conserved En.
> > Temperature
> > > 1.35726e+352.79438e+051.35726e+351.35726e+35
> > 3.03367e+02
> > >  Pressure (bar)   Constr. rmsd
> > > 7.62438e+013.61574e-06
> > >
> > > ```
> > >
> > > On Mon, Dec 2, 2019 at 2:13 PM Mark Abraham 
> > > wrote:
> > >
> > > > Hi,
> > > >
>

Re: [gmx-users] Gromacs 2019.4 - cudaStreamSynchronize failed issue

2019-12-04 Thread Szilárd Páll
The fact that you are observing errors alo the energies to be off by so
much and that it reproduces with multiple inputs suggest that this may not
a code issue. Did you do all runs that failed on the same hardware? Have
you excluded the option that one of those GeForce cards may be flaky?

--
Szilárd


On Wed, Dec 4, 2019 at 7:47 PM Chenou Zhang  wrote:

> We tried the same gmx settings in 2019.4 with different protein systems.
> And we got the same weird potential energy jump  within 1000 steps.
>
> ```
>
> Step   Time
>   00.0
>  Energies (kJ/mol)
>BondU-BProper Dih.  Improper Dih.  CMAP Dih.
> 2.08204e+049.92358e+046.53063e+041.06706e+03   -2.75672e+02
>   LJ-14 Coulomb-14LJ (SR)   Coulomb (SR)   Coul. recip.
> 1.50031e+04   -4.86857e+043.10386e+04   -1.09745e+064.81832e+03
>   PotentialKinetic En.   Total Energy  Conserved En.Temperature
>-9.09123e+052.80635e+05   -6.28487e+05   -6.28428e+053.04667e+02
>  Pressure (bar)   Constr. rmsd
>-1.56013e+003.60634e-06
>
> DD  step 999 load imb.: force 14.6%  pme mesh/force 0.581
>Step   Time
>10002.0
>
> Energies (kJ/mol)
>BondU-BProper Dih.  Improper Dih.  CMAP Dih.
> 2.04425e+049.92768e+046.52873e+041.02016e+03   -2.45851e+02
>   LJ-14 Coulomb-14LJ (SR)   Coulomb (SR)   Coul. recip.
> 1.49863e+04   -4.91092e+043.10572e+04   -1.09508e+064.97942e+03
>   PotentialKinetic En.   Total Energy  Conserved En.Temperature
> 1.35726e+352.77598e+051.35726e+351.35726e+353.01370e+02
>  Pressure (bar)   Constr. rmsd
>-7.55250e+013.63239e-06
>
>  DD  step 1999 load imb.: force 16.1%  pme mesh/force 0.598
>Step   Time
>20004.0
>
> Energies (kJ/mol)
>BondU-BProper Dih.  Improper Dih.  CMAP Dih.
> 1.99521e+049.97482e+046.49595e+041.00798e+03   -2.42567e+02
>   LJ-14 Coulomb-14LJ (SR)   Coulomb (SR)   Coul. recip.
> 1.50156e+04   -4.85324e+043.01944e+04   -1.09620e+064.82958e+03
>   PotentialKinetic En.   Total Energy  Conserved En.Temperature
> 1.35726e+352.79206e+051.35726e+351.35726e+353.03115e+02
>  Pressure (bar)   Constr. rmsd
>-5.50508e+013.64353e-06
>
> DD  step 2999 load imb.: force 16.6%  pme mesh/force 0.602
>Step   Time
>30006.0
>
>
> Energies (kJ/mol)
>BondU-BProper Dih.  Improper Dih.  CMAP Dih.
> 1.98590e+049.88100e+046.50934e+041.07048e+03   -2.38831e+02
>   LJ-14 Coulomb-14LJ (SR)   Coulomb (SR)   Coul. recip.
> 1.49609e+04   -4.93079e+043.12273e+04   -1.09582e+064.83209e+03
>   PotentialKinetic En.   Total Energy  Conserved En.Temperature
> 1.35726e+352.79438e+051.35726e+351.35726e+353.03367e+02
>  Pressure (bar)   Constr. rmsd
> 7.62438e+013.61574e-06
>
> ```
>
> On Mon, Dec 2, 2019 at 2:13 PM Mark Abraham 
> wrote:
>
> > Hi,
> >
> > What driver version is reported in the respective log files? Does the
> error
> > persist if mdrun -notunepme is used?
> >
> > Mark
> >
> > On Mon., 2 Dec. 2019, 21:18 Chenou Zhang,  wrote:
> >
> > > Hi Gromacs developers,
> > >
> > > I'm currently running gromacs 2019.4 on our university's HPC cluster.
> To
> > > fully utilize the GPU nodes, I followed notes on
> > >
> > >
> >
> http://manual.gromacs.org/documentation/current/user-guide/mdrun-performance.html
> > > .
> > >
> > >
> > > And here is the command I used for my runs.
> > > ```
> > > gmx mdrun -v -s $TPR -deffnm md_seed_fixed -ntmpi 8 -pin on -nb gpu
> > -ntomp
> > > 3 -pme gpu -pmefft gpu -npme 1 -gputasks 00112233 -maxh $HOURS -cpt 60
> > -cpi
> > > -noappend
> > > ```
> > >
> > > And for some of those runs, they might fail with the following error:
> > > ```
> > > ---
> > >
> > > Program: gmx mdrun, version 2019.4
> > >
> > > Source file: src/gromacs/gpu_utils/cudautils.cuh (line 229)
> > >
> > > MPI rank:3 (out of 8)
> > >
> > >
> > >
> > > Fatal error:
> > >
> > > cudaStreamSynchronize failed: an illegal memory access was encountered
> > >
> > >
> > >
> > > For more information and tips for troubleshooting, please check the
> > GROMACS
> > >
> > > website at http://www.gromacs.org/Documentation/Errors
> > > ```
> > >
> > > we also had a different error from slurm system:
> > > ```
> > > ^Mstep 4400: timed with pme grid 96 96 60, coulomb cutoff 1.446: 467.9
> > > M-cycles
> > > ^Mstep 4600: timed with pme grid 96 96 64, coulomb cutoff 1.372: 451.4
> > > M-cycles
> > > /var/spool/slurmd/job2321134/slurm_script: line 44: 29866 Segmentation
> > > fault  gmx mdrun -v -s $TPR -deffnm 

Re: [gmx-users] Fatal Error when launching gromacs 2019.2 on GPU.

2019-10-28 Thread Szilárd Páll
Hi,

Indeed, the standard way provided by CUDA to expose a subset of GPUs to an
application is the CUDA_VISIBLE_DEVICES (note the "S" ending); I did not
realize that is something you were interested in, I thought you wanted to
avoid using GPUs.

Also note (for anyone interested) that when using a queue system, if node
sharing is allowed, the scheduler should be set up to set a correct
CUDA_VISIBLE_DEVICES variable as well.

Cheers,
--
Szilárd


On Mon, Oct 28, 2019 at 1:43 PM Artem Shekhovtsov 
wrote:

> Hi,
>
> Thanks, setting this variable allowed me to start GROMACS without errors
> using the CPU.
> The problem is that this method prevents me from using other free GPUs on
> the host, but I would like to do this.
> I also found out that setting the CUDA_VISIBLE_DEVICES variable to the
> available GPUs at launch time solves this problem.
>
> Artem
>
>
> On Sat, Oct 26, 2019 at 1:50 AM Szilárd Páll 
> wrote:
>
> > Hi,
> >
> > This is an issue in one of pre-detection checks that trips due to
> > encountering exclusive / prohibited mode devices.
> >
> > You can work around this by entirely disabling the detection using the
> > GMX_DISABLE_GPU_DETECTION environment variable.
> >
> > Cheers,
> > --
> > Szilárd
> >
> >
> > On Thu, Oct 17, 2019 at 5:01 PM Artem Shekhovtsov <
> > job.shekhovt...@gmail.com>
> > wrote:
> >
> > > Hello!
> > > Problem: The launch of mdrun that does not require video cards exit
> with
> > > fatal error if at least one video card is busy on the host at that
> time.
> > > gmx mdrun -deffnm test -ntmpi 1 -ntomp 1 -nb cpu -bonded cpu
> > > ---
> > > Program: gmx mdrun, version 2019.2
> > > Source file: src/gromacs/gpu_utils/gpu_utils.cu (line 100)
> > >
> > > Fatal error:
> > > cudaFuncGetAttributes failed: all CUDA-capable devices are busy or
> > > unavailable
> > >
> > > For more information and tips for troubleshooting, please check the
> > GROMACS
> > > website at http://www.gromacs.org/Documentation/Errors
> > > ---
> > >
> > > I have this error in gromacs version 2019.2, 2019.3, 2020.beta.
> > > Version - 2018.6 is not affected.
> > > All version builds with the same flags.
> > >
> > > Archive with log files and gromacs build files
> > >
> > >
> >
> https://drive.google.com/file/d/1ahn7S69CU5yvAPlLWHryXmMzcGfdWVxP/view?usp=sharing
> > >
> > >
> > > I would appreciate any help.
> > >
> > > Thanks,
> > > Artem Shekhovtsov.
> > > --
> > > Gromacs Users mailing list
> > >
> > > * Please search the archive at
> > > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > > posting!
> > >
> > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > >
> > > * For (un)subscribe requests visit
> > > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > > send a mail to gmx-users-requ...@gromacs.org.
> > >
> > --
> > Gromacs Users mailing list
> >
> > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > posting!
> >
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >
> > * For (un)subscribe requests visit
> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > send a mail to gmx-users-requ...@gromacs.org.
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Reg: GPU use

2019-10-28 Thread Szilárd Páll
Dear Bidhan Chandra Garain,

Please share the log files of your benchmarks, that will help us better
identify if there is an issue and what the issue is.

Thanks,
--
Szilárd


On Mon, Oct 28, 2019 at 8:51 AM Bidhan Chandra Garain 
wrote:

> Respected Sir,
> In my lab we have recently installed a Tesla V100-PCIE GPU and installed
> gromacs-2018.4. I tried to check its performance. I ran a 128-DMPC lipid
> simulation with 23453 atoms on my lab CPU with 4 processors and 2 threads
> per processor. There its performance is ~8 ns/day. But when I tried to run
> the same job on the GPU using the command "gmx_mpi mdrun -v -deffnm npt
> -nb gpu", not only is it taking 40 CPU processors, it is also very slow.
> Performance is ~1 ns/day. Can you please suggest why it is taking 40 CPU
> processors and how to get the maximum performance?
>
> Thanking you in advance.
>
> Sincerely,
> Bidhan Chandra Garain
> Ph.D. Student, JNCASR
> Bangalore, India
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Fatal Error when launching gromacs 2019.2 on GPU.

2019-10-25 Thread Szilárd Páll
Hi,

This is an issue in one of the pre-detection checks that trips due to
encountering exclusive / prohibited mode devices.

You can work around this by entirely disabling the detection using the
GMX_DISABLE_GPU_DETECTION environment variable.
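
For example, using your command line (setting the variable to any value should
be enough):

GMX_DISABLE_GPU_DETECTION=1 gmx mdrun -deffnm test -ntmpi 1 -ntomp 1 -nb cpu -bonded cpu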

Cheers,
--
Szilárd


On Thu, Oct 17, 2019 at 5:01 PM Artem Shekhovtsov 
wrote:

> Hello!
> Problem: The launch of mdrun that does not require video cards exits with a
> fatal error if at least one video card is busy on the host at that time.
> gmx mdrun -deffnm test -ntmpi 1 -ntomp 1 -nb cpu -bonded cpu
> ---
> Program: gmx mdrun, version 2019.2
> Source file: src/gromacs/gpu_utils/gpu_utils.cu (line 100)
>
> Fatal error:
> cudaFuncGetAttributes failed: all CUDA-capable devices are busy or
> unavailable
>
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> ---
>
> I have this error in gromacs version 2019.2, 2019.3, 2020.beta.
> Version - 2018.6 is not affected.
> All version builds with the same flags.
>
> Archive with log files and gromacs build files
>
> https://drive.google.com/file/d/1ahn7S69CU5yvAPlLWHryXmMzcGfdWVxP/view?usp=sharing
>
>
> I would appreciate any help.
>
> Thanks,
> Artem Shekhovtsov.
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Fatal Error when launching gromacs 2019.2 on GPU.

2019-10-22 Thread Szilárd Páll
Hi,

Can you please file an issue on redmine.gromacs.org with the description
you gave here?

Thanks,
--
Szilárd


On Thu, Oct 17, 2019 at 5:01 PM Artem Shekhovtsov 
wrote:

> Hello!
> Problem: The launch of mdrun that does not require video cards exits with a
> fatal error if at least one video card is busy on the host at that time.
> gmx mdrun -deffnm test -ntmpi 1 -ntomp 1 -nb cpu -bonded cpu
> ---
> Program: gmx mdrun, version 2019.2
> Source file: src/gromacs/gpu_utils/gpu_utils.cu (line 100)
>
> Fatal error:
> cudaFuncGetAttributes failed: all CUDA-capable devices are busy or
> unavailable
>
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> ---
>
> I have this error in gromacs version 2019.2, 2019.3, 2020.beta.
> Version - 2018.6 is not affected.
> All version builds with the same flags.
>
> Archive with log files and gromacs build files
>
> https://drive.google.com/file/d/1ahn7S69CU5yvAPlLWHryXmMzcGfdWVxP/view?usp=sharing
>
>
> I would appreciate any help.
>
> Thanks,
> Artem Shekhovtsov.
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] GROMACS showing error

2019-10-22 Thread Szilárd Páll
Hi,

Please direct GROMACS usage questions to the users' list. Replying there,
make sure you are subscribed and continue the conversation there.

The issue is that you requested static library detection, but the hwloc
library dependencies are not correctly added to the GROMACS link
dependencies. There are a few workarounds:
- avoid  -DGMX_PREFER_STATIC_LIBS=ON
- use dynamic libs for hwloc (e.g. passing -DHWLOC_hwloc_LIBRARY manually)
- if you prefer to stick to statically linked external libraries and the
above don't work out, you can turn off hwloc support (-DGMX_HWLOC=OFF)
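
For the last option, for example (a sketch based on your original configure
line, keeping the static-library preference but disabling hwloc; adjust the
compiler names and install prefix to your system):

cmake .. -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_OPENMP=ON -DGMX_GPU=ON \
  -DGMX_BUILD_OWN_FFTW=ON -DGMX_PREFER_STATIC_LIBS=ON -DGMX_MPI=ON \
  -DGMX_HWLOC=OFF -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx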

Cheers,
--
Szilárd


On Fri, Oct 18, 2019 at 11:09 PM Shradheya R.R. Gupta <
shradheyagu...@gmail.com> wrote:

> Respected sir,
>
> While installing Gromacs 2019.4 with GPU+MPI  I got the error at linking
> of MPI.
>
> *commands:-*
>
> mkdir build
>
> cd build
>
> cmake .. -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_OPENMP=ON -DGMX_GPU=ON
> -DGMX_BUILD_OWN_FFTW=ON -DGMX_PREFER_STATIC_LIBS=ON
> -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX= -DGMX_MPI=ON
> -DGMX_BUILD_UNITTESTS=ON -DCMAKE_C_COMPILER=MPICC
> -DCMAKE_CXX_COMPILER=mpicxx
>
> make (completed successfully)
>
> sudo make install
>
> After 98% completion it showed the error
>
> [image: IMG_20191017_191549.jpg]
>
>
> Sir, please suggest how can I resolve it, eagerly waiting for your reply.
> Thank you
>
> Shradheya R.R. Gupta
> Bioinformatics Infrastructure Facility- DBT - Government of India
>  University of Rajasthan,India
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Question about default auto setting of mdrun -pin

2019-10-18 Thread Szilárd Páll
On Fri, Oct 18, 2019 at 4:36 PM  wrote:

> On Thu, Oct 17, 2019 at 10:34:39AM +, Kutzner, Carsten wrote:
> >
> > is it intended that the thread-MPI version of mdrun 2018 does pin to its
> core
> > if started with -nt 1 -pin auto?
>

No, I don't think that's intended.


>
> I think I have a (partial) idea about what's happening.
> get_thread_affinity_layout() performs various checks to determin if
> pinning is possible, and for -pin auto, it looks at the number of
> mdrun threads vs hardware threads. However, the number of mdrun
> threads ultimately comes from gmx_omp_nthreads_get().
> Of course, if OMP_NUM_THREADS is unset, OpenMP will default to the
> whole machine, and the above check will always succeed. And sure
> enough, setting OMP_NUM_THREADS to 1 will turn off pinning in our test
> case.
>

The source of the total number of hardware threads should not be OpenMP but a
system call or hwloc -- which should be (and was, IIRC, at least a while ago)
checked against the gmx_omp_nthreads_get() value.

If however the behavior you describe is actually reproducible, please do
file a bug report.

--
Szilárd


>
> A.
>
> --
> Ansgar Esztermann
> Sysadmin Dep. Theoretical and Computational Biophysics
> http://www.mpibpc.mpg.de/grubmueller/esztermann
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Help with a failing test - gromacs 2019.4 - Test 42

2019-10-16 Thread Szilárd Páll
Hi,

The issue is an internal error triggered by the domain decomposition not
liking the 14 cores in your CPU, which leads to a rank count with a large
prime factor.
To ensure the tests pass I suggest trying to force only one device to be
used in make check, e.g. CUDA_VISIBLE_DEVICES=0 make check; alternatively
you can run the regressiontests manually.

Cheers,
--
Szilárd


On Thu, Oct 10, 2019 at 6:01 PM Raymond Arter 
wrote:

> Hi,
>
> When performing a "make check" on Gromacs 2019.4, I'm getting test 42
> failing.
> It gives the error:
>
> Mdrun cannot use the requested (or automatic) number of ranks,
> retrying with 8
>
> And the mdrun.out and md.log of swap_x reports:
>
> The number of ranks you selected (14) contains a large prime factor
> 7.
>
> I've included the necessary parts of the logs below. Any help would be
> appreciated
> since I haven't come across this error before.
>
> Regards,
>
> T.
>
>
> CentOS Linux release 7.6.1810 (Core)
> CPU: Intel Xeon Gold 6132
> Tesla V100
> Cuda: 10.1
> Driver: 418.40.04
>
> Output of "make check"
>
> 42/46 Test #42: regressiontests/complex .***Failed  145.88 sec
>
> GROMACS:  gmx mdrun, version 2019.4
> Executable:   /gromacs/2019.4/gromacs-2019.4/build/bin/gmx
> Data prefix:  /gromacs/2019.4/gromacs-2019.4 (source tree)
> Working dir:
>  /gromacs/2019.4/gromacs-2019.4/build/tests/regressiontests-2019.4
> Command line:
>   gmx mdrun -h
>
> Thanx for Using GROMACS - Have a Nice Day
>
> Mdrun cannot use the requested (or automatic) number of ranks, retrying
> with 8.
>
> Abnormal return value for ' gmx mdrun-nb cpu   -notunepme >mdrun.out
> 2>&1' was 1
> Retrying mdrun with better settings...
> Re-running orientation-restraints using CPU-based PME
> Re-running pull_geometry_angle using CPU-based PME
> Re-running pull_geometry_angle-axis using CPU-based PME
> Re-running pull_geometry_dihedral using CPU-based PME
>
> Abnormal return value for ' gmx mdrun   -notunepme >mdrun.out 2>&1' was
> -1
> FAILED. Check mdrun.out, md.log file(s) in swap_x for swap_x
>
> Abnormal return value for ' gmx mdrun   -notunepme >mdrun.out 2>&1' was
> -1
> FAILED. Check mdrun.out, md.log file(s) in swap_y for swap_y
>
> Abnormal return value for ' gmx mdrun   -notunepme >mdrun.out 2>&1' was
> -1
> FAILED. Check mdrun.out, md.log file(s) in swap_z for swap_z
> 3 out of 55 complex tests FAILED
>
> From the following directory:
>
> /gromacs/2019.4/gromacs-2019.4/build/tests/regressiontests-2019.4/complex/swap_x
> and I get the same errors for swap_y and swap_z
>
> == mdrun.out ==
>
> GROMACS:  gmx mdrun, version 2019.4
> Executable:   /gromacs/2019.4/gromacs-2019.4/build/bin/gmx
> Data prefix:  /gromacs/2019.4/gromacs-2019.4 (source tree)
> Working dir:
>
>  
> /gromacs/2019.4/gromacs-2019.4/build/tests/regressiontests-2019.4/complex/swap_x
> Command line:
>   gmx mdrun -notunepme
>
> Reading file topol.tpr, VERSION 2019.4 (single precision)
> Changing nstlist from 10 to 50, rlist from 1.011 to 1.137
>
> ---
> Program: gmx mdrun, version 2019.4
> Source file: src/gromacs/domdec/domdec_setup.cpp (line 764)
> MPI rank:0 (out of 14)
>
> Fatal error:
> The number of ranks you selected (14) contains a large prime factor 7. In
> most
> cases this will lead to bad performance. Choose a number with smaller prime
> factors or set the decomposition (option -dd) manually.
>
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> ---
>
> == md.log ==
>
> Changing nstlist from 10 to 50, rlist from 1.011 to 1.137
>
> Initializing Domain Decomposition on 14 ranks
> Dynamic load balancing: locked
> Minimum cell size due to atom displacement: 0.692 nm
> Initial maximum distances in bonded interactions:
> two-body bonded interactions: 0.403 nm, Exclusion, atoms 184 187
>   multi-body bonded interactions: 0.403 nm, Ryckaert-Bell., atoms 184 187
> Minimum cell size due to bonded interactions: 0.443 nm
> Maximum distance for 3 constraints, at 120 deg. angles, all-trans: 0.459 nm
> Estimated maximum distance required for P-LINCS: 0.459 nm
>
> ---
> Program: gmx mdrun, version 2019.4
> Source file: src/gromacs/domdec/domdec_setup.cpp (line 764)
> MPI rank:0 (out of 14)
>
> Fatal error:
> The number of ranks you selected (14) contains a large prime factor 7. In
> most
> cases this will lead to bad performance. Choose a number with smaller prime
> factors or set the decomposition (option -dd) manually.
>
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> ---
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> 

Re: [gmx-users] [Performance] poor performance with NV V100

2019-10-16 Thread Szilárd Páll
Hi,

Please keep the conversation on the mailing list.

GROMACS uses both CPUs and GPUs for computation. Your runs limit core count
per rank, and do so in a way that leaves the rest of the cores idle. This
is not a suitable approach for realistic benchmarking, because clock
boosting will skew your scaling results.

Secondly, you should consider using PME offload as well; see the docs and
previous discussions on the list for how to do so.
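As a rough sketch of what PME offload looks like on the command line (the rank
and thread counts here are placeholders to be tuned for your node, not
recommendations):

gmx mdrun -ntmpi 4 -ntomp 4 -npme 1 -nb gpu -pme gpu -pin on -s topol.tpr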

Last, if you are evaluating hardware for some use-cases, do make sure you
set up your benchmarks such that they reflect the intended use cases (e.g.
scaling vs throughput), and please check out the best practices for how to
run GROMACS on GPU servers.

You might also be interested in a recent study we did:
https://onlinelibrary.wiley.com/doi/full/10.1002/jcc.26011

Cheers,

--
Szilárd


On Tue, Oct 8, 2019 at 3:00 PM Jimmy Chen  wrote:

> Hi Szilard,
>
> Thanks for your help.
> Is md.log enough for you to clarify where the bottleneck is located?
> If you need another log, please let me know.
>
> I just checked the release note of 2019.4, I didn't see any major release
> impact the performance of intra-node.
>
> http://manual.gromacs.org/documentation/2020-beta1/release-notes/2019/2019.4.html
>
> anyway, I will have a try on 2019.4 later.
>
> looking forward to check new feature which will be on 2/3 beta release of
> 2020.
>
> Best regards,
> Jimmy
>
>
> Szilárd Páll  於 2019年10月8日 週二 下午8:34寫道:
>
>> Hi,
>>
>> Can you please share your log files? we may be able to help with spotting
>> performance issues or bottlenecks.
>> However, note that NVIDIA themselves are the best source of help with
>> reproducing their benchmark numbers.
>>
>> Scaling across multiple GPUs requires some tuning of command line options,
>> please see the related discussion on the list (briefly: use multiple
>> ranks
>> per GPU, and one separate PME rank with GPU offload).
>>
>> Also note that intra-node strong scaling has not been an optimization target of recent
>> releases (there are no p2p optimizations either); however, new features
>> going into the 2020 release will improve things significantly. Keep an eye
>> out on the beta2/3 releases if you are interested in checking out the new
>> features.
>>
>> Cheers,
>> --
>> Szilárd
>>
>>
>> On Mon, Oct 7, 2019 at 7:48 AM Jimmy Chen  wrote:
>>
>> > Hi,
>> >
>> > I'm using NV v100 to evaluate if it's suitable to do purchase.
>> > But I can't get similar test result as referenced performance data
>> > which was got from internet.
>> > https://developer.nvidia.com/hpc-application-performance
>> >
>> >
>> https://www.hpc.co.jp/images/pdf/benchmark/Molecular-Dynamics-March-2018.pdf
>> >
>> >
>> > No matter using docker tag 18.02 from
>> > https://ngc.nvidia.com/catalog/containers/hpc:gromacs/tags
>> >
>> > or gromacs source code from
>> > ftp://ftp.gromacs.org/pub/gromacs/gromacs-2019.3.tar.gz
>> >
>> > test data set is ADH dodec and water 1.5M
>> > gmx grompp -f pme_verlet.mdp
>> > gmx mdrun -ntmpi 1 -nb gpu -pin on -v -noconfout -nsteps 5000 -s
>> topol.tpr
>> > -ntomp 4
>> > and  gmx mdrun -ntmpi 2 -nb gpu -pin on -v -noconfout -nsteps 5000 -s
>> > topol.tpr -ntomp 4
>> >
>> > My CPU is Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz
>> > and GPU is NV V100 16GB PCIE.
>> >
>> > For ADH dodec,
>> > The perf data of 2xV100 16GB PCIE in
>> > https://developer.nvidia.com/hpc-application-performance is 176
>> (ns/day).
>> > But I only can get 28 (ns/day). actually I can get 67(ns/day) with
>> 1xV100.
>> > I don't know why I got poorer result with 2xV100.
>> >
>> > For water 1.5M
>> > The perf data of 1xV100 16GB PCIE in
>> >
>> >
>> https://www.hpc.co.jp/images/pdf/benchmark/Molecular-Dynamics-March-2018.pdf
>> > is
>> > 9.83(ns/day) and 2xV100 is 10.41(ns/day).
>> > But what I got is 6.5(ns/day) with 1xV100 and 2(ns/day) with 2xV100.
>> >
>> > Could anyone give me some suggestions about how to clarify what's
>> problem
>> > to result to this perf data in my environment? Is my command to perform
>> the
>> > testing wrong? any suggested command to perform the testing?
>> > or which source code version is recommended to use now?
>> >
>> > btw, after checking the code, it seems MPI doesn't go through PCIE P2p
>> or
>> > RDMA, is it correct? any plan to implement this in MPI?
>> >
>> > Best regard

Re: [gmx-users] [Performance] poor performance with NV V100

2019-10-08 Thread Szilárd Páll
Hi,

Can you please share your log files? we may be able to help with spotting
performance issues or bottlenecks.
However, note that NVIDIA themselves are the best source of help with
reproducing their benchmark numbers.

Scaling across multiple GPUs requires some tuning of command line options,
please see the related discussion on the list (briefly: use multiple ranks
per GPU, and one separate PME rank with GPU offload).
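For two GPUs, a starting point along these lines is worth trying (the rank and
thread counts are assumptions to be tuned, not measured best settings):

gmx mdrun -ntmpi 4 -ntomp 4 -npme 1 -nb gpu -pme gpu -gputasks 0011 -pin on -s topol.tpr

i.e. two ranks mapped to each GPU, with one of the ranks dedicated to PME.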

Also note that intra-node strong scaling has not been an optimization target of recent
releases (there are no p2p optimizations either); however, new features
going into the 2020 release will improve things significantly. Keep an eye
out on the beta2/3 releases if you are interested in checking out the new
features.

Cheers,
--
Szilárd


On Mon, Oct 7, 2019 at 7:48 AM Jimmy Chen  wrote:

> Hi,
>
> I'm using NV v100 to evaluate if it's suitable to do purchase.
> But I can't get similar test result as referenced performance data
> which was got from internet.
> https://developer.nvidia.com/hpc-application-performance
>
> https://www.hpc.co.jp/images/pdf/benchmark/Molecular-Dynamics-March-2018.pdf
>
>
> No matter using docker tag 18.02 from
> https://ngc.nvidia.com/catalog/containers/hpc:gromacs/tags
>
> or gromacs source code from
> ftp://ftp.gromacs.org/pub/gromacs/gromacs-2019.3.tar.gz
>
> test data set is ADH dodec and water 1.5M
> gmx grompp -f pme_verlet.mdp
> gmx mdrun -ntmpi 1 -nb gpu -pin on -v -noconfout -nsteps 5000 -s topol.tpr
> -ntomp 4
> and  gmx mdrun -ntmpi 2 -nb gpu -pin on -v -noconfout -nsteps 5000 -s
> topol.tpr -ntomp 4
>
> My CPU is Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz
> and GPU is NV V100 16GB PCIE.
>
> For ADH dodec,
> The perf data of 2xV100 16GB PCIE in
> https://developer.nvidia.com/hpc-application-performance is 176 (ns/day).
> But I only can get 28 (ns/day). actually I can get 67(ns/day) with 1xV100.
> I don't know why I got poorer result with 2xV100.
>
> For water 1.5M
> The perf data of 1xV100 16GB PCIE in
>
> https://www.hpc.co.jp/images/pdf/benchmark/Molecular-Dynamics-March-2018.pdf
> is
> 9.83(ns/day) and 2xV100 is 10.41(ns/day).
> But what I got is 6.5(ns/day) with 1xV100 and 2(ns/day) with 2xV100.
>
> Could anyone give me some suggestions about how to clarify what's problem
> to result to this perf data in my environment? Is my command to perform the
> testing wrong? any suggested command to perform the testing?
> or which source code version is recommended to use now?
>
> btw, after checking the code, it seems MPI doesn't go through PCIE P2p or
> RDMA, is it correct? any plan to implement this in MPI?
>
> Best regards,
> Jimmy
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] SIMD options - detection program issue

2019-09-20 Thread Szilárd Páll
Hi,

Good to know your system instability issues were resolved.

(As a side-note you could have tried to use elrepo which has newer kernels
for CentOS.)

The SIMD detection should, however, not be failing; can you please file an
issue on redmine.gromacs.org with your cmake invocation, CMakeCache.txt, and
CMake logs attached? Also, please make sure you are using the latest patch
release, that is, 2019.3.

Cheers,
--
Szilárd


On Wed, Sep 18, 2019 at 9:17 AM Stefano Guglielmo <
stefano.guglie...@unito.it> wrote:

>  Hi all,
> an update, hopefully the last one, I have been annoying you for too long.
> I decided to replace centOS with Mint 19.2. I compiled Gromacs 2019.2,
> setting -DGMX_SIMD=AUTO resulted again in the same error of compilation of
> detect program (Did not detect build CPU vendor - detection program did not
> compile - Detection for best SIMD instructions failed, using SIMD - None --
> SIMD instructions disabled). So I set manually -DGMX_SIMD to avx2_128 or
> avx2_256. The compilation worked fine and I tried to run the two
> simulations in parallel on the two gpus
> (gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> -gputasks 00 -pin on -pinoffset 0 -pinstride 1
> plus
> gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> -gputasks 11 -pin on -pinoffset 28 -pinstride 1)
> and in both cases the system proved stable without any crash. Maybe the old
> kernel of centos (3.10) was not that smart in managing the cpu (I also
> found some posts on the web regarding issues of threadripper 2990wx and
> centos).
> Still I can not find an explanation of the compilation error of detect
> program.
> Anyway, thanks to all of you for sharing suggestions and opinions,
> Stefano
>
>
> --
> Stefano GUGLIELMO PhD
> Assistant Professor of Medicinal Chemistry
> Department of Drug Science and Technology
> Via P. Giuria 9
> 10125 Turin, ITALY
> ph. +39 (0)11 6707178
>
>
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Tesla GPUs: P40 or P100?

2019-09-19 Thread Szilárd Páll
Hi,

I strongly recommend the Quadro RTX series,  6000 or 5000. These should not
be a lot more expensive, but will be a lot faster than the Pascal
generation cards. For comparisons see our recent paper:
https://doi.org/10.1002/jcc.26011

Cheers,
--
Szilárd

On Thu, Sep 19, 2019, 09:50 Matteo Tiberti  wrote:

> Hi all,
>
> we are considering getting a new server, mainly for GROMACS and for other
> CPU-intensive tasks.
>
> Unfortunately we are unable to buy consumer GPUs and we need to get Teslas
> to accelerate GROMACS and possibly for other MD workload in the future.
> Both P40 and P100 fit our budget, and I'd be inclined towards the P40 for
> the slightly better single-precision performance and larger memory. The P40
> has however lower bandwidth and much lower double-precision performance
> respect
> to the P100 (it's more similar to a consumer GPU in this extent), which
> shouldn't matter as far as GROMACS is concerned right now. I've seen some
> talk in the mailing list about implementing mixed/fixed precision modes in
> GROMACS, and for what I gathered it's unlikely to happen anytime soon, so I
> believe the P40 to be a future-proof choice (at least in the short-medium
> term).
>
> This said, I feel like the P40 isn't getting much recognition both in the
> mailing list and in the "bang for your bucks" papers - so my question boils
> down to, is there any reason we should prefer the P100 card over the P40?
>
> Thanks for your help!
>
> Matteo
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Fwd: SIMD options

2019-09-12 Thread Szilárd Páll
On Thu, Sep 12, 2019 at 4:05 PM Stefano Guglielmo
 wrote:
>
> As an update, I have just tried a run with cpu only after compiling with
> AVX2_128 and the workstation turned off after few minutes.

That is suspicious. Perhaps a CPU cooling issue? Otherwise you may
have a BIOS/firmware issue or, less likely but possibly, a faulty
CPU.

--
Szilárd

>
> Il giorno gio 12 set 2019 alle ore 15:19 Stefano Guglielmo <
> stefano.guglie...@unito.it> ha scritto:
>
> > Hi Szilard,
> > thanks for your reply.
> > The compiler is gcc 4.8.5.
> > I put below the link where you can find the files coming from cmake and
> > the output for "AUTO" SIMD instruction. As for cpu only, as you had
> > suggested previously I tried a run (after compiling with AVX2_256) and it
> > worked without any problems for about 5 hours. I will try with AVX2_128 as
> > well.
> >
> > Stefano
> >
> > https://www.dropbox.com/sh/rh3gdpoxrbqzdfx/AACXoXZ8Zw1-ItD9lnrDr1TNa?dl=0
> >
> >
> >
> > Il giorno gio 12 set 2019 alle ore 14:28 Szilárd Páll <
> > pall.szil...@gmail.com> ha scritto:
> >
> >> On Thu, Sep 12, 2019 at 8:58 AM Stefano Guglielmo
> >>  wrote:
> >> >
> >> > I apologize for the mistake, there was a typo in the object that could
> >> be
> >> > misleading, so I re-post with the correct object,
> >> > sorry.
> >> >
> >> > -- Forwarded message -
> >> > Da: Stefano Guglielmo 
> >> > Date: mer 11 set 2019 alle ore 17:17
> >> > Subject: SMD options
> >> > To: 
> >> >
> >> >
> >> > Hi all,
> >> > following my previous post regarding anomalous crashing of the system on
> >> > running gromacs on two gpus, I have some new elements to add.
> >> > I tested the workstation with two tools for gpu and cpu I found on the
> >> web
> >> > (gpu_burn and stress); I ran the two of them at the same time for two
> >> hours
> >> > pushing both gpus (2 x 250 W) and cpu (250 W), and no error reports or
> >> > overheating have resulted, so I would say that the hardware seems to be
> >> > stable.
> >> > Despite this, I found something that perhaps could be not normal during
> >> > gromacs compilation: setting -DGMX_SIMD=AUTO, results in "SIMD
> >> > instructions:  NONE"; in this condition I can run gromacs without any
> >> > unexpected crash, even though a little less efficiently; setting
> >> -DGMX_SIMD
> >> > to AVX_256, AVX2_128 and AVX2_256 produces clean compilation and
> >> > installation (with all the tests passed), but on running the sudden
> >> turning
> >> > off happens in the conditions I had described in the previous posts
> >> > (gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> >> > -gputasks 00 -pin on -pinoffset 0 -pinstride 1
> >> > plus
> >> > gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> >> > -gputasks 11 -pin on -pinoffset 28 -pinstride 1)
> >> > Do you think that there could be a relationship between SIMD options
> >> > setting and the crash of the system?
> >>
> >> Unlikely, but not impossible. I would however expect that a CPU-only
> >> GROMACS run would also lead to a crash. Can you try to do an AVX2_128
> >> CPU-only run (e.g. mdrun -ntmpi 64 -ntomp 1 -nb cpu) and let it run
> >> for a few hours?
> >>
> >> > Does anyone have any idea about the
> >> > reason why gromacs does not seem to automatically recognize any options
> >> for
> >> > my AMD threadripper? Can there be any solutions for this?
> >>
> >> That is certainly unexpected, perhaps there is an issue with your
> >> compiler toolchain. What compiler are 

Re: [gmx-users] Fwd: SIMD options

2019-09-12 Thread Szilárd Páll
On Thu, Sep 12, 2019 at 3:25 PM Stefano Guglielmo
 wrote:
>
>  Hi Szilard,
> thanks for your reply.
> The compiler is gcc 4.8.5.

That is ancient. From your cmake.out:

-- Detection for best SIMD instructions failed, using SIMD - None
-- SIMD instructions disabled

Update your compiler toolchain to a modern compiler (e.g. gcc 8),
you'll also get much better performance.
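If a newer gcc is installed alongside the system one, pointing cmake at it
explicitly should be enough (the gcc-8/g++-8 names are an assumption, use
whatever your distribution provides):

cmake .. -DCMAKE_C_COMPILER=gcc-8 -DCMAKE_CXX_COMPILER=g++-8 -DGMX_GPU=on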

Cheers,
--
Szilárd

> I put below the link where you can find the files coming from cmake and the
> output for "AUTO" SIMD instruction. As for cpu only, as you had suggested
> previously I tried a run (after compiling with AVX2_256) and it worked
> without any problems for about 5 hours. I will try with AVX2_128 as well.
>
> Stefano
>
> https://www.dropbox.com/sh/rh3gdpoxrbqzdfx/AACXoXZ8Zw1-ItD9lnrDr1TNa?dl=0
>
>
> Il giorno gio 12 set 2019 alle ore 14:28 Szilárd Páll <
> pall.szil...@gmail.com> ha scritto:
>
> > On Thu, Sep 12, 2019 at 8:58 AM Stefano Guglielmo
> >  wrote:
> > >
> > > I apologize for the mistake, there was a typo in the object that could be
> > > misleading, so I re-post with the correct object,
> > > sorry.
> > >
> > > -- Forwarded message -
> > > Da: Stefano Guglielmo 
> > > Date: mer 11 set 2019 alle ore 17:17
> > > Subject: SMD options
> > > To: 
> > >
> > >
> > > Hi all,
> > > following my previous post regarding anomalous crashing of the system on
> > > running gromacs on two gpus, I have some new elements to add.
> > > I tested the workstation with two tools for gpu and cpu I found on the
> > web
> > > (gpu_burn and stress); I ran the two of them at the same time for two
> > hours
> > > pushing both gpus (2 x 250 W) and cpu (250 W), and no error reports or
> > > overheating have resulted, so I would say that the hardware seems to be
> > > stable.
> > > Despite this, I found something that perhaps could be not normal during
> > > gromacs compilation: setting -DGMX_SIMD=AUTO, results in "SIMD
> > > instructions:  NONE"; in this condition I can run gromacs without any
> > > unexpected crash, even though a little less efficiently; setting
> > -DGMX_SIMD
> > > to AVX_256, AVX2_128 and AVX2_256 produces clean compilation and
> > > installation (with all the tests passed), but on running the sudden
> > turning
> > > off happens in the conditions I had described in the previous posts
> > > (gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> > > -gputasks 00 -pin on -pinoffset 0 -pinstride 1
> > > plus
> > > gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> > > -gputasks 11 -pin on -pinoffset 28 -pinstride 1)
> > > Do you think that there could be a relationship between SIMD options
> > > setting and the crash of the system?
> >
> > Unlikely, but not impossible. I would however expect that a CPU-only
> > GROMACS run would also lead to a crash. Can you try to do an AVX2_128
> > CPU-only run (e.g. mdrun -ntmpi 64 -ntomp 1 -nb cpu) and let it run
> > for a few hours?
> >
> > > Does anyone have any idea about the
> > > reason why gromacs does not seem to automatically recognize any options
> > for
> > > my AMD threadripper? Can there be any solutions for this?
> >
> > That is certainly unexpected, perhaps there is an issue with your
> > compiler toolchain. What compiler are you using? Can you please share
> > your cmake detection output and CMakeCache.txt?
> >
> > Cheers,
> > --
> > Szilárd
> >
> > > Thanks again
> > > Stefano
> > > PS: the workstation is running with centOS 7 and aThreadripper 2990WX
> > cpu.
> > >
> > > --
> > > Stefano GUGLIELMO PhD
> > > Assistant Professor of Medicinal Chemistry
> > > Department of Drug Science and Technology
> > > Via P. Giuria 9
> > > 10125 Turin, ITALY
> > > ph. +39 (0)11 6707178
> > >
> > >

Re: [gmx-users] Fwd: SIMD options

2019-09-12 Thread Szilárd Páll
On Thu, Sep 12, 2019 at 8:58 AM Stefano Guglielmo
 wrote:
>
> I apologize for the mistake, there was a typo in the object that could be
> misleading, so I re-post with the correct object,
> sorry.
>
> -- Forwarded message -
> Da: Stefano Guglielmo 
> Date: mer 11 set 2019 alle ore 17:17
> Subject: SMD options
> To: 
>
>
> Hi all,
> following my previous post regarding anomalous crashing of the system on
> running gromacs on two gpus, I have some new elements to add.
> I tested the workstation with two tools for gpu and cpu I found on the web
> (gpu_burn and stress); I ran the two of them at the same time for two hours
> pushing both gpus (2 x 250 W) and cpu (250 W), and no error reports or
> overheating have resulted, so I would say that the hardware seems to be
> stable.
> Despite this, I found something that perhaps could be not normal during
> gromacs compilation: setting -DGMX_SIMD=AUTO, results in "SIMD
> instructions:  NONE"; in this condition I can run gromacs without any
> unexpected crash, even though a little less efficiently; setting -DGMX_SIMD
> to AVX_256, AVX2_128 and AVX2_256 produces clean compilation and
> installation (with all the tests passed), but on running the sudden turning
> off happens in the conditions I had described in the previous posts
> (gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> -gputasks 00 -pin on -pinoffset 0 -pinstride 1
> plus
> gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> -gputasks 11 -pin on -pinoffset 28 -pinstride 1)
> Do you think that there could be a relationship between SIMD options
> setting and the crash of the system?

Unlikely, but not impossible. I would however expect that a CPU-only
GROMACS run would also lead to a crash. Can you try to do an AVX2_128
CPU-only run (e.g. mdrun -ntmpi 64 -ntomp 1 -nb cpu) and let it run
for a few hours?

> Does anyone have any idea about the
> reason why gromacs does not seem to automatically recognize any options for
> my AMD threadripper? Can there be any solutions for this?

That is certainly unexpected, perhaps there is an issue with your
compiler toolchain. What compiler are you using? Can you please share
your cmake detection output and CMakeCache.txt?

Cheers,
--
Szilárd

> Thanks again
> Stefano
> PS: the workstation is running with centOS 7 and aThreadripper 2990WX cpu.
>
> --
> Stefano GUGLIELMO PhD
> Assistant Professor of Medicinal Chemistry
> Department of Drug Science and Technology
> Via P. Giuria 9
> 10125 Turin, ITALY
> ph. +39 (0)11 6707178
>
>
>
>
> --
> Stefano GUGLIELMO PhD
> Assistant Professor of Medicinal Chemistry
> Department of Drug Science and Technology
> Via P. Giuria 9
> 10125 Turin, ITALY
> ph. +39 (0)11 6707178
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] How can Build gromacs using MSVC on Win64 with AVX2?

2019-09-12 Thread Szilárd Páll
On Thu, Sep 12, 2019 at 9:29 AM Tatsuro MATSUOKA  wrote:
>
> >Those are kernels for legacy code that never use such simd anywhere
> Do you mean that gmxSimdFlags.cmake is not used for SIMD detection?

gmxSimdFlags.cmake detects the _flags_ necessary for a SIMD build. It
is gmxDetectSimd.cmake / gmxDetectCpu.cmake that do the CPU detection.

All CPU/SIMD detection is orchestrated from gmxManageSimd.cmake where,
first the target SIMD is detected here:
https://redmine.gromacs.org/projects/gromacs/repository/revisions/master/entry/cmake/gmxManageSimd.cmake#L89
then the corresponding required compiler flags are detected.
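(And should the detection ever misbehave, the SIMD level can still be forced at
configure time, e.g. cmake .. -DGMX_SIMD=AVX2_256, as also discussed in other
threads on this list.)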

Let us know if you have further questions!

Cheers,
--
Szilárd

>
> Tatsuro
>
>
>
> - Original Message -
> >From: Mark Abraham 
> >To: Discussion list for GROMACS users ; Tatsuro 
> >MATSUOKA 
> >Date: 2019/9/12, Thu 14:39
> >Subject: Re: [gmx-users] How can Build gromacs using MSVC on Win64 with AVX2?
> >
> >
> >Hi,
> >
> >
> >Those are kernels for legacy code that never use such simd anywhere
> >
> >
> >Mark
> >
> >On Thu., 12 Sep. 2019, 07:16 Tatsuro MATSUOKA,  
> >wrote:
> >
> >On GROMACS 2019.3, GROMACS cannot be built with AVX2.
> >>
> >>In gmxSimdFlags.cmake
> >>
> >>SIMD_AVX2_C_FLAGS SIMD_AVX2_CXX_FLAGS
> >>"${TOOLCHAIN_FLAG_FOR_AVX2}" "-mavx2" "/arch:AVX" "-hgnu") # no 
> >> AVX2-specific flag for MSVC yet
> >>
> >>If I modify the above, "/arch:AVX" => "/arch:AVX2",
> >>
> >>-- Performing Test C_mavx2_mfma_FLAG_ACCEPTED
> >>-- Performing Test C_mavx2_mfma_FLAG_ACCEPTED - Failed
> >>-- Performing Test C_mavx2_FLAG_ACCEPTED
> >>-- Performing Test C_mavx2_FLAG_ACCEPTED - Failed
> >>-- Performing Test C_arch_AVX2_FLAG_ACCEPTED
> >>-- Performing Test C_arch_AVX2_FLAG_ACCEPTED - Success
> >>-- Performing Test C_arch_AVX2_COMPILE_WORKS
> >>-- Performing Test C_arch_AVX2_COMPILE_WORKS - Success
> >>-- Performing Test CXX_mavx2_mfma_FLAG_ACCEPTED
> >>-- Performing Test CXX_mavx2_mfma_FLAG_ACCEPTED - Failed
> >>-- Performing Test CXX_mavx2_FLAG_ACCEPTED
> >>-- Performing Test CXX_mavx2_FLAG_ACCEPTED - Failed
> >>-- Performing Test CXX_arch_AVX2_FLAG_ACCEPTED
> >>-- Performing Test CXX_arch_AVX2_FLAG_ACCEPTED - Success
> >>-- Performing Test CXX_arch_AVX2_COMPILE_WORKS
> >>-- Performing Test CXX_arch_AVX2_COMPILE_WORKS - Success
> >>-- Enabling 256-bit AVX2 SIMD instructions using CXX flags:  /arch:AVX2
> >>
> >>
> >>But avx components
> >>
> >> nb_kernel_ElecCSTab_VdwCSTab_GeomP1P1_avx_256_single.cpp
> >> nb_kernel_ElecCSTab_VdwCSTab_GeomW3P1_avx_256_single.cpp
> >> nb_kernel_ElecCSTab_VdwCSTab_GeomW3W3_avx_256_single.cpp
> >> nb_kernel_ElecCSTab_VdwCSTab_GeomW4P1_avx_256_single.cpp
> >> nb_kernel_ElecCSTab_VdwCSTab_GeomW4W4_avx_256_single.cpp
> >> nb_kernel_ElecCSTab_VdwLJ_GeomP1P1_avx_256_single.cpp
> >> nb_kernel_ElecCSTab_VdwLJ_GeomW3P1_avx_256_single.cpp
> >>are complied.
> >>
> >>How can I compile "avx2_256" files?
> >>
> >>Tatsuro
> >>
> >>--
> >>Gromacs Users mailing list
> >>
> >>* Please search the archive at 
> >>http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
> >>
> >>* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >>
> >>* For (un)subscribe requests visit
> >>https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send 
> >>a mail to gmx-users-requ...@gromacs.org.
> >
> >
>
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] gromacs binaries for windows (Cygwin 64)

2019-09-10 Thread Szilárd Páll
Dear Tatsuro,

Thanks for the contributions!

Do the builds work out cleanly on cygwin? Are there any additional
instructions we should consider including in our installation guide?

Cheers,
--
Szilárd

On Fri, Sep 6, 2019 at 5:46 AM Tatsuro MATSUOKA  wrote:
>
> I have prepared gromacs binaries for windows (Cygwin 64) on my own web site.
> (For testing purpose.)
>
> http://tmacchant3.starfree.jp/gromacs/win/
>
> Tatsuro
>
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Cannot run short-ranged nonbonded interactions on a GPU

2019-09-09 Thread Szilárd Páll
Hi,

What does the log file detection output contain? You might have linked
against a CUDA release not compatible with your drivers (e.g. too
recent).
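As a quick cross-check (assuming a standard CUDA install with the usual tools
on the PATH):

nvidia-smi

reports the driver version and the highest CUDA runtime version that driver
supports; compare that against the CUDA runtime version shown in the hardware
detection section near the top of your md.log.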

Cheers,
--
Szilárd

On Sun, Sep 8, 2019 at 5:17 PM Mahmood Naderan  wrote:
>
> Hi
> With the following config command
> cmake .. -DGMX_GPU=on -DCMAKE_INSTALL_PREFIX=`pwd`/../single 
> -DGMX_BUILD_OWN_FFTW=ON
> I get the following error for "gmx mdrun -nb gpu -v -deffnm inp_nvp"
> Fatal error:
> Cannot run short-ranged nonbonded interactions on a GPU because there is none
> detected.
>
>
>
> deviceQuery command shows that GPU is detected.
> $ ./deviceQuery
> ./deviceQuery Starting...
>
>  CUDA Device Query (Runtime API) version (CUDART static linking)
>
> Detected 1 CUDA Capable device(s)
>
> Device 0: "GeForce GTX 1080 Ti"
> ...
>
>
> With the same input, I haven't seen that error before. Did I miss something?
>
> Regards,
> Mahmood
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] simulation on 2 gpus

2019-09-06 Thread Szilárd Páll
On Fri, Sep 6, 2019 at 3:47 PM Stefano Guglielmo
 wrote:
>
> Hi Szilard,
>
> thanks for suggestions.
>
>
> As for the strange crash, the workstation works fine using only cpu; the
> problem seems to be related to gpu usage, when both cards are used for 200
> W over 250 (more or less) the workstation turns off. It is not about PSU
> (even in the "offending" case we are quite below the maximum power),

How far below? Note that PSU efficiency and quality also affect
stability at high load.

> and it
> is neither related to temperature (it happens even if gpu temp is around
> 55-60 °C). The vendor did some tests and accordingly the hardware seems to
> be ok. Do you (or anyone else in the list) have any particular test to
> suggest that can more specifically help to diagnose the problem?

I suggest the following for load testing:
https://github.com/ComputationalRadiationPhysics/cuda_memtest
and for memory stress testing:
https://github.com/ComputationalRadiationPhysics/cuda_memtest

Cheers,
--

Szilárd

>
> Any opinion is appreciated,
>
> thanks
>
> Il giorno mercoledì 21 agosto 2019, Szilárd Páll 
> ha scritto:
>
> > Hi Stefano,
> >
> >
> > On Tue, Aug 20, 2019 at 3:29 PM Stefano Guglielmo
> >  wrote:
> > >
> > > Dear Szilard,
> > >
> > > thanks for the very clear answer.
> > > Following your suggestion I tried to run without DD; for the same system
> > I
> > > run two simulations on two gpus:
> > >
> > > gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> > > -gputasks 00 -pin on -pinoffset 0 -pinstride 1
> > >
> > > gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> > > -gputasks 11 -pin on -pinoffset 28 -pinstride 1
> > >
> > > but again the system crashed; with this I mean that after few minutes the
> > > machine goes off (power off) without any error message, even without
> > using
> > > all the threads.
> >
> > That is not normal and I strongly recommend investigating it as it
> > could be a sign of an underlying system/hardware instability or fault
> > which could ultimately lead to incorrect simulation results.
> >
> > Are you sure that:
> > - your machine is stable and reliable at high loads; is the PSU sufficient?
> > - your hardware has been thoroughly stress-tested and it does not show
> > instabilities?
> >
> > Does the crash also happen with GROMACS running on the CPU only (using
> > all cores)?
> > I'd recommend running some stress-tests that fully load the machine
> > for a few hours to see if the error persists.
> >
> > > I then tried running the two simulations on the same gpu without DD:
> > >
> > > gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> > > -gputasks 00 -pin on -pinoffset 0 -pinstride 1
> > >
> > > gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> > > -gputasks 00 -pin on -pinoffset 28 -pinstride 1
> > >
> > > and I obtained better performance (about 70 ns/day) with a massive use of
> > > the gpu (around 90%), comparing to the two runs on two gpus I reported in
> > > the previous post
> > > (gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 4 -ntmpi 7 -npme 1
> > -gputasks
> > > 000 -pin on -pinoffset 0 -pinstride 1
> > >  gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 4 -ntmpi 7 -npme 1
> > > -gputasks 111 -pin on -pinoffset 28 -pinstride 1).
> >
> > That is expected; domain-decomposition on a single GPU is unnecessary
> > and introduces overheads that limit performance.
> >
> > > As for pinning, cpu topology according to log file is:
> > > hardware topology: Basic
> > > Sockets, cores, and logical processors:
> > >   Socket  0: [   0  32] [   1  33] [   2  34] [   3  35] [   4  36] [
> > > 5  37] [   6  38] [   7  39] [  16  48] [  17  49] [  18  50] [  19  51]
> > [
> > >  20  52] [  21  53] [  22  54] [  23  55] [   8  40] [   9  41] [  10
> > 42]
> > > [  11  43] [  12  44] [  13  45] [  14  46] [  15  47] [  24  56] [  25
> > >  57] [  26  58] [  27  59] [  28  60] [  29  61] [  30  62] [  31  63]
> > > If I understand well (absolutely not sure) it should not be that
> > convenient
> > > to pin to consecutive threads,
> >
> > On the contrary, pinning to consecutive threads is the recommended
> > behavior. More generally, application threads are expected to be
> > pinned to consecutive cores (as threading parallelization will benefit
> >

Re: [gmx-users] The problem of utilizing multiple GPU

2019-09-05 Thread Szilárd Páll
Hi,

You have 2x Xeon Gold 6150 which is 2x 18 = 36 cores; Intel CPUs
support 2 threads/core (HyperThreading), hence the 72.
https://ark.intel.com/content/www/us/en/ark/products/120490/intel-xeon-gold-6150-processor-24-75m-cache-2-70-ghz.html

You will not be able to scale efficiently over 8 GPUs in a single
simulation with the current code; while performance will likely
improve in the next release, due to PCI bus and PME scaling
limitations, even with GROMACS 2020 it is unlikely you will see much
benefit beyond 4 GPUs.

Try running on 3-4 GPUs with at least 2 ranks on each, and one
separate PME rank. You might also want to use every second GPU rather
than the first four to avoid overloading the PCI bus; e.g.
gmx mdrun -ntmpi 7 -npme 1 -nb gpu -pme gpu -bonded gpu -gpu_id 0,2,4,6
-gputasks 001122334

Cheers,
--
Szilárd

On Thu, Sep 5, 2019 at 1:12 AM 孙业平  wrote:
>
> Hello Mark Abraham,
>
> Thank you very much for your reply. I will definitely check the webinar and 
> gromacs document. But now I am confused and expect an direct solution. The 
> workstation should have 18 cores each with 4 hyperthreads. The output of 
> "lscpu" reads:
> Architecture:  x86_64
> CPU op-mode(s):32-bit, 64-bit
> Byte Order:Little Endian
> CPU(s):72
> On-line CPU(s) list:   0-71
> Thread(s) per core:2
> Core(s) per socket:18
> Socket(s): 2
> NUMA node(s):  2
> Vendor ID: GenuineIntel
> CPU family:6
> Model: 85
> Model name:Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz
> Stepping:  4
> CPU MHz:   2701.000
> CPU max MHz:   2701.
> CPU min MHz:   1200.
> BogoMIPS:  5400.00
> Virtualization:VT-x
> L1d cache: 32K
> L1i cache: 32K
> L2 cache:  1024K
> L3 cache:  25344K
> NUMA node0 CPU(s): 0-17,36-53
> NUMA node1 CPU(s): 18-35,54-71
>
> Now I don't want to do multiple simulations and just want to run a single 
> simulation. When assigning the simulation to only one GPU (gmx mdrun -v 
> -gpu_id 0 -deffnm md), the simulation performance is 90 ns/day. However, when 
> I don't assign the GPU but let all GPU work by:
>gmx mdrun -v -deffnm md
> The simulation performance is only 2 ns/day.
>
> So what is correct command to make a full use of all GPUs and achieve the 
> best performance (which I expect should be much higher than 90 ns/day with 
> only one GPU)? Could you give me further suggestions and help?
>
> Best regards,
> Yeping
>
> --
> From:Mark Abraham 
> Sent At:2019 Sep. 4 (Wed.) 19:10
> To:gromacs ; 孙业平 
> Cc:gromacs.org_gmx-users 
> Subject:Re: [gmx-users] The problem of utilizing multiple GPU
>
> Hi,
>
>
> On Wed, 4 Sep 2019 at 12:54, sunyeping  wrote:
> Dear everyone,
>
>  I am trying to do simulation with a workstation with 72 core and 8 geforce 
> 1080 GPUs.
>
> 72 cores, or just 36 cores each with two hyperthreads? (it matters because 
> you might not want to share cores between simulations, which is what you'd 
> get if you just assigned 9 hyperthreads per GPU and 1 GPU per simulation).
>
>  When I do not assign a certain GPU with the command:
>gmx mdrun -v -deffnm md
>  all GPUs are used and but the utilization of each GPU is extremely low (only 
> 1-2 %), and the simulation will be finished after several months.
>
> Yep. Too many workers for not enough work means everyone spends time more 
> time coordinating than working. This is likely to improve in GROMACS 2020 
> (beta out shortly).
>
>  In contrast, when I assign the simulation task to only one GPU:
>  gmx mdrun -v -gpu_id 0 -deffnm md
>  the GPU utilization can reach 60-70%, and the simulation can be finished 
> within a week. Even when I use only two GPU:
>
> Utilization is only a proxy - what you actually want to measure is the rate 
> of simulation ie. ns/day.
>
>   gmx mdrun -v -gpu_id 0,2 -deffnm md
>
>  the GPU utilizations are very low and the simulation is very slow.
>
> That could be for a variety of reasons, which you could diagnose by looking 
> at the performance report at the end of the log file, and comparing different 
> runs.
>  I think I may missuse the GPU for gromacs simulation. Could you tell me what 
> is the correct way to use multiple GPUs?
>
> If you're happy running multiple simulations, then the easiest thing to do is 
> to use the existing multi-simulation support to do
>
> mpirun -np 8 gmx_mpi -multidir dir0 dir1 dir2 ... dir7
>
> and let mdrun handle the details. Otherwise you have to get involved in 
> assigning a subset of the CPU cores and GPUs to each job that both runs fast 
> and does not conflict. See the documentation for GROMACS for the version 
> you're running e.g. 
> http://manual.gromacs.org/documentation/current/user-guide/mdrun-performance.html#running-mdrun-within-a-single-node.
>
> You probably want to 

Re: [gmx-users] simulation on 2 gpus

2019-08-21 Thread Szilárd Páll
Hi Stefano,


On Tue, Aug 20, 2019 at 3:29 PM Stefano Guglielmo
 wrote:
>
> Dear Szilard,
>
> thanks for the very clear answer.
> Following your suggestion I tried to run without DD; for the same system I
> run two simulations on two gpus:
>
> gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> -gputasks 00 -pin on -pinoffset 0 -pinstride 1
>
> gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> -gputasks 11 -pin on -pinoffset 28 -pinstride 1
>
> but again the system crashed; with this I mean that after few minutes the
> machine goes off (power off) without any error message, even without using
> all the threads.

That is not normal and I strongly recommend investigating it as it
could be a sign of an underlying system/hardware instability or fault
which could ultimately lead to incorrect simulation results.

Are you sure that:
- your machine is stable and reliable at high loads; is the PSU sufficient?
- your hardware has been thoroughly stress-tested and it does not show
instabilities?

Does the crash also happen with GROMACS running on the CPU only (using
all cores)?
I'd recommend running some stress-tests that fully load the machine
for a few hours to see if the error persists.

> I then tried running the two simulations on the same gpu without DD:
>
> gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> -gputasks 00 -pin on -pinoffset 0 -pinstride 1
>
> gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
> -gputasks 00 -pin on -pinoffset 28 -pinstride 1
>
> and I obtained better performance (about 70 ns/day) with a massive use of
> the gpu (around 90%), comparing to the two runs on two gpus I reported in
> the previous post
> (gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 4 -ntmpi 7 -npme 1 -gputasks
> 000 -pin on -pinoffset 0 -pinstride 1
>  gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 4 -ntmpi 7 -npme 1
> -gputasks 111 -pin on -pinoffset 28 -pinstride 1).

That is expected; domain-decomposition on a single GPU is unnecessary
and introduces overheads that limit performance.

> As for pinning, cpu topology according to log file is:
> hardware topology: Basic
> Sockets, cores, and logical processors:
>   Socket  0: [   0  32] [   1  33] [   2  34] [   3  35] [   4  36] [
> 5  37] [   6  38] [   7  39] [  16  48] [  17  49] [  18  50] [  19  51] [
>  20  52] [  21  53] [  22  54] [  23  55] [   8  40] [   9  41] [  10  42]
> [  11  43] [  12  44] [  13  45] [  14  46] [  15  47] [  24  56] [  25
>  57] [  26  58] [  27  59] [  28  60] [  29  61] [  30  62] [  31  63]
> If I understand well (absolutely not sure) it should not be that convenient
> to pin to consecutive threads,

On the contrary, pinning to consecutive threads is the recommended
behavior. More generally, application threads are expected to be
pinned to consecutive cores (as threading parallelization will benefit
from the resulting cache access patterns); now, CPU cores can have
multiple hardware threads and depending on whether using one or
multiple makes sense (performance-wise), will determine whether a
stride of 1 or 2 is best. Typically, when most work is offloaded to a
GPU and many CPU cores are available 1 thread/core is best.

Note that the above topology mapping simply means that the indexed
entities that the operating system calls "CPU" grouped in "[]"
correspond to hardware threads of the same core, i.e. core 0 is [0
32], core 1 [1 33], etc. Pinning with a stride happens into this map:
- with a -pinstride 1 thread mapping will be (app thread->hardware
thread): 0->0, 1->32, 2->1, 3->33,...
- with a -pinstride 2 thread mapping will be (-||-): 0->0, 1->1, 2->2, 3->3, ...

> and indeed I found a subtle degradation of
> performance for a single simulation, switching from:
> gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0 -gputasks
> 00 -pin on
> to
> gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0 -gputasks
> 00 -pin on -pinoffset 0 -pinstride 1.

If you compare the log files of the two, you should notice that the
former used a pin stride of 2, resulting in the use of 28 cores, while the
latter used only 14 cores; the likely reason for only a small
difference is that there is not enough CPU work to scale to 28 cores
and additionally, these specific TR CPUs are tricky to scale across
using wide multi-threaded parallelization.

Cheers,
--
Szilárd


>
> Thanks again
> Stefano
>
>
>
>
> Il giorno ven 16 ago 2019 alle ore 17:48 Szilárd Páll <
> pall.szil...@gmail.com> ha scritto:
>
> > On Mon, Aug 5, 2019 at 5:00 PM Stefano Guglielmo
> >  wrote:
> > >
> > > Dear Paul,
> > > thanks for suggestions. Following them I managed to run 91 ns/day for the
> > > syste

Re: [gmx-users] gpu usage

2019-08-21 Thread Szilárd Páll
Hi Paul,

Please post log files, otherwise we can only guess what is limiting
the GPU utilization. Otherwise, you should be seeing considerably
higher utilization in single-GPU no-decomposition runs.

Cheers,
--
Szilárd

On Tue, Aug 20, 2019 at 7:01 PM p buscemi  wrote:
>
>
> Dear Users,
> I am getting reasonable performance from two rtx -2080ti's - AMD 32 core and 
> on another node two gtx-1080 ti's -AMD 16 core i.e 20-30 ns/day with 30 
> atoms. But in all my runs the % usage of the gpu's is typically 40% to 60%.
> Given that it is specialized software, I notice that Schrodinger will run a 
> single gpu at 98%. So the cards are apparently not defective.
> The cpu runs at 2.9 Ghz. and the power supply a 1500 watts
> A typical run command might be" gmx mdrun -deffnm sys.npt -nb gpu -pme gpu 
> -ntmpi 8 -ntomp 8 -npme 1
> I have gone over 
> http://manual.gromacs.org/documentation/current/user-guide/mdrun-performance.html
> and tried to incorporate what I could.
>
> The installation was basically that given in the manual for build 2019.1:
> cmake .. -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_GPU=on 
> -DCMAKE_CXX_COMPILER=/usr/bin/g++-6 -DCMAKE_C_COMPILER=/usr/bin/gcc-6
> Both 2019.1 and 2019.3 run well but with the same "reduced" % workload.
> I am curious to learn why the gpu's are not pushed a littler harder. Or is 
> this a typical result ? Or are there improvments to make in my setup.
> Paul
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] AVX_512 and GROMACS

2019-08-20 Thread Szilárd Páll
On Mon, Aug 19, 2019 at 12:00 PM tarzan p  wrote:
>
> Hi all.I have a dual socket Xeon GOLD 6148 which has the capabilities for
>
>  Instruction Set Extensions   Intel® SSE4.2, Intel® AVX, Intel® AVX2, Intel® 
> AVX-512
> but hen why si gromacs giving the error for AVX_512 but takes AVX2_256???
> Adding the out put
>
>
> /Downloads/gromacs-2019/build-gromacs$ cmake .. -DGMX_SIMD=AVX_512-- Enabling 
> RDTSCP support
> -- Performing Test C_xCORE_AVX512_qopt_zmm_usage_high_FLAG_ACCEPTED
> -- Performing Test C_xCORE_AVX512_qopt_zmm_usage_high_FLAG_ACCEPTED - Failed
> -- Performing Test C_xCORE_AVX512_FLAG_ACCEPTED
> -- Performing Test C_xCORE_AVX512_FLAG_ACCEPTED - Failed
> -- Performing Test C_mavx512f_mfma_FLAG_ACCEPTED
> -- Performing Test C_mavx512f_mfma_FLAG_ACCEPTED - Failed
> -- Performing Test C_mavx512f_FLAG_ACCEPTED
> -- Performing Test C_mavx512f_FLAG_ACCEPTED - Failed
> -- Performing Test C_arch_AVX_FLAG_ACCEPTED
> -- Performing Test C_arch_AVX_FLAG_ACCEPTED - Failed
> -- Performing Test C_hgnu_FLAG_ACCEPTED
> -- Performing Test C_hgnu_FLAG_ACCEPTED - Failed
> -- Performing Test C_COMPILE_WORKS_WITHOUT_SPECIAL_FLAGS
> -- Performing Test C_COMPILE_WORKS_WITHOUT_SPECIAL_FLAGS - Failed
> -- Could not find any flag to build test source (this could be due to either 
> the compiler or binutils)
> -- Performing Test CXX_xCORE_AVX512_qopt_zmm_usage_high_FLAG_ACCEPTED
> -- Performing Test CXX_xCORE_AVX512_qopt_zmm_usage_high_FLAG_ACCEPTED - Failed
> -- Performing Test CXX_xCORE_AVX512_FLAG_ACCEPTED
> -- Performing Test CXX_xCORE_AVX512_FLAG_ACCEPTED - Failed
> -- Performing Test CXX_mavx512f_mfma_FLAG_ACCEPTED
> -- Performing Test CXX_mavx512f_mfma_FLAG_ACCEPTED - Failed
> -- Performing Test CXX_mavx512f_FLAG_ACCEPTED
> -- Performing Test CXX_mavx512f_FLAG_ACCEPTED - Failed
> -- Performing Test CXX_arch_AVX_FLAG_ACCEPTED
> -- Performing Test CXX_arch_AVX_FLAG_ACCEPTED - Failed
> -- Performing Test CXX_hgnu_FLAG_ACCEPTED
> -- Performing Test CXX_hgnu_FLAG_ACCEPTED - Failed
> -- Performing Test CXX_COMPILE_WORKS_WITHOUT_SPECIAL_FLAGS
> -- Performing Test CXX_COMPILE_WORKS_WITHOUT_SPECIAL_FLAGS - Failed
> -- Could not find any flag to build test source (this could be due to either 
> the compiler or binutils)
> CMake Error at cmake/gmxManageSimd.cmake:51 (message):
>   Cannot find AVX 512F compiler flag.  Use a newer compiler, or choose a
>   lower level of SIMD (slower).

The reason is right here. Get newer compiler, I assume you have a
dated distribution default compiler.

--
Szilárd

> Call Stack (most recent call first):
>   cmake/gmxManageSimd.cmake:186 
> (gmx_give_fatal_error_when_simd_support_not_found)
>   CMakeLists.txt:719 (gmx_manage_simd)
>
>
> -- Configuring incomplete, errors occurred!
> See also 
> "/home/gdd/Downloads/gromacs-2019/build-gromacs/CMakeFiles/CMakeOutput.log".
> See also 
> "/home/gdd/Downloads/gromacs-2019/build-gromacs/CMakeFiles/CMakeError.log".
>
>
> ~/Downloads/gromacs-2019/build-gromacs$ cmake .. -DGMX_SIMD=AVX2_256-- 
> Performing Test C_mavx2_mfma_FLAG_ACCEPTED
> -- Performing Test C_mavx2_mfma_FLAG_ACCEPTED - Success
> -- Performing Test C_mavx2_mfma_COMPILE_WORKS
> -- Performing Test C_mavx2_mfma_COMPILE_WORKS - Success
> -- Performing Test CXX_mavx2_mfma_FLAG_ACCEPTED
> -- Performing Test CXX_mavx2_mfma_FLAG_ACCEPTED - Success
> -- Performing Test CXX_mavx2_mfma_COMPILE_WORKS
> -- Performing Test CXX_mavx2_mfma_COMPILE_WORKS - Success
> -- Enabling 256-bit AVX2 SIMD instructions using CXX flags:  -mavx2 -mfma
> -- Performing Test _Wno_unused_command_line_argument_FLAG_ACCEPTED
> -- Performing Test _Wno_unused_command_line_argument_FLAG_ACCEPTED - Success
> -- The GROMACS-managed build of FFTW 3 will configure with the following 
> optimizations: --enable-sse2;--enable-avx;--enable-avx2
> -- Configuring done
> -- Generating done
> -- Build files have been written to: 
> /home/gdd/Downloads/gromacs-2019/build-gromacs
>
>
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] simulation on 2 gpus

2019-08-16 Thread Szilárd Páll
On Mon, Aug 5, 2019 at 5:00 PM Stefano Guglielmo
 wrote:
>
> Dear Paul,
> thanks for suggestions. Following them I managed to run 91 ns/day for the
> system I referred to in my previous post with the configuration:
> gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 4 -ntmpi 7 -npme 1 -gputasks
> 111 -pin on (still 28 threads seems to be the best choice)
>
> and 56 ns/day for two independent runs:
> gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 4 -ntmpi 7 -npme 1 -gputasks
> 000 -pin on -pinoffset 0 -pinstride 1
> gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 4 -ntmpi 7 -npme 1 -gputasks
> 111 -pin on -pinoffset 28 -pinstride 1
> which is a fairly good result.

Use no DD in single-GPU runs, i.e. for the latter, just simply
gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 28 -ntmpi 1 -npme 0
-gputasks 00 -pin on -pinoffset 0 -pinstride 1

You can also have mdrun's multidir functionality manage an ensemble of
jobs (related or not) so you don't have to manually start, calculate
pinning, etc.
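A minimal sketch, assuming an MPI-enabled build (gmx_mpi) and two prepared run
directories named run1 and run2:

mpirun -np 2 gmx_mpi mdrun -multidir run1 run2 -nb gpu -pme gpu -pin on

mdrun then divides the cores and GPUs of the node between the two simulations
and handles the pinning offsets itself.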


> I am still wondering if somehow I should pin the threads in some different
> way in order to reflect the cpu topology and if this can influence
> performance (if I remember well NAMD allows the user to indicate explicitly
> the cpu core/threads to use in a computation).

Your pinning does reflect the CPU topology -- the 4x7=28 threads are
pinned to consecutive hardware threads (because of -pinstride 1, i.e.
don't skip the second hardware thread of the core). The mapping of
software to hardware threads happens based on the topology-based
hardware thread indexing, see the hardware detection report in the log
file.
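
For example, the relevant section of the log can be located with something
like this (the log file name is a placeholder):

grep -A 3 "Hardware topology" run.log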

> When I tried to run two simulations with the following configuration:
> gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 4 -ntmpi 8 -npme 1 -gputasks
>  -pin on -pinoffset 0 -pinstride 1
> gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 4 -ntmpi 8 -npme 1 -gputasks
>  -pin on -pinoffset 0 -pinstride 32
> the system crashed down. Probably this is normal and I am missing something
> quite obvious.

Not really. What do you mean by "crashed down"? The machine should not
crash, nor should the simulation. Even though your machine has 32
cores / 64 threads, using all of them may not always be beneficial, as
using more threads when there is too little work to scale over adds
overhead. Have you tried using all cores but only 1 thread / core
(i.e. 32 threads in total with -pinstride 2)?
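
For illustration, a single-run layout along those lines could look roughly
like this (the rank/thread split and GPU task mapping are only examples, not
a tuned setting for your exact system):

gmx mdrun -deffnm run -nb gpu -pme gpu -ntmpi 8 -ntomp 4 -npme 1 -gputasks 00000000 -pin on -pinoffset 0 -pinstride 2

i.e. 8x4 = 32 threads, one per core, with all GPU tasks on GPU 0.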

Cheers,
--
Szilárd

>
> Thanks again for the valuable advices
> Stefano
>
>
>
> Il giorno dom 4 ago 2019 alle ore 01:40 paul buscemi  ha
> scritto:
>
> > Stefano,
> >
> > A recent run with 14 atoms, including 1 isopropanol  molecules on
> > top of  an end restrained PDMS surface of  74000 atoms  in a 20 20 30 nm
> > box ran at 67 ns/d nvt with the mdrun conditions I posted. It took 120 ns
> > for 100 molecules of an adsorbate  to go from solution to the surface.   I
> > don't think this will set the world ablaze with any benchmarks but it is
> > acceptable to get some work done.
> >
> > Linux Mint Mate 18, AMD Threadripper 32 core 2990wx 4.2Ghz, 32GB DDR4, 2x
> > RTX 2080TI gmx2019 in the simplest gmx configuration for gpus,  CUDA
> > version 10, Nvidia 410.7p loaded  from the repository
> >
> > Paul
> >
> > > On Aug 3, 2019, at 12:58 PM, paul buscemi  wrote:
> > >
> > > Stefano,
> > >
> > > Here is a typical run
> > >
> > > for minimization: mdrun -deffnm grofile. -nb gpu
> > >
> > > and for other runs for a 32 core
> > >
> > > gmx -deffnm grofile.nvt  -nb gpu -pme gpu -ntomp  8  -ntmpi 8  -npme 1
> > -gputasks   -pin on
> > >
> > > Depending on the molecular system/model   -ntomp -4 -ntmpi 16  may be
> > faster   - of course adjusting -gputasks
> > >
> > > Rarely do I find that not using ntomp and ntpmi is faster, but it is
> > never bad
> > >
> > > Let me know how it goes.
> > >
> > > Paul
> > >
> > >> On Aug 3, 2019, at 4:41 AM, Stefano Guglielmo <
> > stefano.guglie...@unito.it> wrote:
> > >>
> > >> Hi Paul,
> > >> thanks for the reply. Would you mind posting the command you used or
> > >> telling how did you balance the work between cpu and gpu?
> > >>
> > >> What about pinning? Does anyone know how to deal with a cpu topology
> > like
> > >> the one reported in my previous post and if it is relevant for
> > performance?
> > >> Thanks
> > >> Stefano
> > >>
> > >> Il giorno sabato 3 agosto 2019, Paul Buscemi  ha
> > scritto:
> > >>
> > >>> I run the same system and setup but no nvlink. Maestro runs both gpus
> > at
> > >>> 100 percent. Gromacs typically 50 --60 percent can do 600ns/d on 2
> > >>> atoms
> > >>>
> > >>> PB
> > >>>
> >  On Jul 25, 2019, at 9:30 PM, Kevin Boyd  wrote:
> > 
> >  Hi,
> > 
> >  I've done a lot of research/experimentation on this, so I can maybe
> > get
> > >>> you
> >  started - if anyone has any questions about the essay to follow, feel
> > >>> free
> >  to email me personally, and I'll link it to the email thread if it
> 

Re: [gmx-users] best performance on GPU

2019-08-12 Thread Szilárd Páll
Hi,

You can get significantly better performance if you use a more recent
GROMACS version (>=2018) to pick up the improvements to GPU
acceleration (see
https://onlinelibrary.wiley.com/doi/pdf/10.1002/jcc.26011 Fig 7, top
group of bars), but 300 ns/day on a single machine is unlikely with
your system and settings.

Cheers,
--
Szilárd


On Fri, Aug 2, 2019 at 12:05 AM Maryam  wrote:
>
> Dear all
> I want to run a simulation of approximately 12000 atoms system in gromacs
> 2016.6 on GPU with the following machine structure:
> Precision: single Memory model: 64 bit MPI library: thread_mpi OpenMP
> support: enabled (GMX_OPENMP_MAX_THREADS = 32) GPU support: CUDA SIMD
> instructions: AVX2_256 FFT library:
> fftw-3.3.5-fma-sse2-avx-avx2-avx2_128-avx512 RDTSCP usage: enabled TNG
> support: enabled Hwloc support: disabled Tracing support: disabled Built
> on: Fri Jun 21 09:58:11 EDT 2019 Built by: julian@BioServer [CMAKE] Build
> OS/arch: Linux 4.15.0-52-generic x86_64 Build CPU vendor: AMD Build CPU
> brand: AMD Ryzen 7 1800X Eight-Core Processor Build CPU family: 23 Model: 1
> Stepping: 1
> Number of GPUs detected: 1 #0: NVIDIA GeForce RTX 2080 Ti, compute cap.:
> 7.5, ECC: no, stat: compatible
> i used different commands to get the best performance and i dont know which
> point I am missing. The best performance I get is with this command: gmx
> mdrun -s md.tpr -nb gpu -deffnm MD -tunepme -v
> which is 10 ns/day! and it takes 2 months to end.
>  though i used several commands to tune it like: gmx mdrun -ntomp 6 -pin on
> -resethway -nstlist 20 -s md.tpr -deffnm md -cpi md.cpt -tunepme -cpt 15
> -append -gpu_id 0 -nb auto.  In the gromacs website it is mentioned that
> with this properties I should be able to run it in  295 ns/day!
> could you help me find out what point i am missing that i can not reach the
> best performance level?
> Thank you
> --
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Is it possible to control GPU utilizations when running two simulations in one workstation?

2019-08-12 Thread Szilárd Páll
Hi,

I recommend that you use fewer MPI ranks and offload PME too manually
(e.g. 4 ranks 3 PP one PME)  -- see the manual and recent
conversations on the list related to this topic.
Depending on your system size consider launching two runs side-by-side.
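
A rough sketch of such a launch (the rank count and GPU task mapping below
are only an example, not a tuned setting):

gmx mdrun -v -deffnm md -ntmpi 4 -npme 1 -nb gpu -pme gpu -gputasks 0011 -pin on

i.e. 3 PP ranks + 1 PME rank, with the first two PP tasks on GPU 0 and the
third PP task plus PME on GPU 1.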

Cheers
--
Szilárd

On Sat, Aug 3, 2019 at 11:11 AM sunyeping  wrote:
>
> Dear all,
>
> I am trying to run a two MD simulations on one workstation equipped with 4 
> GPU. First I started a simulation with the following command:
>
> gmx mdrun -v -deffnm md -ntmpi 12 -gpu_id 0,1
>
> By the nvidia-smi command I find the utilizations of GPU 0 and 1 are 74 and 
> 80%, respectively. Then I started another simulation with:
>
>  gmx mdrun -v -deffnm md -ntmpi 12 -gpu_id 2,3
>
> then the utilizations of GPU 0 and 1 decreased to 20% and 23%, and the 
> utilizations of GPU 2 and 3, which ran the second simulation, are 12 and 15%. 
> Both of the two simulations ran with unbearable low speed.
>
> I feel it very stange because a few days ago I also ran two simulations on 
> the same workstation with the same mdrun commands, but the utilizations of 
> all four GPU were higer than 70%. Do you know what may affect the GPU 
> utilizations and how to correct it?
>
> Best regards.
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] performance issues running gromacs with more than 1 gpu card in slurm

2019-07-30 Thread Szilárd Páll
On Tue, Jul 30, 2019 at 3:29 PM Carlos Navarro
 wrote:
>
> Hi all,
> First of all, thanks for all your valuable inputs!!.
> I tried Szilárd suggestion (multi simulations) with the following commands
> (using a single node):
>
> EXE="mpirun -np 4 gmx_mpi mdrun "
>
> cd $WORKDIR0
> #$DO_PARALLEL
> $EXE -s 4q.tpr -deffnm 4q -dlb yes -resethway -multidir 1 2 3 4
> And I noticed that the performance went from 37,32,23,22 ns/day to ~42
> ns/day in all four simulations. I check that the 80 processors were been
> used a 100% of the time, while the gpu was used about a 50% (from a 70%
> when running a single simulation in the node where I obtain a performance
> of ~50 ns/day).

Great!

Note that optimizing hardware utilization doesn't always maximize performance.

Also, manual launches with pinoffset/pinstride will give exactly the
same performance as the multi runs *if* you get the affinities right.
In your original commands you tried to use 20 of the 80 hardware
threads per run, but you offset the runs only by 10 (hardware threads),
which means that the runs were overlapping and interfering with each
other as well as under-utilizing the hardware.
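
For reference, a manual 4-run layout with non-overlapping affinities on a
40-core/80-thread node would look roughly like this (file names and GPU ids
are placeholders):

gmx mdrun -s eq6.tpr -deffnm run1 -ntmpi 1 -ntomp 20 -pin on -pinstride 1 -pinoffset 0 -gpu_id 0 &
gmx mdrun -s eq6.tpr -deffnm run2 -ntmpi 1 -ntomp 20 -pin on -pinstride 1 -pinoffset 20 -gpu_id 1 &
gmx mdrun -s eq6.tpr -deffnm run3 -ntmpi 1 -ntomp 20 -pin on -pinstride 1 -pinoffset 40 -gpu_id 2 &
gmx mdrun -s eq6.tpr -deffnm run4 -ntmpi 1 -ntomp 20 -pin on -pinstride 1 -pinoffset 60 -gpu_id 3 &
wait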

> So overall I'm quite happy with the performance I'm getting now; and
> honestly, I don't know if at some point I can get the same performance
> (running 4 jobs) that I'm getting running just one.

No, but you _may_ get a bit more aggregate performance if you run 8
concurrent jobs. Also, you can try 1 thread per core ("mpirun -np 4
gmx mdrun_mpi -multi 4 -ntomp 10 -pin on" to use only half of the
threads).

Cheers,
--
Szilárd

> Best regards,
> Carlos
>
> ——
> Carlos Navarro Retamal
> Bioinformatic Engineering. PhD.
> Postdoctoral Researcher in Center of Bioinformatics and Molecular
> Simulations
> Universidad de Talca
> Av. Lircay S/N, Talca, Chile
> E: carlos.navarr...@gmail.com or cnava...@utalca.cl
>
> On July 29, 2019 at 6:11:31 PM, Mark Abraham (mark.j.abra...@gmail.com)
> wrote:
>
> Hi,
>
> Yes and the -nmpi I copied from Carlos's post is ineffective - use -ntmpi
>
> Mark
>
>
> On Mon., 29 Jul. 2019, 15:15 Justin Lemkul,  wrote:
>
> >
> >
> > On 7/29/19 8:46 AM, Carlos Navarro wrote:
> > > Hi Mark,
> > > I tried that before, but unfortunately in that case (removing
> —gres=gpu:1
> > > and including in each line the -gpu_id flag) for some reason the jobs
> are
> > > run one at a time (one after the other), so I can’t use properly the
> > whole
> > > node.
> > >
> >
> > You need to run all but the last mdrun process in the background (&).
> >
> > -Justin
> >
> > > ——
> > > Carlos Navarro Retamal
> > > Bioinformatic Engineering. PhD.
> > > Postdoctoral Researcher in Center of Bioinformatics and Molecular
> > > Simulations
> > > Universidad de Talca
> > > Av. Lircay S/N, Talca, Chile
> > > E: carlos.navarr...@gmail.com or cnava...@utalca.cl
> > >
> > > On July 29, 2019 at 11:48:21 AM, Mark Abraham (mark.j.abra...@gmail.com)
> > > wrote:
> > >
> > > Hi,
> > >
> > > When you use
> > >
> > > DO_PARALLEL=" srun --exclusive -n 1 --gres=gpu:1 "
> > >
> > > then the environment seems to make sure only one GPU is visible. (The
> log
> > > files report only finding one GPU.) But it's probably the same GPU in
> > each
> > > case, with three remaining idle. I would suggest not using --gres unless
> > > you can specify *which* of the four available GPUs each run can use.
> > >
> > > Otherwise, don't use --gres and use the facilities built into GROMACS,
> > e.g.
> > >
> > > $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset 0
> > > -ntomp 20 -gpu_id 0
> > > $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset
> 10
> > > -ntomp 20 -gpu_id 1
> > > $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset
> 20
> > > -ntomp 20 -gpu_id 2
> > > etc.
> > >
> > > Mark
> > >
> > > On Mon, 29 Jul 2019 at 11:34, Carlos Navarro  > >
> > > wrote:
> > >
> > >> Hi Szilárd,
> > >> To answer your questions:
> > >> **are you trying to run multiple simulations concurrently on the same
> > >> node or are you trying to strong-scale?
> > >> I'm trying to run multiple simulations on the same node at the same
> > time.
> > >>
> > >> ** what are you simulating?
> > >> Regular and CompEl simulations
> > >>

Re: [gmx-users] performance issues running gromacs with more than 1 gpu card in slurm

2019-07-29 Thread Szilárd Páll
Carlos,

You can accomplish the same using the multi-simulation feature of
mdrun and avoid having to manually manage the placement of runs, e.g.
instead of the above you just write
gmx mdrun_mpi -np N -multidir $WORKDIR1 $WORKDIR2 $WORKDIR3 ...
For more details see
http://manual.gromacs.org/documentation/current/user-guide/mdrun-features.html#running-multi-simulations
Note that if the different runs have different speed, just as with
your manual launch, your machine can end up partially utilized when
some of the runs finish.

Cheers,
--
Szilárd

On Mon, Jul 29, 2019 at 2:46 PM Carlos Navarro
 wrote:
>
> Hi Mark,
> I tried that before, but unfortunately in that case (removing —gres=gpu:1
> and including in each line the -gpu_id flag) for some reason the jobs are
> run one at a time (one after the other), so I can’t use properly the whole
> node.
>
>
> ——
> Carlos Navarro Retamal
> Bioinformatic Engineering. PhD.
> Postdoctoral Researcher in Center of Bioinformatics and Molecular
> Simulations
> Universidad de Talca
> Av. Lircay S/N, Talca, Chile
> E: carlos.navarr...@gmail.com or cnava...@utalca.cl
>
> On July 29, 2019 at 11:48:21 AM, Mark Abraham (mark.j.abra...@gmail.com)
> wrote:
>
> Hi,
>
> When you use
>
> DO_PARALLEL=" srun --exclusive -n 1 --gres=gpu:1 "
>
> then the environment seems to make sure only one GPU is visible. (The log
> files report only finding one GPU.) But it's probably the same GPU in each
> case, with three remaining idle. I would suggest not using --gres unless
> you can specify *which* of the four available GPUs each run can use.
>
> Otherwise, don't use --gres and use the facilities built into GROMACS, e.g.
>
> $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset 0
> -ntomp 20 -gpu_id 0
> $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset 10
> -ntomp 20 -gpu_id 1
> $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset 20
> -ntomp 20 -gpu_id 2
> etc.
>
> Mark
>
> On Mon, 29 Jul 2019 at 11:34, Carlos Navarro 
> wrote:
>
> > Hi Szilárd,
> > To answer your questions:
> > **are you trying to run multiple simulations concurrently on the same
> > node or are you trying to strong-scale?
> > I'm trying to run multiple simulations on the same node at the same time.
> >
> > ** what are you simulating?
> > Regular and CompEl simulations
> >
> > ** can you provide log files of the runs?
> > In the following link are some logs files:
> > https://www.dropbox.com/s/7q249vbqqwf5r03/Archive.zip?dl=0.
> > In short, alone.log -> single run in the node (using 1 gpu).
> > multi1/2/3/4.log ->4 independent simulations ran at the same time in a
> > single node. In all cases, 20 cpus are used.
> > Best regards,
> > Carlos
> >
> > El jue., 25 jul. 2019 a las 10:59, Szilárd Páll ()
> > escribió:
> >
> > > Hi,
> > >
> > > It is not clear to me how are you trying to set up your runs, so
> > > please provide some details:
> > > - are you trying to run multiple simulations concurrently on the same
> > > node or are you trying to strong-scale?
> > > - what are you simulating?
> > > - can you provide log files of the runs?
> > >
> > > Cheers,
> > >
> > > --
> > > Szilárd
> > >
> > > On Tue, Jul 23, 2019 at 1:34 AM Carlos Navarro
> > >  wrote:
> > > >
> > > > No one can give me an idea of what can be happening? Or how I can
> solve
> > > it?
> > > > Best regards,
> > > > Carlos
> > > >
> > > > ——
> > > > Carlos Navarro Retamal
> > > > Bioinformatic Engineering. PhD.
> > > > Postdoctoral Researcher in Center of Bioinformatics and Molecular
> > > > Simulations
> > > > Universidad de Talca
> > > > Av. Lircay S/N, Talca, Chile
> > > > E: carlos.navarr...@gmail.com or cnava...@utalca.cl
> > > >
> > > > On July 19, 2019 at 2:20:41 PM, Carlos Navarro (
> > > carlos.navarr...@gmail.com)
> > > > wrote:
> > > >
> > > > Dear gmx-users,
> > > > I’m currently working in a server where each node posses 40 physical
> > > cores
> > > > (40 threads) and 4 Nvidia-V100.
> > > > When I launch a single job (1 simulation using a single gpu card) I
> > get a
> > > > performance of about ~35ns/day in a system of about 300k atoms.
> Looking
> > > > into the usage of the video card during the simulation I notice that
> > the
> > > > c

Re: [gmx-users] Sun Solaris

2019-07-25 Thread Szilárd Páll
On Thu, Jul 25, 2019 at 11:31 AM amitabh jayaswal
 wrote:
>
> Dear All,
> *Namaskar!*
> Can GROMACS be installed and run on a Sun Solaris system?

Hi,

As long as you have modern C++ compilers and toolchain, you should be
able to do so.

> We have a robust IBM Desktop which we intend to dedicatedly use for
> GROMACS; however we are facing difficulties in installing it.

Without specifics of your difficulties, I do not think we can help out.

--
Szilárd

> The machine specifications are:
> PRODUCT NAME: IBM System x3400
> MACHINE TYPE:   7973
> SERIAL NO.:  99A8370
> PRODUCT ID:7973PAA
>
> Is there a way to progress ahead?
> Kindly provide a solution asap.
> Best
>
> *Dr. Amitabh Jayaswal*
> *PhD Bioinformatics*
> *IIT(BHU), Varanasi, India*
> *M: +91 9868 33 00 88 *(also on WhatsApp), and
> * +91 7376 019 155*
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] remd error

2019-07-25 Thread Szilárd Páll
This is an MPI / job scheduler error: you are requesting 2 nodes with
20 processes per node (=40 total), but starting 80 ranks.
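
One way to make the counts consistent -- a sketch only, assuming the
scheduler really grants 2 nodes x 20 cores -- is to match -np to the
allocated slots:

mpirun -np 40 gmx_mpi mdrun -s remd.tpr -multi 8 -replex 1000 -reseed 175320 -deffnm remd_equil

which gives 5 ranks per replica instead of 10.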
--
Szilárd

On Thu, Jul 18, 2019 at 8:33 AM Bratin Kumar Das
<177cy500.bra...@nitk.edu.in> wrote:
>
> Hi,
>I am running remd simulation in gromacs-2016.5. After generating the
> multiple .tpr file in each directory by the following command
> *for i in {0..7}; do cd equil$i; gmx grompp -f equil${i}.mdp -c em.gro -p
> topol.top -o remd$i.tpr -maxwarn 1; cd ..; done*
> I run *mpirun -np 80 gmx_mpi mdrun -s remd.tpr -multi 8 -replex 1000
> -reseed 175320 -deffnm remd_equil*
> It is giving the following error
> There are not enough slots available in the system to satisfy the 40 slots
> that were requested by the application:
>   gmx_mpi
>
> Either request fewer slots for your application, or make more slots
> available
> for use.
> --
> --
> There are not enough slots available in the system to satisfy the 40 slots
> that were requested by the application:
>   gmx_mpi
>
> Either request fewer slots for your application, or make more slots
> available
> for use.
> --
> I am not understanding the error. Any suggestion will be highly
> appriciated. The mdp file and the qsub.sh file is attached below
>
> qsub.sh...
> #! /bin/bash
> #PBS -V
> #PBS -l nodes=2:ppn=20
> #PBS -l walltime=48:00:00
> #PBS -N mdrun-serial
> #PBS -j oe
> #PBS -o output.log
> #PBS -e error.log
> #cd /home/bratin/Downloads/GROMACS/Gromacs_fibril
> cd $PBS_O_WORKDIR
> module load openmpi3.0.0
> module load gromacs-2016.5
> NP=`cat $PBS_NODEFILE | wc -l`
> # mpirun --machinefile $PBS_PBS_NODEFILE -np $NP 'which gmx_mpi' mdrun -v
> -s nvt.tpr -deffnm nvt
> #/apps/gromacs-2016.5/bin/mpirun -np 8 gmx_mpi mdrun -v -s remd.tpr -multi
> 8 -replex 1000 -deffnm remd_out
> for i in {0..7}; do cd equil$i; gmx grompp -f equil${i}.mdp -c em.gro -r
> em.gro -p topol.top -o remd$i.tpr -maxwarn 1; cd ..; done
>
> for i in {0..7}; do cd equil${i}; mpirun -np 40 gmx_mpi mdrun -v -s
> remd.tpr -multi 8 -replex 1000 -deffnm remd$i_out ; cd ..; done
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] performance issues running gromacs with more than 1 gpu card in slurm

2019-07-25 Thread Szilárd Páll
Hi,

It is not clear to me how are you trying to set up your runs, so
please provide some details:
- are you trying to run multiple simulations concurrently on the same
node or are you trying to strong-scale?
- what are you simulating?
- can you provide log files of the runs?

Cheers,

--
Szilárd

On Tue, Jul 23, 2019 at 1:34 AM Carlos Navarro
 wrote:
>
> No one can give me an idea of what can be happening? Or how I can solve it?
> Best regards,
> Carlos
>
> ——
> Carlos Navarro Retamal
> Bioinformatic Engineering. PhD.
> Postdoctoral Researcher in Center of Bioinformatics and Molecular
> Simulations
> Universidad de Talca
> Av. Lircay S/N, Talca, Chile
> E: carlos.navarr...@gmail.com or cnava...@utalca.cl
>
> On July 19, 2019 at 2:20:41 PM, Carlos Navarro (carlos.navarr...@gmail.com)
> wrote:
>
> Dear gmx-users,
> I’m currently working in a server where each node posses 40 physical cores
> (40 threads) and 4 Nvidia-V100.
> When I launch a single job (1 simulation using a single gpu card) I get a
> performance of about ~35ns/day in a system of about 300k atoms. Looking
> into the usage of the video card during the simulation I notice that the
> card is being used about and ~80%.
> The problems arise when I increase the number of jobs running at the same
> time. If for instance 2 jobs are running at the same time, the performance
> drops to ~25ns/day each and the usage of the video cards also drops during
> the simulation to about a ~30-40% (and sometimes dropping to less than 5%).
> Clearly there is a communication problem between the gpu cards and the cpu
> during the simulations, but I don’t know how to solve this.
> Here is the script I use to run the simulations:
>
> #!/bin/bash -x
> #SBATCH --job-name=testAtTPC1
> #SBATCH --ntasks-per-node=4
> #SBATCH --cpus-per-task=20
> #SBATCH --account=hdd22
> #SBATCH --nodes=1
> #SBATCH --mem=0
> #SBATCH --output=sout.%j
> #SBATCH --error=s4err.%j
> #SBATCH --time=00:10:00
> #SBATCH --partition=develgpus
> #SBATCH --gres=gpu:4
>
> module use /gpfs/software/juwels/otherstages
> module load Stages/2018b
> module load Intel/2019.0.117-GCC-7.3.0
> module load IntelMPI/2019.0.117
> module load GROMACS/2018.3
>
> WORKDIR1=/p/project/chdd22/gromacs/benchmark/AtTPC1/singlegpu/1
> WORKDIR2=/p/project/chdd22/gromacs/benchmark/AtTPC1/singlegpu/2
> WORKDIR3=/p/project/chdd22/gromacs/benchmark/AtTPC1/singlegpu/3
> WORKDIR4=/p/project/chdd22/gromacs/benchmark/AtTPC1/singlegpu/4
>
> DO_PARALLEL=" srun --exclusive -n 1 --gres=gpu:1 "
> EXE=" gmx mdrun "
>
> cd $WORKDIR1
> $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset 0
> -ntomp 20 &>log &
> cd $WORKDIR2
> $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset 10
> -ntomp 20 &>log &
> cd $WORKDIR3
> $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20  -nmpi 1 -pin on -pinoffset 20
> -ntomp 20 &>log &
> cd $WORKDIR4
> $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset 30
> -ntomp 20 &>log &
>
>
> Regarding to pinoffset, I first tried using 20 cores for each job but then
> also tried with 8 cores (so pinoffset 0 for job 1, pinoffset 4 for job 2,
> pinoffset 8 for job 3 and pinoffset 12 for job) but at the end the problem
> persist.
>
> Currently in this machine I’m not able to use more than 1 gpu per job, so
> this is my only choice to use properly the whole node.
> If you need more information please just let me know.
> Best regards.
> Carlos
>
> ——
> Carlos Navarro Retamal
> Bioinformatic Engineering. PhD.
> Postdoctoral Researcher in Center of Bioinformatics and Molecular
> Simulations
> Universidad de Talca
> Av. Lircay S/N, Talca, Chile
> E: carlos.navarr...@gmail.com or cnava...@utalca.cl
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

[gmx-users] older server CPUs with recent GPUs for GROMACS

2019-07-25 Thread Szilárd Páll
Hi Mike,

Forking the discussion to have a consistent topic that is more discoverable.

On Thu, Jul 18, 2019 at 4:21 PM Michael Williams
 wrote:
>
> Hi Szilárd,
>
> Thanks for the interesting observations on recent hardware. I was wondering 
> if you could comment on the use of somewhat older server cpus and 
> motherboards (versus more cutting edge consumer parts). I recently noticed 
> that Haswell era Xeon cpus (E5 v3) are quite affordable now (~$400 for 12 
> core models with 40 pcie lanes) and so are the corresponding 2 cpu socket 
> server motherboards. Of course the RAM is slower than what can be used with 
> the latest Ryzen or i7/i9 cpus.


When it comes to GPU-accelerated runs, given that most of the
arithmetically-intensive computation is offloaded, major features of
more modern processors / CPU instruction sets don't help much (like
AVX512). As most bio-MD (unless running huge systems) fits in the CPU
cache, RAM performance and more memory channels also have little to no
impact (with some exceptions being the 1st-gen AMD Zen arch, but that's
another topic). What dominates the CPU contribution to performance is
cache size (and speed/efficiency) and the number/speed of the CPU
cores. This is somewhat non-trivial to assess, as the clock
speed specs don't always reflect the stable clocks these CPUs run at,
but roughly you can use (#cores x frequency) as a metric to gauge
the performance of a CPU *in such a scenario*.
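
As a rough illustration of that metric (the clock figures below are ballpark
assumptions, not measurements):

  12-core Xeon E5-2690 v3 at ~2.6 GHz  ->  12 x 2.6 ~= 31
   8-core i9-9900K        at ~4.7 GHz  ->   8 x 4.7 ~= 38

so a cheap modern desktop part can land in the same ballpark as an older
12-core server CPU for this role.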

More on this you can find in our recent paper where we do in fact
compare the performance of the best bang for buck modern servers
(spoiler alert: AMD EPYC was already and will especially be the
champion with the Rome arch) with upgraded older Xeon v2 nodes; see:
https://doi.org/10.1002/jcc.26011

>
> Are there any other bottlenecks with this somewhat older server hardware that 
> I might not be aware of?

There can be: PCI topology can be an issue; you want a symmetric layout,
e.g. two x16 buses connected directly to each socket (for dual-socket
systems), rather than e.g. many lanes connected to a PCI switch that is
attached to a single socket. You can also have significant GPU-to-GPU
communication issues on older-gen hardware (like v2/v3 Xeon), but
GROMACS does not make use of direct GPU-to-GPU communication yet (partly
due to that very reason); with the near-future releases that may also be
a slight concern if you want to scale across many GPUs.
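
Two common ways to inspect that on a node, assuming the tools are installed:

nvidia-smi topo -m   # shows how each GPU attaches to the CPUs / PCIe root
lstopo               # from hwloc; shows sockets, caches and PCIe devices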


I hope that helps, let me know if you have any other questions!

Cheers,
--
Szilárd

> Thanks again for the interesting information and practical advice on this 
> topic.
>
> Mike
>
>
> > On Jul 18, 2019, at 2:21 AM, Szilárd Páll  wrote:
> >
> > PS: You will get more PCIe lanes without motherboard trickery -- and note
> > that consumer motherboards with PCIe switches can sometimes cause
> > instabilities when under heavy compute load -- if you buy the aging and
> > quite overpriced i9 X-series like the i9-7920 with 12 cores or the
> > Threadripper 2950x 16 cores and 60 PCIe lanes.
> >
> > Also note that, but more cores always win when the CPU performance matters
> > and while 8 cores are generally sufficient, in some use-cases it may not be
> > (like runs with free energy).
> >
> > --
> > Szilárd
> >
> >
> > On Thu, Jul 18, 2019 at 10:08 AM Szilárd Páll 
> > wrote:
> >
> >> On Wed, Jul 17, 2019 at 7:00 PM Moir, Michael (MMoir) 
> >> wrote:
> >>
> >>> This is not quite true.  I certainly observed this degradation in
> >>> performance using the 9900K with two GPUs as Szilárd states using a
> >>> motherboard with one PCIe controller, but the limitation is from the
> >>> motherboard not from the CPU.
> >>
> >>
> >> Sorry, but that's not the case. PCIe controllers have been integrated into
> >> CPUs for many years; see
> >>
> >> https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-introduction-basics-paper.pdf
> >>
> >> https://www.microway.com/hpc-tech-tips/common-pci-express-myths-gpu-computing/
> >>
> >> So no, the limitation is the CPU itself. Consumer CPUs these days have 24
> >> lanes total, some of which are used to connect the CPU to the chipset, and
> >> effectively you get 16-20 lanes (BTW here too the new AMD CPUs win as they
> >> provide 16 lanes for GPUs and similar devices and 4 lanes for NVMe, all on
> >> PCIe 4.0).
> >>
> >>
> >>>  It is possible to obtain a motherboard that contains two PCIe
> >>> controllers which overcomes this obstacle for not a whole lot more money.
> >>>
> >>
> >> It is possibly to buy motherboards with PCIe switches. These don't
> >> increase the number o

Re: [gmx-users] Need to install latest Gromacs on ios

2019-07-18 Thread Szilárd Páll
Hi,

Are you sure you mean iOS not OS X?

What does not work? An error message / cmake output would be more useful.
cmake generally does detect your system C++ compiler if there is one.
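
For example, on a Mac something along these lines should tell you whether the
compilers are there (a sketch, assuming Xcode's command line tools):

clang --version
clang++ --version
xcode-select --install   # installs the command line tools if they are missing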

Cheers
--
Szilárd


On Thu, Jul 18, 2019 at 4:55 PM andrew goring 
wrote:

> Hi,
>
> I need to install the latest version of gromacs on the late Apple software.
>
> I follow the "quick and dirty" instructions, but it does not work.
>
> I believe I have all of the proper software and computers installed, as I
> have XCode up to date (although, I can't figure out how to check if c and
> c++ compilers are present).
>
> Would anyone be able to walk me through this? I think there is something
> simple I am not doing, as I do not have experience installing source code.
>
> Thanks,
>
> Andrew K. Goring
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Install on Windows 10 with AMD GPU

2019-07-18 Thread Szilárd Páll
On Thu, Jul 11, 2019 at 6:33 AM James Burchfield <
james.burchfi...@sydney.edu.au> wrote:

> I suspect the issue is that
> 64bit OpenCL is required and 32bit is enabled by default on this card.
> Apparently I can somewhere set  GPU_FORCE_64BIT_PTR=1
> But no idea how to do this yet...
>

GPU_FORCE_64BIT_PTR seems to be an environment variable which will only
affect the runtime behavior. However, for that you first need to configure
a GROMACS build and compile successfully. As far as I can tell, you still
get stuck in the first stage as cmake can not detect the required
dependencies.
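
A rough sketch of a configure line using the paths mentioned earlier in this
thread (variable names as used by CMake's FindOpenCL; adjust the paths to
where the headers were actually unpacked):

cmake .. -DGMX_GPU=ON -DGMX_USE_OPENCL=ON -DOpenCL_INCLUDE_DIR=C:/Users/Admin/OpenCL-Headers-master -DOpenCL_LIBRARY=C:/Windows/System32/OpenCL.dll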

--
Szilárd


>
> -Original Message-
> From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se <
> gromacs.org_gmx-users-boun...@maillist.sys.kth.se> On Behalf Of Szilárd
> Páll
> Sent: Tuesday, 9 July 2019 10:46 PM
> To: Discussion list for GROMACS users 
> Cc: gromacs.org_gmx-users@maillist.sys.kth.se
> Subject: Re: [gmx-users] Install on Windows 10 with AMD GPU
>
> Hi James,
>
> On Mon, Jul 8, 2019 at 10:57 AM James Burchfield <
> james.burchfi...@sydney.edu.au> wrote:
>
> > Thankyou Szilárd,
> > Headers are available here
> > https://protect-au.mimecast.com/s/-oPwCNLwM9ixRV85smybKT?domain=github
> > .com
> > But I get
> > CMake Error at cmake/gmxManageOpenCL.cmake:45 (message):
> >   OpenCL is not supported.  OpenCL version 1.2 or newer is required.
> > Call Stack (most recent call first):
> >   CMakeLists.txt:236 (include)
> >
> > I am setting
> > OpenCL_include_DIR to C:/Users/Admin/ OpenCL-Headers-master/CL
> >
>
> That path should not include "CL" (the header is expected to be included
> as CL/cl.h).
>
> Let me know if that helps.
>
> --
> Szilárd
>
>
> > OpenCL_INCLUDE_DIR OpenCL_Library to C:/Windows/System32/OpenCL.dll
> >
> >
> > The error file includes
> >   Microsoft (R) C/C++ Optimizing Compiler Version 19.21.27702.2 for
> > x64
> >
> >   CheckSymbolExists.c
> >
> >   Copyright (C) Microsoft Corporation.  All rights reserved.
> >
> >   cl /c /Zi /W3 /WX- /diagnostics:column /Od /Ob0 /D WIN32 /D _WINDOWS
> > /D "CMAKE_INTDIR=\"Debug\"" /D _MBCS /Gm- /RTC1 /MDd /GS /fp:precise
> > /Zc:wchar_t /Zc:forScope /Zc:inline /Fo"cmTC_2c430.dir\Debug\\"
> > /Fd"cmTC_2c430.dir\Debug\vc142.pdb" /Gd /TC /errorReport:queue
> > "C:\Program Files\gromacs\CMakeFiles\CMakeTmp\CheckSymbolExists.c"
> >
> > C:\Program Files\gromacs\CMakeFiles\CMakeTmp\CheckSymbolExists.c(2,10):
> > error C1083:  Cannot open include file:
> > 'OpenCL_INCLUDE_DIR-NOTFOUND/CL/cl.h': No such file or directory
> > [C:\Program Files\gromacs\CMakeFiles\CMakeTmp\cmTC_2c430.vcxproj]
> >
> >
> > File C:/Program Files/gromacs/CMakeFiles/CMakeTmp/CheckSymbolExists.c:
> > /* */
> #include <CL/cl.h>
> >
> > int main(int argc, char** argv)
> > {
> >   (void)argv;
> > #ifndef CL_VERSION_1_0
>   return ((int*)(&CL_VERSION_1_0))[argc];
> #else
> >   (void)argc;
> >   return 0;
> > #endif
> > }
> >
> >
> > Guessing it is time to give up
> >
> > Cheers
> > James
> >
> >
> >
> >
> > -Original Message-
> > From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se <
> > gromacs.org_gmx-users-boun...@maillist.sys.kth.se> On Behalf Of Szilárd
> > Páll
> > Sent: Friday, 5 July 2019 10:20 PM
> > To: Discussion list for GROMACS users 
> > Cc: gromacs.org_gmx-users@maillist.sys.kth.se
> > Subject: Re: [gmx-users] Install on Windows 10 with AMD GPU
> >
> > Dear James,
> >
> > Unfortunately, we have very little experience with OpenCL on Windows, so
> I
> > am afraid I can not advise you on specifics. However, note that the only
> > part of the former SDK that is needed is the OpenCL headers and loader
> > libraries (libOpenCL) which is open source software that can be obtained
> > from the standards body, KHronos. Not sure what the mechanism is for
> > Windows, but for Linux these components are in packaged in the standard
> > repositories of most distributions.
> >
> > However, before going through a large effort of trying to get GROMACS
> > running on Windows + AMD + OpenCL, you might want to consider evaluating
> > the potential benefits of the hardware. As these cards are quite dated
> you
> > might find that they do not provide enough performance benefit to warrant
> > the effort required -- especially as, if you have a workstation with
> > significant CPU resources, you might find that GROMACS runs

Re: [gmx-users] decreased performance with free energy

2019-07-18 Thread Szilárd Páll
David,

Yes, it is greatly affected. The standard interaction kernels are very
fast, but the free energy kernels are known to not be as efficient as they
could and the larger the fraction of atoms involved in perturbed
interactions the more this work dominates the runtime.

If you are trying to set up production runs on this specific
hardware/software combination that you ran the tests on? There are a few
things you could try to get a bit better performance, but details may
depend on hardware software.

Expect major improvements in the upcoming release, we are doing some
thorough rework/optimization of the free energy kernels.

Cheers,
--
Szilárd


On Thu, Jul 18, 2019 at 10:24 AM David de Sancho 
wrote:

> Thanks Szilárd
> I have posted both in the Gist below for the free energy simulation
> https://gist.github.com/daviddesancho/4abdc0d40e2355671ead7f8e40283b57
> May it have to do with the number of particles in the box that are affected
> by the typeA -> typeB change?
>
> David
>
>
> Date: Wed, 17 Jul 2019 17:09:21 +0200
> > From: Szilárd Páll 
> > To: Discussion list for GROMACS users 
> > Subject: Re: [gmx-users] decreased performance with free energy
> > Message-ID:
> > <
> > cannyew4uszxnnwz56tzbqsjwkt3cu7pf+8hhfxa6nfug0o7...@mail.gmail.com>
> > Content-Type: text/plain; charset="UTF-8"
> >
> > Hi,
> >
> > Lower performe especially with GPUs is not unexpected, but what you
> report
> > is unusually large. I suggest you post your mdp and log file, perhaps
> there
> > are some things to improve.
> >
> > --
> > Szilárd
> >
> >
> > On Wed, Jul 17, 2019 at 3:47 PM David de Sancho 
> > wrote:
> >
> > > Hi all
> > > I have been doing some testing for Hamiltonian replica exchange using
> > > Gromacs 2018.3 on a relatively simple system with 3000 atoms in a cubic
> > > box.
> > > For the modified hamiltonian I have simply modified the water
> > interactions
> > > by generating a typeB atom in the force field ffnonbonded.itp with
> > > different parameters file and then creating a number of tpr files for
> > > different lambda values as defined in the mdp files. The only
> difference
> > > between mdp files for a simple NVT run and for the HREX runs are the
> > > following lines:
> > >
> > > > ; H-REPLEX
> > > > free-energy = yes
> > > > init-lambda-state = 0
> > > > nstdhdl = 0
> > > > vdw_lambdas = 0.0 0.05 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> > >
> > > I have tested for performance in the same machine and compared the
> > standard
> > > NVT run performance (~175 ns/day in 8 cores) with that for the free
> > energy
> > > tpr file (6.2 ns/day).
> > > Is this performance loss what you would expect or are there any
> immediate
> > > changes you can suggest to improve things? I have found a relatively
> old
> > > post on this on Gromacs developers (
> > https://redmine.gromacs.org/issues/742
> > > ),
> > > but I am not sure whether it is the exact same problem.
> > > Thanks,
> > >
> > > David
> > > --
> > > Gromacs Users mailing list
> >
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] make manual fails

2019-07-18 Thread Szilárd Páll
Is sphinx detected by cmake though?
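
For example, something like this should show what CMake picked up (cache
variable names can differ between versions):

grep -i sphinx CMakeCache.txt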
--
Szilárd


On Wed, Jul 17, 2019 at 8:00 PM Michael Brunsteiner 
wrote:

> hi,
> so I say:
> prompt> cmake .. -DGMX_BUILD_OWN_FFTW=ON -DCMAKE_C_COMPILER=gcc-7 -DCMAKE_CXX_COMPILER=g++-7 -DGMX_GPU=on -DCMAKE_INSTALL_PREFIX=/home/michael/local/gromacs-2019-3-bin -DGMX_BUILD_MANUAL=on
> prompt> make -j 4
> prompt> make install
> prompt> make manual
> manual cannot be built because Sphinx expected minimum version 1.6.1 is not available
>
> although I seem to have version 1.8.4 (see below)
>
> prompt> apt policy python-sphinx
> python-sphinx:
>   Installed: 1.8.4-1
>   Candidate: 1.8.4-1
>   Version table:
>  *** 1.8.4-1 500
> 500 http://ftp.at.debian.org/debian buster/main amd64 Packages
> 500 http://ftp.at.debian.org/debian buster/main i386 Packages
> 100 /var/lib/dpkg/status
>
> anybody else seen this issue?
> cheers,
> michael
>
>
>
> === Why be happy when you could be normal?
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Xeon Gold + RTX 5000

2019-07-18 Thread Szilárd Páll
PS: You will get more PCIe lanes without motherboard trickery -- and note
that consumer motherboards with PCIe switches can sometimes cause
instabilities when under heavy compute load -- if you buy the aging and
quite overpriced i9 X-series like the i9-7920 with 12 cores or the
Threadripper 2950x 16 cores and 60 PCIe lanes.

Also note that more cores always win when the CPU performance matters,
and while 8 cores are generally sufficient, in some use-cases they may not be
(like runs with free energy).

--
Szilárd


On Thu, Jul 18, 2019 at 10:08 AM Szilárd Páll 
wrote:

> On Wed, Jul 17, 2019 at 7:00 PM Moir, Michael (MMoir) 
> wrote:
>
>> This is not quite true.  I certainly observed this degradation in
>> performance using the 9900K with two GPUs as Szilárd states using a
>> motherboard with one PCIe controller, but the limitation is from the
>> motherboard not from the CPU.
>
>
> Sorry, but that's not the case. PCIe controllers have been integrated into
> CPUs for many years; see
>
> https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-introduction-basics-paper.pdf
>
> https://www.microway.com/hpc-tech-tips/common-pci-express-myths-gpu-computing/
>
> So no, the limitation is the CPU itself. Consumer CPUs these days have 24
> lanes total, some of which are used to connect the CPU to the chipset, and
> effectively you get 16-20 lanes (BTW here too the new AMD CPUs win as they
> provide 16 lanes for GPUs and similar devices and 4 lanes for NVMe, all on
> PCIe 4.0).
>
>
>>   It is possible to obtain a motherboard that contains two PCIe
>> controllers which overcomes this obstacle for not a whole lot more money.
>>
>
> It is possible to buy motherboards with PCIe switches. These don't
> increase the number of lanes, just do what a switch does: as long as not all
> connected devices try to use the full capacity of the CPU (!) at the same
> time, you can get full speed on all connected devices.
> e.g.:
> https://techreport.com/r.x/2015_11_19_Gigabytes_Z170XGaming_G1_motherboard_reviewed/05-diagram_pcie_routing.gif
>
> Cheers,
> --
> Szilárd
>
> Mike
>>
>> -Original Message-
>> From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se <
>> gromacs.org_gmx-users-boun...@maillist.sys.kth.se> On Behalf Of Szilárd
>> Páll
>> Sent: Wednesday, July 17, 2019 8:14 AM
>> To: Discussion list for GROMACS users 
>> Subject: [**EXTERNAL**] Re: [gmx-users] Xeon Gold + RTX 5000
>>
>> Hi Alex,
>>
>> I've not had a chance to test the new 3rd gen Ryzen CPUs, but all
>> public benchmarks out there point to the fact that they are a major
>> improvement over the previous generation Ryzen -- which were already
>> quite competitive for GPU-accelerated GROMACS runs compared to Intel,
>> especially in perf/price.
>>
>> One caveat for dual-GPU setups on the i9 9900 or the Ryzen 3900X is
>> that they don't have enough PCI lanes for peak CPU-GPU transfer (x8
>> for both of the GPUs) which will lead to a slightly less performance
>> (I'd estimate <5-10%) in particular compared to i) having a single GPU
>> plugged in into the machine ii) compare to CPUs like Threadripper or
>> the i9 79xx series processors which have more PCIe lanes.
>>
>> However, if throughput is the goal, the ideal use-case especially for
>> small simulation systems like <=50k atoms is to run e.g. 2 runs / GPU,
>> hence 4 runs on a 2-GPU system case in which the impact of the
>> aforementioned limitation will be further decreased.
>>
>> Cheers,
>> --
>> Szilárd
>>
>>
>> On Tue, Jul 16, 2019 at 7:18 PM Alex  wrote:
>> >
>> > That is excellent information, thank you. None of us have dealt with AMD
>> > CPUs in a while, so would the combination of a Ryzen 3900X and two
>> > Quadro 2080 Ti be a good choice?
>> >
>> > Again, thanks!
>> >
>> > Alex
>> >
>> >
>> > On 7/16/2019 8:41 AM, Szilárd Páll wrote:
>> > > Hi Alex,
>> > >
>> > > On Mon, Jul 15, 2019 at 8:53 PM Alex  wrote:
>> > >> Hi all and especially Szilard!
>> > >>
>> > >> My glorious management asked me to post this here. One of our group
>> > >> members, an ex-NAMD guy, wants to use Gromacs for biophysics and the
>> > >> following basics have been spec'ed for him:
>> > >>
>> > >> CPU: Xeon Gold 6244
>> > >> GPU: RTX 5000 or 6000
>> > >>
>> > >> I'll be surprised if he runs systems with more than 50K particles.
>> Could
>> > >> y

Re: [gmx-users] Xeon Gold + RTX 5000

2019-07-18 Thread Szilárd Páll
On Wed, Jul 17, 2019 at 7:00 PM Moir, Michael (MMoir) 
wrote:

> This is not quite true.  I certainly observed this degradation in
> performance using the 9900K with two GPUs as Szilárd states using a
> motherboard with one PCIe controller, but the limitation is from the
> motherboard not from the CPU.


Sorry, but that's not the case. PCIe controllers have been integrated into
CPUs for many years; see
https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-introduction-basics-paper.pdf
https://www.microway.com/hpc-tech-tips/common-pci-express-myths-gpu-computing/

So no, the limitation is the CPU itself. Consumer CPUs these days have 24
lanes total, some of which are used to connect the CPU to the chipset, and
effectively you get 16-20 lanes (BTW here too the new AMD CPUs win as they
provide 16 lanes for GPUs and similar devices and 4 lanes for NVMe, all on
PCIe 4.0).


>   It is possible to obtain a motherboard that contains two PCIe
> controllers which overcomes this obstacle for not a whole lot more money.
>

It is possible to buy motherboards with PCIe switches. These don't increase
the number of lanes, just do what a switch does: as long as not all
connected devices try to use the full capacity of the CPU (!) at the same
time, you can get full speed on all connected devices.
e.g.:
https://techreport.com/r.x/2015_11_19_Gigabytes_Z170XGaming_G1_motherboard_reviewed/05-diagram_pcie_routing.gif

Cheers,
--
Szilárd

Mike
>
> -Original Message-
> From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se <
> gromacs.org_gmx-users-boun...@maillist.sys.kth.se> On Behalf Of Szilárd
> Páll
> Sent: Wednesday, July 17, 2019 8:14 AM
> To: Discussion list for GROMACS users 
> Subject: [**EXTERNAL**] Re: [gmx-users] Xeon Gold + RTX 5000
>
> Hi Alex,
>
> I've not had a chance to test the new 3rd gen Ryzen CPUs, but all
> public benchmarks out there point to the fact that they are a major
> improvement over the previous generation Ryzen -- which were already
> quite competitive for GPU-accelerated GROMACS runs compared to Intel,
> especially in perf/price.
>
> One caveat for dual-GPU setups on the i9 9900 or the Ryzen 3900X is
> that they don't have enough PCI lanes for peak CPU-GPU transfer (x8
> for both of the GPUs) which will lead to a slightly less performance
> (I'd estimate <5-10%) in particular compared to i) having a single GPU
> plugged in into the machine ii) compare to CPUs like Threadripper or
> the i9 79xx series processors which have more PCIe lanes.
>
> However, if throughput is the goal, the ideal use-case especially for
> small simulation systems like <=50k atoms is to run e.g. 2 runs / GPU,
> hence 4 runs on a 2-GPU system case in which the impact of the
> aforementioned limitation will be further decreased.
>
> Cheers,
> --
> Szilárd
>
>
> On Tue, Jul 16, 2019 at 7:18 PM Alex  wrote:
> >
> > That is excellent information, thank you. None of us have dealt with AMD
> > CPUs in a while, so would the combination of a Ryzen 3900X and two
> > Quadro 2080 Ti be a good choice?
> >
> > Again, thanks!
> >
> > Alex
> >
> >
> > On 7/16/2019 8:41 AM, Szilárd Páll wrote:
> > > Hi Alex,
> > >
> > > On Mon, Jul 15, 2019 at 8:53 PM Alex  wrote:
> > >> Hi all and especially Szilard!
> > >>
> > >> My glorious management asked me to post this here. One of our group
> > >> members, an ex-NAMD guy, wants to use Gromacs for biophysics and the
> > >> following basics have been spec'ed for him:
> > >>
> > >> CPU: Xeon Gold 6244
> > >> GPU: RTX 5000 or 6000
> > >>
> > >> I'll be surprised if he runs systems with more than 50K particles.
> Could
> > >> you please comment on whether this is a cost-efficient and reasonably
> > >> powerful setup? Your past suggestions have been invaluable for us.
> > > That will be reasonably fast, but cost efficiency will be awful, to be
> honest:
> > > - that CPU is a ~$3000 part and won't perform much better than a
> > > $4-500 desktop CPU like an i9 9900, let alone a Ryzen 3900X which
> > > would be significantly faster.
> > > - Quadro cards also pretty low in bang for buck: a 2080 Ti will be
> > > close to the RTX 6000 for ~5x less and the 2080 or 2070 Super a bit
> > > slower for at least another 1.5x less.
> > >
> > > Single run at a time or possibly multiple? The proposed (or any 8+
> > > core) workstation CPU is fast enough in the majority of the
> > > simulations to pair well with two of those GPUs if used for two
> > > concurrent simulations. If that's a relevant use-case, 

Re: [gmx-users] Xeon Gold + RTX 5000

2019-07-17 Thread Szilárd Páll
Hi Alex,

I've not had a chance to test the new 3rd gen Ryzen CPUs, but all
public benchmarks out there point to the fact that they are a major
improvement over the previous generation Ryzen -- which were already
quite competitive for GPU-accelerated GROMACS runs compared to Intel,
especially in perf/price.

One caveat for dual-GPU setups on the i9 9900 or the Ryzen 3900X is
that they don't have enough PCIe lanes for peak CPU-GPU transfer (x8
for both of the GPUs), which will lead to slightly lower performance
(I'd estimate <5-10%), in particular compared to i) having a single GPU
plugged into the machine, or ii) CPUs like Threadripper or
the i9 79xx series processors which have more PCIe lanes.

However, if throughput is the goal, the ideal use-case especially for
small simulation systems like <=50k atoms is to run e.g. 2 runs / GPU,
hence 4 runs on a 2-GPU system, in which case the impact of the
aforementioned limitation will be further decreased.
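
A minimal sketch of such a 4-run layout -- assuming e.g. a 12-core/24-thread
CPU and 2 GPUs; the thread counts, offsets and file names are illustrative:

gmx mdrun -deffnm run1 -ntmpi 1 -ntomp 6 -gpu_id 0 -pin on -pinstride 1 -pinoffset 0 &
gmx mdrun -deffnm run2 -ntmpi 1 -ntomp 6 -gpu_id 0 -pin on -pinstride 1 -pinoffset 6 &
gmx mdrun -deffnm run3 -ntmpi 1 -ntomp 6 -gpu_id 1 -pin on -pinstride 1 -pinoffset 12 &
gmx mdrun -deffnm run4 -ntmpi 1 -ntomp 6 -gpu_id 1 -pin on -pinstride 1 -pinoffset 18 &
wait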

Cheers,
--
Szilárd


On Tue, Jul 16, 2019 at 7:18 PM Alex  wrote:
>
> That is excellent information, thank you. None of us have dealt with AMD
> CPUs in a while, so would the combination of a Ryzen 3900X and two
> Quadro 2080 Ti be a good choice?
>
> Again, thanks!
>
> Alex
>
>
> On 7/16/2019 8:41 AM, Szilárd Páll wrote:
> > Hi Alex,
> >
> > On Mon, Jul 15, 2019 at 8:53 PM Alex  wrote:
> >> Hi all and especially Szilard!
> >>
> >> My glorious management asked me to post this here. One of our group
> >> members, an ex-NAMD guy, wants to use Gromacs for biophysics and the
> >> following basics have been spec'ed for him:
> >>
> >> CPU: Xeon Gold 6244
> >> GPU: RTX 5000 or 6000
> >>
> >> I'll be surprised if he runs systems with more than 50K particles. Could
> >> you please comment on whether this is a cost-efficient and reasonably
> >> powerful setup? Your past suggestions have been invaluable for us.
> > That will be reasonably fast, but cost efficiency will be awful, to be 
> > honest:
> > - that CPU is a ~$3000 part and won't perform much better than a
> > $4-500 desktop CPU like an i9 9900, let alone a Ryzen 3900X which
> > would be significantly faster.
> > - Quadro cards also pretty low in bang for buck: a 2080 Ti will be
> > close to the RTX 6000 for ~5x less and the 2080 or 2070 Super a bit
> > slower for at least another 1.5x less.
> >
> > Single run at a time or possibly multiple? The proposed (or any 8+
> > core) workstation CPU is fast enough in the majority of the
> > simulations to pair well with two of those GPUs if used for two
> > concurrent simulations. If that's a relevant use-case, I'd recommend
> > two 2070 Super or 2080 cards.
> >
> > Cheers,
> > --
> > Szilárd
> >
> >
> >> Thank you,
> >>
> >> Alex
> >> --
> >> Gromacs Users mailing list
> >>
> >> * Please search the archive at 
> >> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
> >>
> >> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >>
> >> * For (un)subscribe requests visit
> >> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send 
> >> a mail to gmx-users-requ...@gromacs.org.
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] decreased performance with free energy

2019-07-17 Thread Szilárd Páll
Hi,

Lower performance, especially with GPUs, is not unexpected, but what you report
is unusually large. I suggest you post your mdp and log file, perhaps there
are some things to improve.

--
Szilárd


On Wed, Jul 17, 2019 at 3:47 PM David de Sancho 
wrote:

> Hi all
> I have been doing some testing for Hamiltonian replica exchange using
> Gromacs 2018.3 on a relatively simple system with 3000 atoms in a cubic
> box.
> For the modified hamiltonian I have simply modified the water interactions
> by generating a typeB atom in the force field ffnonbonded.itp with
> different parameters file and then creating a number of tpr files for
> different lambda values as defined in the mdp files. The only difference
> between mdp files for a simple NVT run and for the HREX runs are the
> following lines:
>
> > ; H-REPLEX
> > free-energy = yes
> > init-lambda-state = 0
> > nstdhdl = 0
> > vdw_lambdas = 0.0 0.05 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
>
> I have tested for performance in the same machine and compared the standard
> NVT run performance (~175 ns/day in 8 cores) with that for the free energy
> tpr file (6.2 ns/day).
> Is this performance loss what you would expect or are there any immediate
> changes you can suggest to improve things? I have found a relatively old
> post on this on Gromacs developers (https://redmine.gromacs.org/issues/742
> ),
> but I am not sure whether it is the exact same problem.
> Thanks,
>
> David
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] rtx 2080 gpu

2019-07-17 Thread Szilárd Páll
On Wed, Jul 17, 2019 at 2:13 PM Stefano Guglielmo <
stefano.guglie...@unito.it> wrote:

> Hi Benson,
> thanks for your answer and sorry for my delay: in the meantime I had to
> restore the OS. I obviously re-installed NVIDIA driver (430.64) and CUDA
> 10.1, I re-compiled Gromacs 2019.2 with the following command:
>
> cmake .. -DGMX_BUILD_OWN_FFTW=ON -DGMX_SIMD=AVX2_256 -DGMX_GPU=ON
> -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DREGRESSIONTEST_DOWNLOAD=ON
>
> I did make test and I got 100% passed. but this is the log file:
>
> GROMACS:  gmx mdrun, version 2019.2
> Executable:   /usr/local/gromacs/bin/gmx
> Data prefix:  /usr/local/gromacs
> Working dir:  /home/stefano/CB2
> Process ID:   117020
> Command line:
>   gmx mdrun -deffnm cb2_trz2c3ohene -ntmpi 1 -pin on
>
> GROMACS version:2019.2
> Precision:  single
> Memory model:   64 bit
> MPI library:thread_mpi
> OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
> GPU support:CUDA
> SIMD instructions:  AVX2_256
> FFT library:fftw-3.3.8-sse2-avx-avx2-avx2_128
> RDTSCP usage:   enabled
> TNG support:enabled
> Hwloc support:  disabled
> Tracing support:disabled
> C compiler: /usr/bin/cc GNU 4.8.5
> C compiler flags:-mavx2 -mfma -O3 -DNDEBUG -funroll-all-loops
> -fexcess-precision=fast
> C++ compiler:   /usr/bin/c++ GNU 4.8.5
> C++ compiler flags:  -mavx2 -mfma-std=c++11   -O3 -DNDEBUG
> -funroll-all-loops -fexcess-precision=fast
> CUDA compiler:  /usr/local/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda compiler
> driver;Copyright (c) 2005-2019 NVIDIA Corporation;Built on
> Wed_Apr_24_19:10:27_PDT_2019;Cuda compilation tools, release 10.1,
> V10.1.168
> CUDA compiler
>
> flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;;
>
> ;-mavx2;-mfma;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
> CUDA driver:10.10
> CUDA runtime:   N/A
>
> NOTE: Detection of GPUs failed. The API reported:
>   unknown error
>   GROMACS cannot run tasks on a GPU.
>
> Running on 1 node with total 32 cores, 64 logical cores, 0 compatible GPUs
> Hardware detected:
>   CPU info:
> Vendor: AMD
> Brand:  AMD Ryzen Threadripper 2990WX 32-Core Processor
> Family: 23   Model: 8   Stepping: 2
> Features: aes amd apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt lahf
> misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdrnd rdtscp
> sha sse2 sse3 sse4a sse4.1 sse4.2 ssse3
>   Hardware topology: Basic
> Sockets, cores, and logical processors:
>   Socket  0: [   0  32] [   1  33] [   2  34] [   3  35] [   4  36] [
> 5  37] [   6  38] [   7  39] [  16  48] [  17  49] [  18  50] [  19  51] [
>  20  52] [  21  53] [  22  54] [  23  55] [   8  40] [   9  41] [  10  42]
> [  11  43] [  12  44] [  13  45] [  14  46] [  15  47] [  24  56] [  25
>  57] [  26  58] [  27  59] [  28  60] [  29  61] [  30  62] [  31  63]
>
> Do you have any suggestions?
>
> PS: I set SIMD option to AVX2_256 with an AMD Ryzen Threadripper 2990WX
> 32-Core Processor: do you think it is a good idea?
>

In general, I suggest you stick to the default (which is AVX2_128 on this
CPU, not AVX2_256); this will typically be faster, in particular in CPU-only
runs. The difference may not be significant in GPU-accelerated runs, and in
some not-too-common cases AVX2_256 can even be a little bit faster.
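
For example, after clearing the build directory you could simply drop the
-DGMX_SIMD flag and let cmake pick the level for your CPU, or request the
Zen-tuned kernels explicitly (a sketch; keep your other options as before):

$ cmake .. -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=ON \
    -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DREGRESSIONTEST_DOWNLOAD=ON
# or, explicitly:
$ cmake .. -DGMX_SIMD=AVX2_128 -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=ON \
    -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda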

Cheers,
--
Szilárd


>
> Thanks again
> Stefano
>
> Il giorno mer 10 lug 2019 alle ore 08:13 Benson Muite <
> benson_mu...@emailplus.org> ha scritto:
>
> > Hi Stefano,
> >
> > What was your compilation command? (it may be helpful to add SIMD
> > support appropriate to your processor
> >
> >
> http://manual.gromacs.org/documentation/current/install-guide/index.html#simd-support
> > )
> >
> > Did you run make test after compiling?
> >
> > Benson
> >
> > On 7/10/19 1:18 AM, Stefano Guglielmo wrote:
> > > Dear all,
> > > I have a centOS machine equipped with two RTX 2080 cards, with nvidia
> > > drivers 430.2; I installed cuda toolkit 10-1. when executing mdrun the
> > log
> > > reported the following message:
> > >
> > > GROMACS version:2019.2
> > > Precision:  single
> > > Memory model:   64 bit
> > > MPI library:thread_mpi
> > > OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
> > > GPU support:CUDA
> > > SIMD instructions:  NONE
> > > FFT library:fftw-3.3.8
> > > RDTSCP usage:   disabled
> > > TNG support:enabled
> > > Hwloc support:  disabled
> > > Tracing support:disabled
> > > C compiler: /usr/bin/cc GNU 4.8.5
> > > C compiler flags:-O3 -DNDEBUG -funroll-all-loops
> > > -fexcess-precision=fast
> > > C++ compiler:   

Re: [gmx-users] rtx 2080 gpu

2019-07-17 Thread Szilárd Páll
On Wed, Jul 10, 2019 at 2:18 AM Stefano Guglielmo <
stefano.guglie...@unito.it> wrote:

> Dear all,
> I have a centOS machine equipped with two RTX 2080 cards, with nvidia
> drivers 430.2; I installed cuda toolkit 10-1. when executing mdrun the log
> reported the following message:
>
> GROMACS version:2019.2
> Precision:  single
> Memory model:   64 bit
> MPI library:thread_mpi
> OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
> GPU support:CUDA
> SIMD instructions:  NONE
> FFT library:fftw-3.3.8
> RDTSCP usage:   disabled
> TNG support:enabled
> Hwloc support:  disabled
> Tracing support:disabled
> C compiler: /usr/bin/cc GNU 4.8.5
> C compiler flags:-O3 -DNDEBUG -funroll-all-loops
> -fexcess-precision=fast
> C++ compiler:   /usr/bin/c++ GNU 4.8.5
> C++ compiler flags: -std=c++11   -O3 -DNDEBUG -funroll-all-loops
> -fexcess-precision=fast
> CUDA compiler:  /usr/local/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda compiler
> driver;Copyright (c) 2005-2019 NVIDIA Corporation;Built on
> Wed_Apr_24_19:10:27_PDT_2019;Cuda compilation tools, release 10.1,
> V10.1.168
> CUDA compiler
>
> flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;;
> ;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
> CUDA driver:10.20
> CUDA runtime:   N/A
>

^^^
Something was not correct about your CUDA runtime installation.
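
A few quick checks you could run (just a sketch; paths assume the default
toolkit location, adjust if yours differs):

$ nvidia-smi                              # does the driver see both cards?
$ /usr/local/cuda/bin/nvcc --version      # toolkit release, should print 10.1
$ /usr/local/cuda/extras/demo_suite/deviceQuery   # if present, should list both GPUs

If deviceQuery also fails with an "unknown error", the problem is in the
driver/runtime installation rather than in GROMACS.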

--
Szilárd


>
> NOTE: Detection of GPUs failed. The API reported:
>   unknown error
>   GROMACS cannot run tasks on a GPU.
>
> Does anyone have any suggestions?
> Thanks in advance
> Stefano
>
>
>
> --
> Stefano GUGLIELMO PhD
> Assistant Professor of Medicinal Chemistry
> Department of Drug Science and Technology
> Via P. Giuria 9
> 10125 Turin, ITALY
> ph. +39 (0)11 6707178
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Xeon Gold + RTX 5000

2019-07-16 Thread Szilárd Páll
Hi Alex,

On Mon, Jul 15, 2019 at 8:53 PM Alex  wrote:
>
> Hi all and especially Szilard!
>
> My glorious management asked me to post this here. One of our group
> members, an ex-NAMD guy, wants to use Gromacs for biophysics and the
> following basics have been spec'ed for him:
>
> CPU: Xeon Gold 6244
> GPU: RTX 5000 or 6000
>
> I'll be surprised if he runs systems with more than 50K particles. Could
> you please comment on whether this is a cost-efficient and reasonably
> powerful setup? Your past suggestions have been invaluable for us.

That will be reasonably fast, but cost efficiency will be awful, to be honest:
- that CPU is a ~$3000 part and won't perform much better than a
$400-500 desktop CPU like an i9 9900, let alone a Ryzen 3900X, which
would be significantly faster.
- Quadro cards are also pretty low in bang for buck: a 2080 Ti will be
close to the RTX 6000 for ~5x less, and the 2080 or 2070 Super will be a bit
slower for at least another 1.5x less.

Single run at a time or possibly multiple? The proposed (or any 8+
core) workstation CPU is fast enough in the majority of the
simulations to pair well with two of those GPUs if used for two
concurrent simulations. If that's a relevant use-case, I'd recommend
two 2070 Super or 2080 cards.
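
As a rough sketch of what such a two-simulation setup could look like with
two GPUs and a 16-thread CPU (adjust thread counts and file names to your
case):

$ gmx mdrun -deffnm run0 -gpu_id 0 -ntmpi 1 -ntomp 8 -pin on -pinoffset 0 -pinstride 1 &
$ gmx mdrun -deffnm run1 -gpu_id 1 -ntmpi 1 -ntomp 8 -pin on -pinoffset 8 -pinstride 1 &

The -pinoffset/-pinstride options keep the two runs on separate sets of cores
so they do not interfere with each other.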

Cheers,
--
Szilárd


> Thank you,
>
> Alex
> --
> Gromacs Users mailing list
>
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
> mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] GPU support on macOS 10.14

2019-07-15 Thread Szilárd Páll
PS: have you tried mixed mode PME (-pmefft)? That could avoid the
Apple OpenCL with clFFT issue.
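
I.e. something along these lines (untested on macOS, so treat it as a sketch):

$ gmx mdrun -deffnm md -nb gpu -pme gpu -pmefft cpu

which keeps the PME spread/gather work on the GPU but does the 3D FFTs on the
CPU, so clFFT is never invoked.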
--
Szilárd

On Mon, Jul 15, 2019 at 4:39 PM Szilárd Páll  wrote:
>
> Hi,
>
> Thanks for the detailed report. Unfortunately, it seems that there is
> indeed an Apple OpenCL compiler issue with clFFT as you observe, but I
> am not convinced this is limited to AMD GPUs. I do not believe you are
> getting PME offload with the Intel iGPU: PME offload support is
> disabled on Intel and NVIDIA (because of limitations of the initial
> PME OpenCL implementation that, due to no performance benefit we did
> not consider re-enabling in the 2019 release).
>
> In my experience, even after re-enabling PME OpenCL on the the Intel
> CPU+iGPU setup, PME on the CPU and only the nonbondeds on the GPU is
> faster; note that this is a power-limited CPU, the fixed power budget
> is split between CPU cores and iGPU.
>
> Cheers,
> --
> Szilárd
>
> On Thu, Jul 11, 2019 at 11:46 AM Falk Hoffmann  wrote:
> >
> > Hi,
> >
> > 13-inch MacBook Pros run with an Intel Iris GPU and I wanted to point out 
> > that I did not test it with this GPU. Sorry for the missunderstanding.
> >
> > Falk
> >
> >
> > Gesendet: Mittwoch, 03. Juli 2019 um 17:05 Uhr
> > Von: "Michael Williams" 
> > An: gmx-us...@gromacs.org
> > Betreff: Re: [gmx-users] GPU support on macOS 10.14
> > Hi Falk,
> >
> > I actually compiled Gromacs with openCL GPU support on my MacBook Pro 
> > (2017) a couple months ago (with some help from others on this mailing 
> > list). I’ve pasted below the cmake settings that worked. You might be able 
> > to find more details searching the list for that thread... it took me a 
> > little while to get it working. Definitely worthwhile though. Have a good 
> > one,
> >
> > Mike
> >
> > cmake .. 
> > -DCMAKE_INSTALL_PREFIX=/Users/michael/.local/apps/gromacs-2018.5-apple-clang-omp-ocl
> >  -DCMAKE_LIBRARY_PATH=/Users/michael/.local/lib 
> > -DCMAKE_INCLUDE_PATH=/Users/michael/.local/include 
> > -DCMAKE_C_COMPILER=/usr/bin/clang -DCMAKE_CXX_COMPILER=/usr/bin/clang++ 
> > -DCMAKE_C_FLAGS="-Xpreprocessor -fopenmp -lomp -L/Users/michael/.local/lib 
> > -I/Users/michael/.local/include" -DCMAKE_CXX_FLAGS="-Xpreprocessor -fopenmp 
> > -lomp -L/Users/michael/.local/lib -I/Users/michael/.local/include" 
> > -DGMX_FFT_LIBRARY=fftw3 -DGMX_GPU=ON -DGMX_USE_OPENCL=ON
> >
> > My $HOME/.local directory is the prefix I used to install appropriate 
> > versions of hwloc (1.11.12), libomp (7.0.1), and fftw3 (3.3.8) that I 
> > compiled with the system’s default clang (in OSX 10.14.3, Mojave).
> >
> > > On Jul 3, 2019, at 8:41 AM, Mark Abraham  wrote:
> > >
> > > Hi,
> > >
> > >> On Wed., 3 Jul. 2019, 15:47 Falk Hoffmann,  wrote:
> > >>
> > >> Hi!
> > >>
> > >> I have a question regarding the GPU support of GROMACS for MacOS. The
> > >> newest MacBooks (after 2015) are equipped with AMD GPUs which means that
> > >> GROMACS cannot be compiled with CUDA. GROMACS could be compiled with 
> > >> OpenCL
> > >> until MacOS 10.13. But this is not possible for MacBooks which have been
> > >> updated to Mojave (10.14) because Apple deprecated OpenCL (and OpenGL) in
> > >> this OS in favor of Metal2.
> > >>
> > >
> > > Indeed Apple deprecated such support, but it is not yet removed and AFAIK
> > > is still viable for running codes that require them. Apple's own website
> > > declares many models that support OpenCL, but does not note whether there
> > > are OS version constraints... I assume that 10.14 and up don't install 
> > > such
> > > components by default any more, but I haven't heard that it is impossible
> > > to do so... Can anyone confirm or deny?
> > >
> > >> Regarding
> > >
> > >> the improved performance of GPUs especially for nonbonded interactions of
> > >> biomolecules in bigger systems it would be very useful to use the power 
> > >> of
> > >> the AMD GPUs in the future. Is there any plan to provide support for 
> > >> Metal2
> > >> in new releases of GROMACS in the near future?
> > >>
> > >
> > > It would be wildly unlikely for GROMACS to support a second proprietary
> > > language, and particularly one that is not used in the HPC sector. I hope 
> > > a
> > > way forward will emerge, but all we know now is that it won't be Metal2 
>

Re: [gmx-users] GPU support on macOS 10.14

2019-07-15 Thread Szilárd Páll
Hi,

Thanks for the detailed report. Unfortunately, it seems that there is
indeed an Apple OpenCL compiler issue with clFFT as you observe, but I
am not convinced this is limited to AMD GPUs. I do not believe you are
getting PME offload with the Intel iGPU: PME offload support is
disabled on Intel and NVIDIA (because of limitations of the initial
PME OpenCL implementation, which, given that there was no performance
benefit, we did not consider re-enabling in the 2019 release).

In my experience, even after re-enabling PME OpenCL on the Intel
CPU+iGPU setup, PME on the CPU with only the nonbondeds on the GPU is
faster; note that this is a power-limited CPU, so the fixed power budget
is split between the CPU cores and the iGPU.

Cheers,
--
Szilárd

On Thu, Jul 11, 2019 at 11:46 AM Falk Hoffmann  wrote:
>
> Hi,
>
> 13-inch MacBook Pros run with an Intel Iris GPU and I wanted to point out 
> that I did not test it with this GPU. Sorry for the missunderstanding.
>
> Falk
>
>
> Gesendet: Mittwoch, 03. Juli 2019 um 17:05 Uhr
> Von: "Michael Williams" 
> An: gmx-us...@gromacs.org
> Betreff: Re: [gmx-users] GPU support on macOS 10.14
> Hi Falk,
>
> I actually compiled Gromacs with openCL GPU support on my MacBook Pro (2017) 
> a couple months ago (with some help from others on this mailing list). I’ve 
> pasted below the cmake settings that worked. You might be able to find more 
> details searching the list for that thread... it took me a little while to 
> get it working. Definitely worthwhile though. Have a good one,
>
> Mike
>
> cmake .. 
> -DCMAKE_INSTALL_PREFIX=/Users/michael/.local/apps/gromacs-2018.5-apple-clang-omp-ocl
>  -DCMAKE_LIBRARY_PATH=/Users/michael/.local/lib 
> -DCMAKE_INCLUDE_PATH=/Users/michael/.local/include 
> -DCMAKE_C_COMPILER=/usr/bin/clang -DCMAKE_CXX_COMPILER=/usr/bin/clang++ 
> -DCMAKE_C_FLAGS="-Xpreprocessor -fopenmp -lomp -L/Users/michael/.local/lib 
> -I/Users/michael/.local/include" -DCMAKE_CXX_FLAGS="-Xpreprocessor -fopenmp 
> -lomp -L/Users/michael/.local/lib -I/Users/michael/.local/include" 
> -DGMX_FFT_LIBRARY=fftw3 -DGMX_GPU=ON -DGMX_USE_OPENCL=ON
>
> My $HOME/.local directory is the prefix I used to install appropriate 
> versions of hwloc (1.11.12), libomp (7.0.1), and fftw3 (3.3.8) that I 
> compiled with the system’s default clang (in OSX 10.14.3, Mojave).
>
> > On Jul 3, 2019, at 8:41 AM, Mark Abraham  wrote:
> >
> > Hi,
> >
> >> On Wed., 3 Jul. 2019, 15:47 Falk Hoffmann,  wrote:
> >>
> >> Hi!
> >>
> >> I have a question regarding the GPU support of GROMACS for MacOS. The
> >> newest MacBooks (after 2015) are equipped with AMD GPUs which means that
> >> GROMACS cannot be compiled with CUDA. GROMACS could be compiled with OpenCL
> >> until MacOS 10.13. But this is not possible for MacBooks which have been
> >> updated to Mojave (10.14) because Apple deprecated OpenCL (and OpenGL) in
> >> this OS in favor of Metal2.
> >>
> >
> > Indeed Apple deprecated such support, but it is not yet removed and AFAIK
> > is still viable for running codes that require them. Apple's own website
> > declares many models that support OpenCL, but does not note whether there
> > are OS version constraints... I assume that 10.14 and up don't install such
> > components by default any more, but I haven't heard that it is impossible
> > to do so... Can anyone confirm or deny?
> >
> >> Regarding
> >
> >> the improved performance of GPUs especially for nonbonded interactions of
> >> biomolecules in bigger systems it would be very useful to use the power of
> >> the AMD GPUs in the future. Is there any plan to provide support for Metal2
> >> in new releases of GROMACS in the near future?
> >>
> >
> > It would be wildly unlikely for GROMACS to support a second proprietary
> > language, and particularly one that is not used in the HPC sector. I hope a
> > way forward will emerge, but all we know now is that it won't be Metal2 :-)
> >
> > Mark
> >
> > Or is there another way to compile GROMACS with GPU support under
> >> MacOS>=10.14? I tried it with OpenCL, but of course it did not work. I
> >> could perfectly compile it without GPU support and run it on CPUs only, but
> >> this is not my intention.
> >>
> >> Kind regards,
> >> Falk
> >>
> >> BTW: If this email is better placed in the GROMACS developers list, please
> >> move it there.
> >> --
> >> Gromacs Users mailing list
> >>
> >> * Please search the archive at
> >> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> >> posting!
> >>
> >> * Can't post? Read 
> >> http://www.gromacs.org/Support/Mailing_Lists
> >>
> >> * For (un)subscribe requests visit
> >> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> >> send a mail to gmx-users-requ...@gromacs.org.
> > --
> > Gromacs Users mailing list
> >
> > * Please search the archive at 
> > 

Re: [gmx-users] Install on Windows 10 with AMD GPU

2019-07-09 Thread Szilárd Páll
Hi James,

On Mon, Jul 8, 2019 at 10:57 AM James Burchfield <
james.burchfi...@sydney.edu.au> wrote:

> Thankyou Szilárd,
> Headers are available here https://github.com/KhronosGroup/OpenCL-Headers
> But I get
> CMake Error at cmake/gmxManageOpenCL.cmake:45 (message):
>   OpenCL is not supported.  OpenCL version 1.2 or newer is required.
> Call Stack (most recent call first):
>   CMakeLists.txt:236 (include)
>
> I am setting
> OpenCL_include_DIR to C:/Users/Admin/ OpenCL-Headers-master/CL
>

That path should not include "CL" (the header is expected to be included as
CL/cl.h).

Let me know if that helps.
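
I.e. something like this (using the download location from your mail; I have
not tested this on Windows, so adjust as needed -- the ^ is just cmd-style
line continuation):

cmake .. -DGMX_GPU=ON -DGMX_USE_OPENCL=ON ^
  -DOpenCL_INCLUDE_DIR=C:/Users/Admin/OpenCL-Headers-master ^
  -DOpenCL_LIBRARY=C:/Windows/System32/OpenCL.dll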

--
Szilárd


> OpenCL_INCLUDE_DIR OpenCL_Library to C:/Windows/System32/OpenCL.dll
>
>
> The error file includes
>   Microsoft (R) C/C++ Optimizing Compiler Version 19.21.27702.2 for x64
>
>   CheckSymbolExists.c
>
>   Copyright (C) Microsoft Corporation.  All rights reserved.
>
>   cl /c /Zi /W3 /WX- /diagnostics:column /Od /Ob0 /D WIN32 /D _WINDOWS /D
> "CMAKE_INTDIR=\"Debug\"" /D _MBCS /Gm- /RTC1 /MDd /GS /fp:precise
> /Zc:wchar_t /Zc:forScope /Zc:inline /Fo"cmTC_2c430.dir\Debug\\"
> /Fd"cmTC_2c430.dir\Debug\vc142.pdb" /Gd /TC /errorReport:queue "C:\Program
> Files\gromacs\CMakeFiles\CMakeTmp\CheckSymbolExists.c"
>
> C:\Program Files\gromacs\CMakeFiles\CMakeTmp\CheckSymbolExists.c(2,10):
> error C1083:  Cannot open include file:
> 'OpenCL_INCLUDE_DIR-NOTFOUND/CL/cl.h': No such file or directory
> [C:\Program Files\gromacs\CMakeFiles\CMakeTmp\cmTC_2c430.vcxproj]
>
>
> File C:/Program Files/gromacs/CMakeFiles/CMakeTmp/CheckSymbolExists.c:
> /* */
> #include <CL/cl.h>
>
> int main(int argc, char** argv)
> {
>   (void)argv;
> #ifndef CL_VERSION_1_0
>   return ((int*)(&CL_VERSION_1_0))[argc];
> #else
>   (void)argc;
>   return 0;
> #endif
> }
>
>
> Guessing it is time to give up
>
> Cheers
> James
>
>
>
>
> -Original Message-
> From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se <
> gromacs.org_gmx-users-boun...@maillist.sys.kth.se> On Behalf Of Szilárd
> Páll
> Sent: Friday, 5 July 2019 10:20 PM
> To: Discussion list for GROMACS users 
> Cc: gromacs.org_gmx-users@maillist.sys.kth.se
> Subject: Re: [gmx-users] Install on Windows 10 with AMD GPU
>
> Dear James,
>
> Unfortunately, we have very little experience with OpenCL on Windows, so I
> am afraid I can not advise you on specifics. However, note that the only
> part of the former SDK that is needed is the OpenCL headers and loader
> libraries (libOpenCL) which is open source software that can be obtained
> from the standards body, KHronos. Not sure what the mechanism is for
> Windows, but for Linux these components are in packaged in the standard
> repositories of most distributions.
>
> However, before going through a large effort of trying to get GROMACS
> running on Windows + AMD + OpenCL, you might want to consider evaluating
> the potential benefits of the hardware. As these cards are quite dated you
> might find that they do not provide enough performance benefit to warrant
> the effort required -- especially as, if you have a workstation with
> significant CPU resources, you might find that GROMACS runs nearly as fast
> or faster on the CPU only (that's because we have very efficient CPU SIMD
> code for all compute-intensive work).
>
> To do a hopefully easier quick performance evaluation, you could simply
> boot a Linux distribution off of an external disk, you can find Linux
> drivers for them for Ubuntu 16.04/18.04 at least which you can install and
> see how well does the system perform.
>
> I hope that helps!
>
> Cheers,
> --
> Szilárd
>
>
> On Fri, Jul 5, 2019 at 9:11 AM James Burchfield <
> james.burchfi...@sydney.edu.au> wrote:
>
> > Hi there,
> >
> > I was hoping to install  gromacs on a windows10 system that runs 2 AMD
> > Firepro cards.
> > I have managed to achieve almost everything in terms of setting up the
> > compilation with the exception of OpenCl.
> >
> > The issue I have run into are in reference to the settings of
> >
> > CMAKE_PREFIX_PATH  (unsure if I need to put anything here)
> > OpenCL_INCLUDE_DIR OpenCL_Library
> >
> > The cards I am running are~ 5years old Firepro W7100 OpenCL version
> > they are running is 2.0
> >
> >
> > AMD no longer makes the OpenCL SDK
> > Apparrently, most of the relevant stuff is now included with the
> > drivers and headers can be  downloaded from GitHub
> >
> > I have tried
> >
> >   *   Installing the old SDK
> >   *   Installing the newer "light" version of the S

Re: [gmx-users] Install on Windows 10 with AMD GPU

2019-07-05 Thread Szilárd Páll
Dear James,

Unfortunately, we have very little experience with OpenCL on Windows, so I
am afraid I cannot advise you on specifics. However, note that the only
parts of the former SDK that are needed are the OpenCL headers and the loader
library (libOpenCL), which are open-source software that can be obtained
from the standards body, Khronos. I am not sure what the mechanism is for
Windows, but on Linux these components are packaged in the standard
repositories of most distributions.

However, before going through a large effort of trying to get GROMACS
running on Windows + AMD + OpenCL, you might want to consider evaluating
the potential benefits of the hardware. As these cards are quite dated you
might find that they do not provide enough performance benefit to warrant
the effort required -- especially as, if you have a workstation with
significant CPU resources, you might find that GROMACS runs nearly as fast
or faster on the CPU only (that's because we have very efficient CPU SIMD
code for all compute-intensive work).

For a hopefully easier quick performance evaluation, you could simply
boot a Linux distribution off an external disk; Linux drivers for these
cards are available for Ubuntu 16.04/18.04 at least, which you can install
to see how well the system performs.

I hope that helps!

Cheers,
--
Szilárd


On Fri, Jul 5, 2019 at 9:11 AM James Burchfield <
james.burchfi...@sydney.edu.au> wrote:

> Hi there,
>
> I was hoping to install  gromacs on a windows10 system that runs 2 AMD
> Firepro cards.
> I have managed to achieve almost everything in terms of setting up the
> compilation with the exception of OpenCl.
>
> The issue I have run into are in reference to the settings of
>
> CMAKE_PREFIX_PATH  (unsure if I need to put anything here)
> OpenCL_INCLUDE_DIR
> OpenCL_Library
>
> The cards I am running are~ 5years old Firepro W7100
> OpenCL version they are running is 2.0
>
>
> AMD no longer makes the OpenCL SDK
> Apparrently, most of the relevant stuff is now included with the drivers
> and headers can be  downloaded from GitHub
>
> I have tried
>
>   *   Installing the old SDK
>   *   Installing the newer "light" version of the SDK
>   *   Downloading the headers
>
> Whatever the case
> I cannot get it to work .
> I get an error saying that the minimum requirement is openCl 1.2.
> According to the AMD driver the cards are running opencl 2.0
>
> Any help would be appreciated
>
> Cheers,
> James
>
>
> 
> Dr James Burchfield
> Group Leader - Molecular Imaging
> Metabolic Cybernetics Laboratory | School of Life and Environmental
> Sciences
> D17 - Charles Perkins Centre | The University of Sydney | NSW | 2006
> email: james.burchfi...@sydney.edu.au
> phone: +61 (0) 403 977 448
> web: http://sydney.edu.au/perkins/research/groups/david-james-lab.shtml
> 
>
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] using GPU acceleration in gromacs

2019-05-27 Thread Szilárd Páll
The log file output will show whether the device is supported. In general,
all GPUs of compute capability >= 2.0 (codename "Fermi" and newer) are
supported; compute capability 2.0 devices were deprecated in the 2019 release
(and have not been supported by CUDA since mid-2017 either).
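
Once you have a GPU-enabled build, you can see what mdrun detects directly in
the log header, e.g.:

$ grep -A3 "GPU info" md.log

Each detected device is listed there as "#0: <name>, compute cap.: X.Y, ...,
stat: compatible" (or incompatible), which tells you whether mdrun can use it.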

--
Szilárd


On Fri, May 24, 2019 at 2:37 PM Pragati Sharma 
wrote:

> Thanks. Is there any way to check which release of GROMACS supports which
> architecture of  NVIDIA GPUs.
>
> On Fri, May 24, 2019 at 4:48 PM Szilárd Páll 
> wrote:
>
> > Note that if it is indeed a Quadro 5000 (not P5000, M5000, or K5000) you
> > will need an older GROMACS release as the architecture of that GPUs has
> > been deprecated.
> >
> > --
> > Szilárd
> >
> >
> > On Fri, May 24, 2019 at 8:59 AM Pragati Sharma 
> > wrote:
> >
> > > Hello users,
> > >
> > > I am trying to install gromacs-2019 on a HP workstation containing
> > > NVIDIA-quadro 5000 GPU card.
> > >
> > > I installed gromacs using ‘sudo apt-get install gromacs’ on opensuse.
> > After
> > > running a polymer simulation, I checked the log file and it is showing
> > > GPU-disabled in Gromacs properties. I need to know, if it is not using
> > GPU
> > > because of the quick installation, or there can be other reasons.
> Should
> > I
> > > manually reinstall gromacs using cmake with ‘-DGMX GPU=ON’. OR Are
> there
> > > other things that can be checked or done to make the gromacs use GPU
> > > acceleration.  Any help would be appreciated.
> > > --
> > > Gromacs Users mailing list
> > >
> > > * Please search the archive at
> > > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > > posting!
> > >
> > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > >
> > > * For (un)subscribe requests visit
> > > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > > send a mail to gmx-users-requ...@gromacs.org.
> > --
> > Gromacs Users mailing list
> >
> > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > posting!
> >
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >
> > * For (un)subscribe requests visit
> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > send a mail to gmx-users-requ...@gromacs.org.
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] using GPU acceleration in gromacs

2019-05-24 Thread Szilárd Páll
Note that if it is indeed a Quadro 5000 (not a P5000, M5000, or K5000) you
will need an older GROMACS release, as the architecture of that GPU has
been deprecated.

--
Szilárd


On Fri, May 24, 2019 at 8:59 AM Pragati Sharma 
wrote:

> Hello users,
>
> I am trying to install gromacs-2019 on a HP workstation containing
> NVIDIA-quadro 5000 GPU card.
>
> I installed gromacs using ‘sudo apt-get install gromacs’ on opensuse. After
> running a polymer simulation, I checked the log file and it is showing
> GPU-disabled in Gromacs properties. I need to know, if it is not using GPU
> because of the quick installation, or there can be other reasons. Should I
> manually reinstall gromacs using cmake with ‘-DGMX GPU=ON’. OR Are there
> other things that can be checked or done to make the gromacs use GPU
> acceleration.  Any help would be appreciated.
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] implicit water simulation, Gromacs5.x, point decomposition

2019-05-21 Thread Szilárd Páll
As far as I recall, at most 2 ranks were supported; I suggest using OpenMP instead.
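
E.g. something like this (a sketch; how well it scales will depend on the
kernels in use, so adjust the thread count to your machine):

$ gmx mdrun -deffnm md -v -ntmpi 1 -ntomp 8

i.e. a single (thread-)MPI rank, which avoids the domain decomposition error,
with the parallelism coming from OpenMP threads instead.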
--
Szilárd


On Sat, May 11, 2019 at 3:11 PM Halima Mouhib  wrote:

>  Hi,
> I have a question on how to run implicit water simulations using the
> Gromacs5.x series.
> Unfortunately, there is a problem with the domain decomposition when using
> the command:  gmx mdrun -deffnm md -v
>
> #
> Fatal error:
> Domain decomposition does not support simple neighbor searching, use grid
> searching or run with one MPI rank
> #
>
> and it works fine with one MPI rank ( gmx mdrun -deffnm md -v -nt 1 ), but
> I need it to run it in parallel otherwise it is too slow.
> In the previous Gromacs4.x versions, this could simply be solved using the
> point decomposition method (mdrun -pd).
>
>   How this has been replaced in Gromacs5.x?
>
> Thanks a lot in advance!Lima
>
>
>
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] NTMPI / NTOMP combination: 10 threads not "reasonable" for GROMACS?

2019-05-10 Thread Szilárd Páll
That is just a hint; if you have measured and you are getting better
performance with 10 threads, use that setting. (Also note that the message
suggests "4 to 6" rather than "4 or 6" threads.)

Do you also get the same note with the 2019 release?

--
Szilárd


On Fri, May 10, 2019 at 6:27 PM Téletchéa Stéphane <
stephane.teletc...@univ-nantes.fr> wrote:

> Le 20/03/2019 à 22:42, Stéphane Téletchéa a écrit :
> > Dear all,
> >
>
> > Those CPUs are 10 cores +HT (so not 4 or 6). Is it only a warning ?
>
> Dear all,
>
> Any answer from core developpers on this, should a file a bug?
>
> Best,
>
> Stéphane
>
> --
> Assistant Professor in BioInformatics, UFIP, UMR 6286 CNRS, Team Protein
> Design In Silico
> UFR Sciences et Techniques, 2, rue de la Houssinière, Bât. 25, 44322
> Nantes cedex 03, France
> Tél : +33 251 125 636 / Fax : +33 251 125 632
> http://www.ufip.univ-nantes.fr/ - http://www.steletch.org
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Gromacs 2019.2 on Power9 + Volta GPUs (building and running)

2019-05-09 Thread Szilárd Páll
On Thu, May 9, 2019 at 10:01 PM Alex  wrote:

> Okay, we're positively unable to run a Gromacs (2019.1) test on Power9. The
> test procedure is simple, using slurm:
> 1. Request an interactive session: > srun -N 1 -n 20 --pty
> --partition=debug --time=1:00:00 --gres=gpu:1 bash
> 2. Load CUDA library: module load cuda
> 3. Run test batch. This starts with a CPU-only static EM, which, despite
> the mdrun variables, runs on a single thread. Any help will be highly
> appreciated.
>
>  md.log below:
>
> GROMACS:  gmx mdrun, version 2019.1
> Executable:   /home/reida/ppc64le/stow/gromacs/bin/gmx
> Data prefix:  /home/reida/ppc64le/stow/gromacs
> Working dir:  /home/smolyan/gmx_test1
> Process ID:   115831
> Command line:
>   gmx mdrun -pin on -pinstride 2 -ntomp 4 -ntmpi 4 -pme cpu -nb cpu -s
> em.tpr -o traj.trr -g md.log -c after_em.pdb
>
> GROMACS version:2019.1
> Precision:  single
> Memory model:   64 bit
> MPI library:thread_mpi
> OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
> GPU support:CUDA
> SIMD instructions:  IBM_VSX
> FFT library:fftw-3.3.8
> RDTSCP usage:   disabled
> TNG support:enabled
> Hwloc support:  hwloc-1.11.8
> Tracing support:disabled
> C compiler: /opt/rh/devtoolset-7/root/usr/bin/cc GNU 7.3.1
> C compiler flags:   -mcpu=power9 -mtune=power9  -mvsx -O2 -DNDEBUG
> -funroll-all-loops -fexcess-precision=fast
> C++ compiler:   /opt/rh/devtoolset-7/root/usr/bin/c++ GNU 7.3.1
> C++ compiler flags: -mcpu=power9 -mtune=power9  -mvsx-std=c++11   -O2
> -DNDEBUG -funroll-all-loops -fexcess-precision=fast
> CUDA compiler:  /usr/local/cuda-10.0/bin/nvcc nvcc: NVIDIA (R) Cuda
> compiler driver;Copyright (c) 2005-2018 NVIDIA Corporation;Built on
> Sat_Aug_25_21:10:00_CDT_2018;Cuda compilation tools, release 10.0,
> V10.0.130
> CUDA compiler
>
> flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;;
>
> -mcpu=power9;-mtune=power9;-mvsx;-std=c++11;-O2;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
> CUDA driver:10.10
> CUDA runtime:   10.0
>
>
> Running on 1 node with total 160 cores, 160 logical cores, 1 compatible GPU
> Hardware detected:
>   CPU info:
> Vendor: IBM
> Brand:  POWER9, altivec supported
> Family: 0   Model: 0   Stepping: 0
> Features: vmx vsx
>   Hardware topology: Only logical processor count
>   GPU info:
> Number of GPUs detected: 1
> #0: NVIDIA Tesla V100-SXM2-16GB, compute cap.: 7.0, ECC: yes, stat:
> compatible
>
>
>  PLEASE READ AND CITE THE FOLLOWING REFERENCE 
>
> *SKIPPED*
>
> Input Parameters:
>integrator = steep
>tinit  = 0
>dt = 0.001
>nsteps = 5
>init-step  = 0
>simulation-part= 1
>comm-mode  = Linear
>nstcomm= 100
>bd-fric= 0
>ld-seed= 1941752878
>emtol  = 100
>emstep = 0.01
>niter  = 20
>fcstep = 0
>nstcgsteep = 1000
>nbfgscorr  = 10
>rtpi   = 0.05
>nstxout= 0
>nstvout= 0
>nstfout= 0
>nstlog = 1000
>nstcalcenergy  = 100
>nstenergy  = 1000
>nstxout-compressed = 0
>compressed-x-precision = 1000
>cutoff-scheme  = Verlet
>nstlist= 1
>ns-type= Grid
>pbc= xyz
>periodic-molecules = true
>verlet-buffer-tolerance= 0.005
>rlist  = 1.2
>coulombtype= PME
>coulomb-modifier   = Potential-shift
>rcoulomb-switch= 0
>rcoulomb   = 1.2
>epsilon-r  = 1
>epsilon-rf = inf
>vdw-type   = Cut-off
>vdw-modifier   = Potential-shift
>rvdw-switch= 0
>rvdw   = 1.2
>DispCorr   = No
>table-extension= 1
>fourierspacing = 0.12
>fourier-nx = 52
>fourier-ny = 52
>fourier-nz = 52
>pme-order  = 4
> 

Re: [gmx-users] gmx mdrun with gpu

2019-05-06 Thread Szilárd Páll
Share a log file please so we can see the hardware detected, command line
options, etc.
--
Szilárd


On Sun, May 5, 2019 at 3:53 AM Maryam  wrote:

> Hello Reza
> Yes I complied it with GPU and the version of CUDA is 9.1. Any suggestions?
> Thanks.
>
> On Sat., May 4, 2019, 1:45 a.m. Reza Esmaeeli, 
> wrote:
>
> > Hello Maryam,
> > Have you compiled the gromacs 2019 with GPU?
> > What version of CUDA do you have?
> >
> > - Reza
> >
> > On Saturday, May 4, 2019, Maryam  wrote:
> >
> > > Dear all,
> > > I want to run a simulation in gromacs 2019 on a system with 1 gpu and
> 32
> > > threads. I write this command: gmx mdrun -s md.tpr -v -nb gpu but it
> > seems
> > > it does not recognize gpus and it takes long for the simulation to
> reach
> > > its end (-ntmpi ntomp and nt seem not working either). In gromacs 2016
> > with
> > > 2 gpus, I use gmx_mpi -s md.tpr -v -gpu_id 1 -nb gpu -ntomp 16 -pin on
> > > -tunepme and it works fine, but the same command regardless of (gpu_id)
> > > does not work in gromacs 2019. What flags should I use to get the best
> > > performance of the simulation?
> > > Thank you.
> > > --
> > > Gromacs Users mailing list
> > >
> > > * Please search the archive at http://www.gromacs.org/
> > > Support/Mailing_Lists/GMX-Users_List before posting!
> > >
> > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > >
> > > * For (un)subscribe requests visit
> > > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > > send a mail to gmx-users-requ...@gromacs.org.
> > >
> > --
> > Gromacs Users mailing list
> >
> > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > posting!
> >
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >
> > * For (un)subscribe requests visit
> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > send a mail to gmx-users-requ...@gromacs.org.
> >
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Gromacs 2019.2 on Power9 + Volta GPUs (building and running)

2019-05-02 Thread Szilárd Páll
Power9 (for HPC) is 4-way SMT, so make sure to try 1, 2, and 4 threads per
core (stride 4, 2, and 1, respectively). Especially if you are offloading
all force computation to the GPU, what remains on the CPU may not be able
to benefit from more than 1-2 threads per core.
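
Concretely, on e.g. 10 physical cores of your slurm allocation that could look
like this (a sketch; fill in your own input file and adjust -ntomp to what the
job provides):

$ gmx mdrun -s topol.tpr -ntmpi 1 -ntomp 10 -pin on -pinstride 4   # 1 thread per core
$ gmx mdrun -s topol.tpr -ntmpi 1 -ntomp 20 -pin on -pinstride 2   # 2 threads per core
$ gmx mdrun -s topol.tpr -ntmpi 1 -ntomp 40 -pin on -pinstride 1   # 4 threads per core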


--
Szilárd

On Thu, May 2, 2019, 01:19 Alex  wrote:

> Well, unless something important has changed within a year, I distinctly
> remember being advised here not to offload anything to GPU for EM. Not
> that we ever needed to, to be honest...
>
> In any case, we appear to be dealing with build issues here.
>
> Alex
>
> On 5/1/2019 5:09 PM, Kevin Boyd wrote:
> > Hi,
> >
> >> Of course, i am not. This is the EM. ;)
> > I haven't looked back at the code, but IIRC EM can use GPUs for the
> > nonbondeds, just not the PME. I just double-checked on one of my systems
> > with 10 cores and a GTX 1080 Ti, offloading to the GPU more than doubled
> > the minimization speed.
> >
> > Kevin
> >
> > On Wed, May 1, 2019 at 6:33 PM Alex  wrote:
> >
> >> Of course, i am not. This is the EM. ;)
> >>
> >> On Wed, May 1, 2019, 4:30 PM Kevin Boyd  wrote:
> >>
> >>> Hi,
> >>>
> >>> In addition to what Mark said (and I've also found pinning to be
> critical
> >>> for performance), you're also not using the GPUs with "-pme cpu -nb
> cpu".
> >>>
> >>> Kevin
> >>>
> >>> On Wed, May 1, 2019 at 5:56 PM Alex  wrote:
> >>>
>  Well, my experience so far has been with the EM, because the rest of
> >> the
>  script (with all the dynamic things) needed that to finish. And it
>  "finished" by hitting the wall. However, your comment does touch upon
> >>> what
>  to do with thread pinning and I will try to set '-pin on' throughout
> to
> >>> see
>  if things make a difference for the better. I am less confident about
>  setting strides because it is unclear what the job manager provides in
>  terms of the available core numbers. I will play around some more and
>  report here.
> 
>  Thanks!
> 
>  Alex
> 
>  On Wed, May 1, 2019 at 3:49 PM Mark Abraham  >
>  wrote:
> 
> > Hi,
> >
> > As with x86, GROMACS uses SIMD intrinsics on POWER9 and is thus
> >> fairly
> > insensitive to the compiler's vectorisation abilities. GCC is the
> >> only
> > compiler we've tested, as xlc can't compile simple C++11. As
> >>> everywhere,
> > you should use the latest version of gcc, as IBM spent quite some
> >> years
> > landing improvements for POWER9.
> >
> > EM is useless as a performance indicator of a dynamical simulation,
> >>> avoid
> > that - it runs serial code much much more often.
> >
> > Your run deliberately didn't fill the available cores, so just like
> >> on
>  x86,
> > mdrun will leave the thread affinity handling to the environment,
> >> which
>  is
> > often a path to bad performance. So, if you plan on doing that often,
> > you'll want to check out the mdrun performance guide docs about the
> >>> mdrun
> > -pin and related options.
> >
> > Mark
> >
> >
> > On Wed., 1 May 2019, 23:21 Alex,  wrote:
> >
> >> Hi all,
> >>
> >> Our institution decided to be all fancy, so now we have a bunch of
>  Power9
> >> nodes, each with 80 cores + 4 Volta GPUs. Stuff is managed by
> >> slurm.
> > Today
> >> I did a simple EM ('gmx mdrun -ntomp 4 -ntmpi 4 -pme cpu -nb cpu')
> >>> and
> > the
> >> performance is abysmal, I would guess 100 times slower than on
> >>> anything
> >> I've ever seen before.
> >>
> >> Our admin person emailed me the following:
> >> "-- it would not surprise me if the GCC compilers were relatively
> >> bad
>  at
> >> taking advantage of POWER9 vectorization, they're likely optimized
> >>> for
> >> x86_64 vector stuff like SSE and AVX operations.  This was an issue
> >>> in
> > the
> >> build, I selected "-DGMX_SIMD=IBM_VSX" for the config, but
> >> according
> >>> to
> > my
> >> notes, that was part of an attempt to fix the "unimplemented SIMD"
>  error
> >> that was dogging me at first, and/but which was eventually cleared
> >> by
> >> switching to gcc-6."
> >>
> >> Does anyone have any comments/suggestions on building and running
> >> GMX
>  on
> >> Power9?
> >>
> >> Thank you,
> >>
> >> Alex
> >> --
> >> Gromacs Users mailing list
> >>
> >> * Please search the archive at
> >> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> >> posting!
> >>
> >> * Can't post? Read
> >>
> 

Re: [gmx-users] Failed tests, need help in troubleshooting

2019-05-01 Thread Szilárd Páll
Hi Cameron,

My strong suspicion is that the NVIDIA OpenCL driver/compiler simply does
not support Turing, or is buggy on it. I have just checked an OpenCL build
with the latest 418 drivers, and it also fails tests on Volta (which is
similar to the Turing architecture), but it passes on Pascal.

You could verify this by running make check such that only the second
Quadro GPU is utilized, e.g.
$ CUDA_VISIBLE_DEVICES=1 make check

Additionally, note that performance with OpenCL on NVIDIA is in general
significantly lower than with CUDA, both because NVIDIA's OpenCL support is
rather poor and because some features are not yet fully functional on NVIDIA
OpenCL (that do work on AMD). Hence, if performance matters, e.g. for
GPU-accelerated production runs, CUDA is admittedly the better option.
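
I.e. for a CUDA build you could reuse the cmake line you posted and simply
drop the OpenCL switch (a sketch, assuming a CUDA toolkit is installed):

cmake .. -DCMAKE_C_COMPILER=/opt/ohpc/pub/compiler/gcc/7.3.0/bin/gcc \
  -DCMAKE_CXX_COMPILER=/opt/ohpc/pub/compiler/gcc/7.3.0/bin/c++ \
  -DGMX_BUILD_OWN_FFTW=ON -DGMX_MPI=on -DGMX_GPU=on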

--
Szilárd


On Fri, Apr 26, 2019 at 9:47 AM Cameron Fletcher (CF) <
circumf...@disroot.org> wrote:

> Hi Szilárd,
>
> I am using a Intel Xeon W-2145,
> GPU: Nvidia RTX 2080TI and Nvidia P400
>
>
> cmake log:
>
> https://raw.githubusercontent.com/circumflex-cf/logs/master/cmake_2019-04-23.log
>
> make log:
>
> https://raw.githubusercontent.com/circumflex-cf/logs/master/make_2019-04-23.log
>
> make check logs:
>
> https://raw.githubusercontent.com/circumflex-cf/logs/master/makecheck_2019-04-23.log
>
> Also here is one the regression tests that failed.
> https://github.com/circumflex-cf/logs/tree/master/orientation-restraints
>
>
> --
> CF
>
> On 23/04/19 5:39 PM, Szilárd Páll wrote:
> > Hi Cameron,
> >
> > I meant any log file from a run with the hardware + software combination.
> > The log file contains hardware and software detection output that is
> useful
> > in identifying issues.
> >
> > Do the unit tests pass?
> >
> > --
> > Szilárd
> >
> >
> > On Tue, Apr 23, 2019 at 12:38 PM Cameron Fletcher (CF) <
> > circumf...@disroot.org> wrote:
> >
> >> Hello Szilárd,
> >>
> >> Do you mean log files created in each regression test?
> >>
> >> On 23/04/19 3:43 PM, Szilárd Páll wrote:
> >>> What is the hardware you are running this on? Can you share a log file,
> >>> please?
> >>> --
> >>> Szilárd
> >>>
> >>>
> >>> On Mon, Apr 22, 2019 at 9:24 AM Cameron Fletcher (CF) <
> >>> circumf...@disroot.org> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> I have installed gromacs 2019.1 on CentOS 7.6 .
> >>>> While running regressions tests 2019.1 certain tests are failing with
> >>>> errors.
> >>>>
> >>>> I have attached list of some failed tests.
> >>>>
> >>>>
> >>>> Since I am using gcc compilers and openmpi from openhpc repositories
> >>>> the below command was used for cmake.
> >>>>
> >>>> cmake ..
> >>>> -DCMAKE_C_COMPILER=/opt/ohpc/pub/compiler/gcc/7.3.0/bin/gcc
> >>>> -DCMAKE_CXX_COMPILER=/opt/ohpc/pub/compiler/gcc/7.3.0/bin/c++
> >>>> -DGMX_BUILD_OWN_FFTW=ON -DGMX_MPI=on -DGMX_GPU=on -DGMX_USE_OPENCL=on
> >>>>
> >>>> cmake version: 3.13.4
> >>>> gcc version: 7.3.0
> >>>> openmpi3 version: 3.1.0
> >>>>
> >>>>
> >>>> What should I be doing further for more troubleshooting.
> >>>>
> >>>>
> >>>> --
> >>>> CF
> >>>>
> >>>> --
> >>>> Gromacs Users mailing list
> >>>>
> >>>> * Please search the archive at
> >>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> >>>> posting!
> >>>>
> >>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >>>>
> >>>> * For (un)subscribe requests visit
> >>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> >>>> send a mail to gmx-users-requ...@gromacs.org.
> >>
> >>
> >> --
> >> CF
> >> --
> >> Gromacs Users mailing list
> >>
> >> * Please search the archive at
> >> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> >> posting!
> >>
> >> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >>
> >> * For (un)subscribe requests visit
> >> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> >> send a mail to gmx-users-requ...@gromacs.org.
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] 2019.2 build warnings

2019-05-01 Thread Szilárd Páll
Hi,

You can safely ignore the errors, as these are caused by properties of your
hardware that the test scripts do not deal with well enough -- though
admittedly, two of the three errors should have been avoided, along with a
message similar to this:
"Mdrun cannot use the requested (or automatic) number of OpenMP threads,
retrying with 8."
(which is what I get when I run the tests on a similar machine).

If you need the tests to pass on the node in question, let me know, I can
suggest workarounds.

Cheers,
--
Szilárd


On Mon, Apr 29, 2019 at 6:47 PM Alex  wrote:

> Hi Szilárd,
>
> Since I don't know which directory inside /complex corresponds to which
> tests (at least one of the tests that failed was #42), here's a tarball
> of the entire /complex directory per location you specified below:
>
>
> https://www.dropbox.com/s/44uluopkdan2417/regression_complex_2019.2.tar.gz?dl=0
>
> If you can help us figure this out, it will be great!
>
> Thanks,
>
> Alex
>
> On 4/29/2019 4:25 AM, Szilárd Páll wrote:
> > Hi,
> >
> > I assume you used -DREGRESSIONTEST_DOWNLOAD=ON case in which the tests
> are
> > downloaded and unpacked under
> > BUILD_TREE/tests/regressiontests-release-2019-[SUFFIX]/
> >
> > In that directory you find the usual regressiontests tree, from there
> under
> > complex/ you'll find the tests in question.
> >
> > Cheers,
> > --
> > Szilárd
> >
> >
> > On Fri, Apr 26, 2019 at 7:00 PM Alex  wrote:
> >
> >> Hi Szilárd,
> >>
> >> I am at a conference right now, but will do my best to upload the
> >> requested data first thing on Monday. In the meantime, could you please
> >> tell me where the stuff of interest would be located within the local
> >> gromacs build directory? I mean, I could make the entire directory a
> >> tarball, but not sure it's all that necessary. I don't remember which
> tests
> >> failed, unfortunately...
> >>
> >> Thank you!
> >>
> >> Alex
> >>
> >> On 4/25/2019 2:54 AM, Szilárd Páll wrote:
> >>> Hi Alex,
> >>>
> >>> On Wed, Apr 24, 2019 at 9:59 PM Alex  wrote:
> >>>
> >>>> Hi Szilárd,
> >>>>
> >>>> We are using neither Ubuntu 18.04, nor glibc 2.27, but the problem is
> >>>> most certainly there.
> >>> OK.
> >>>
> >>> Can you please post the content of the directories of tests that
> failed?
> >> It
> >>> would be useful to know the exact software configuration (reported in
> the
> >>> log) and the details of the errors (reported in the mdrun.out).
> >>>
> >>> Thanks,
> >>> --
> >>> Szilárd
> >>>
> >>>
> >>>
> >>>> Until the issue is solved one way or another, we
> >>>> will be staying with 2018.1, i guess.
> >>>>
> >>>> $ lsb_release -a
> >>>>
> >>>> No LSB modules are available.
> >>>>
> >>>> Distributor ID: Ubuntu
> >>>>
> >>>> Description:Ubuntu 16.04.6 LTS
> >>>>
> >>>> Release:16.04
> >>>>
> >>>> Codename:   xenial
> >>>>
> >>>> Ubuntu GLIBC 2.23-0ubuntu11
> >>>>
> >>>>
> >>>> On 4/24/2019 4:57 AM, Szilárd Páll wrote:
> >>>>> What OS are you using? There are some known issues with the Ubuntu
> >> 18.04
> >>>> +
> >>>>> glibc 2.27 which could explain the errors.
> >>>>> --
> >>>>> Szilárd
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>> --
> >>>> Gromacs Users mailing list
> >>>>
> >>>> * Please search the archive at
> >>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> >>>> posting!
> >>>>
> >>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >>>>
> >>>> * For (un)subscribe requests visit
> >>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> >>>> send a mail to gmx-users-requ...@gromacs.org.
> >> --
> >> Gromacs Users mailing list
> >>
> >> * Please search the archive at
> >> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> >> posting!
> >>
> >> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >>
> >> * For (un)subscribe requests visit
> >> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> >> send a mail to gmx-users-requ...@gromacs.org.
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] 2019.2 build warnings

2019-04-30 Thread Szilárd Páll
Thanks, will look at it shortly.

BTW, the regressiontests output indicates which test failed, e.g. from your
earlier email:

FAILED. Check mdrun.out, md.log file(s) in distance_restraints for
distance_restraints
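
So, as a sketch (assuming the default regressiontest download location I
described in my earlier mail; the distance_restraints name comes from the
output quoted above), the files of interest can be found like this:

cd BUILD_TREE/tests/regressiontests-release-2019-[SUFFIX]/complex/distance_restraints
less mdrun.out md.log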




--
Szilárd


On Mon, Apr 29, 2019 at 6:47 PM Alex  wrote:

> Hi Szilárd,
>
> Since I don't know which directory inside /complex corresponds to which
> tests (at least one of the tests that failed was #42), here's a tarball
> of the entire /complex directory per location you specified below:
>
>
> https://www.dropbox.com/s/44uluopkdan2417/regression_complex_2019.2.tar.gz?dl=0
>
> If you can help us figure this out, it will be great!
>
> Thanks,
>
> Alex
>
> On 4/29/2019 4:25 AM, Szilárd Páll wrote:
> > Hi,
> >
> > I assume you used -DREGRESSIONTEST_DOWNLOAD=ON case in which the tests
> are
> > downloaded and unpacked under
> > BUILD_TREE/tests/regressiontests-release-2019-[SUFFIX]/
> >
> > In that directory you find the usual regressiontests tree, from there
> under
> > complex/ you'll find the tests in question.
> >
> > Cheers,
> > --
> > Szilárd
> >
> >
> > On Fri, Apr 26, 2019 at 7:00 PM Alex  wrote:
> >
> >> Hi Szilárd,
> >>
> >> I am at a conference right now, but will do my best to upload the
> >> requested data first thing on Monday. In the meantime, could you please
> >> tell me where the stuff of interest would be located within the local
> >> gromacs build directory? I mean, I could make the entire directory a
> >> tarball, but not sure it's all that necessary. I don't remember which
> tests
> >> failed, unfortunately...
> >>
> >> Thank you!
> >>
> >> Alex
> >>
> >> On 4/25/2019 2:54 AM, Szilárd Páll wrote:
> >>> Hi Alex,
> >>>
> >>> On Wed, Apr 24, 2019 at 9:59 PM Alex  wrote:
> >>>
> >>>> Hi Szilárd,
> >>>>
> >>>> We are using neither Ubuntu 18.04, nor glibc 2.27, but the problem is
> >>>> most certainly there.
> >>> OK.
> >>>
> >>> Can you please post the content of the directories of tests that
> failed?
> >> It
> >>> would be useful to know the exact software configuration (reported in
> the
> >>> log) and the details of the errors (reported in the mdrun.out).
> >>>
> >>> Thanks,
> >>> --
> >>> Szilárd
> >>>
> >>>
> >>>
> >>>> Until the issue is solved one way or another, we
> >>>> will be staying with 2018.1, i guess.
> >>>>
> >>>> $ lsb_release -a
> >>>>
> >>>> No LSB modules are available.
> >>>>
> >>>> Distributor ID: Ubuntu
> >>>>
> >>>> Description:Ubuntu 16.04.6 LTS
> >>>>
> >>>> Release:16.04
> >>>>
> >>>> Codename:   xenial
> >>>>
> >>>> Ubuntu GLIBC 2.23-0ubuntu11
> >>>>
> >>>>
> >>>> On 4/24/2019 4:57 AM, Szilárd Páll wrote:
> >>>>> What OS are you using? There are some known issues with the Ubuntu
> >> 18.04
> >>>> +
> >>>>> glibc 2.27 which could explain the errors.
> >>>>> --
> >>>>> Szilárd
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>> --
> >>>> Gromacs Users mailing list
> >>>>
> >>>> * Please search the archive at
> >>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> >>>> posting!
> >>>>
> >>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >>>>
> >>>> * For (un)subscribe requests visit
> >>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> >>>> send a mail to gmx-users-requ...@gromacs.org.
> >> --
> >> Gromacs Users mailing list
> >>
> >> * Please search the archive at
> >> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> >> posting!
> >>
> >> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >>
> >> * For (un)subscribe requests visit
> >> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> >> send a mail to gmx-users-requ...@gromacs.org.
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] 2019.2 build warnings

2019-04-29 Thread Szilárd Páll
Hi,

I assume you used -DREGRESSIONTEST_DOWNLOAD=ON, in which case the tests are
downloaded and unpacked under
BUILD_TREE/tests/regressiontests-release-2019-[SUFFIX]/

In that directory you will find the usual regressiontests tree; from there,
under complex/ you will find the tests in question.
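
As a quick sketch of getting there (the [SUFFIX] part and the failing test
names are placeholders -- substitute whatever your build tree and test
output actually show):

cd BUILD_TREE/tests/regressiontests-release-2019-[SUFFIX]/complex
ls    # one subdirectory per test
tar czf failed_complex_tests.tar.gz <failed_test_1> <failed_test_2>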

Cheers,
--
Szilárd


On Fri, Apr 26, 2019 at 7:00 PM Alex  wrote:

> Hi Szilárd,
>
> I am at a conference right now, but will do my best to upload the
> requested data first thing on Monday. In the meantime, could you please
> tell me where the stuff of interest would be located within the local
> gromacs build directory? I mean, I could make the entire directory a
> tarball, but not sure it's all that necessary. I don't remember which tests
> failed, unfortunately...
>
> Thank you!
>
> Alex
>
> On 4/25/2019 2:54 AM, Szilárd Páll wrote:
> > Hi Alex,
> >
> > On Wed, Apr 24, 2019 at 9:59 PM Alex  wrote:
> >
> >> Hi Szilárd,
> >>
> >> We are using neither Ubuntu 18.04, nor glibc 2.27, but the problem is
> >> most certainly there.
> >
> > OK.
> >
> > Can you please post the content of the directories of tests that failed?
> It
> > would be useful to know the exact software configuration (reported in the
> > log) and the details of the errors (reported in the mdrun.out).
> >
> > Thanks,
> > --
> > Szilárd
> >
> >
> >
> >> Until the issue is solved one way or another, we
> >> will be staying with 2018.1, i guess.
> >>
> >> $ lsb_release -a
> >>
> >> No LSB modules are available.
> >>
> >> Distributor ID: Ubuntu
> >>
> >> Description:Ubuntu 16.04.6 LTS
> >>
> >> Release:16.04
> >>
> >> Codename:   xenial
> >>
> >> Ubuntu GLIBC 2.23-0ubuntu11
> >>
> >>
> >> On 4/24/2019 4:57 AM, Szilárd Páll wrote:
> >>> What OS are you using? There are some known issues with the Ubuntu
> 18.04
> >> +
> >>> glibc 2.27 which could explain the errors.
> >>> --
> >>> Szilárd
> >>>
> >>>
> >>>
> >>>
> >> --
> >> Gromacs Users mailing list
> >>
> >> * Please search the archive at
> >> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> >> posting!
> >>
> >> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >>
> >> * For (un)subscribe requests visit
> >> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> >> send a mail to gmx-users-requ...@gromacs.org.
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] clFFT error on iMAC 2017, Gromacs 2019.2, Intel Core i5, GPU Radeon Pro 555 2GB

2019-04-25 Thread Szilárd Páll
Hi,

That unfortunately looks like Apple's OpenCL not playing well with the
clFFT OpenCL library.
Avoiding offloading PME to the GPU should still allow using GPU acceleration
for the rest, I think. Can you please try to run a simulation manually and
pass "-pme cpu" on the command line?
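
Something along these lines (just a sketch; the -deffnm name is a
placeholder for your own run files, and in an MPI build the binary may be
called gmx_mpi instead of gmx):

gmx mdrun -deffnm md -pme cpu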

Can you also please file a report with the above content on
redmine.gromacs.org so we can follow up and address the issue in a future
release?

Cheers,
--
Szilárd


On Thu, Apr 25, 2019 at 5:48 AM Duy Tran Phuoc  wrote:

> Hi,
> Recently I compile new 2019.2 on the new iMAC with GPU Radeon Pro 555 2GB.
> Below is the compilation:
> ```
> curl http://ftp.gromacs.org/pub/gromacs/gromacs-2019.2.tar.gz --output
> gromacs-2019.2.tar.gz
> tar -xvzf gromacs-2019.2.tar.gz
> mkdir ~/gromacs
> cd ~/gromacs-2019.2
> mkdir build
> cd build
> ~/cmake/bin/cmake ../ -DGMX_MPI=ON -DGMX_GPU=ON -DGMX_CLANG_CUDA=ON
> -DGMX_USE_OPENCL=ON -DGMX_FFT_LIBRARY=fftpack -DGMX_SIMD=AVX2_256
> -DCMAKE_INSTALL_PREFIX=~/gromacs
> make -j 4
> make install
> ```
> The same situation for fftw3 compile by GROMACS.
> When I run MD with GPU, the simulation is unstable comparing to the normal
> all-on-CPU.
>
> I came back to make check to check my compilation, below is output:
>
> ```
>
> 93% tests passed, 3 tests failed out of 40
>
>
> Label Time Summary:
>
> GTest  = 156.88 sec*proc (40 tests)
>
> IntegrationTest=  60.77 sec*proc (5 tests)
>
> MpiTest=   1.92 sec*proc (3 tests)
>
> SlowTest   =  37.90 sec*proc (1 test)
>
> UnitTest   =  58.21 sec*proc (34 tests)
>
>
> Total Test time (real) = 157.81 sec
>
>
> The following tests FAILED:
>
>  29 - GmxPreprocessTests (Timeout)
>
>  37 - MdrunTests (Failed)
>
>  40 - MdrunMpiTests (Failed)
>
> Errors while running CTest
>
> make[3]: *** [CMakeFiles/run-ctest-nophys] Error 8
>
> make[2]: *** [CMakeFiles/run-ctest-nophys.dir/all] Error 2
>
> make[1]: *** [CMakeFiles/check.dir/rule] Error 2
> ```
> All the failed tests show the error as below:
>
> ```
>
> Program: mdrun-mpi-test, version 2019.2
>
> Source file: src/gromacs/ewald/pme-gpu-3dfft-ocl.cpp (line 62)
>
> Function:void handleClfftError(clfftStatus, const char *)
>
>
> Internal error (bug):
>
> clFFT execution failure: -57
>
>
> For more information and tips for troubleshooting, please check the GROMACS
>
> website at http://www.gromacs.org/Documentation/Errors
> ```
>
> Any suggestion whether this is a bug or something else?
> Thank you a lot for consideration.
>
> Tran Phuoc Duy.
> Tokyo Institute of Technology.
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] 2019.2 build warnings

2019-04-25 Thread Szilárd Páll
Hi Alex,

On Wed, Apr 24, 2019 at 9:59 PM Alex  wrote:

> Hi Szilárd,
>
> We are using neither Ubuntu 18.04, nor glibc 2.27, but the problem is
> most certainly there.


OK.

Can you please post the content of the directories of tests that failed? It
would be useful to know the exact software configuration (reported in the
log) and the details of the errors (reported in the mdrun.out).

Thanks,
--
Szilárd



> Until the issue is solved one way or another, we
> will be staying with 2018.1, i guess.
>

> $ lsb_release -a
>
> No LSB modules are available.
>
> Distributor ID: Ubuntu
>
> Description:Ubuntu 16.04.6 LTS
>
> Release:16.04
>
> Codename:   xenial
>
> Ubuntu GLIBC 2.23-0ubuntu11
>
>
> On 4/24/2019 4:57 AM, Szilárd Páll wrote:
> > What OS are you using? There are some known issues with the Ubuntu 18.04
> +
> > glibc 2.27 which could explain the errors.
> > --
> > Szilárd
> >
> >
> >
> >
> >>
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] 2019.2 build warnings

2019-04-24 Thread Szilárd Páll
What OS are you using? There are some known issues with the Ubuntu 18.04 +
glibc 2.27 which could explain the errors.
--
Szilárd


On Wed, Apr 17, 2019 at 2:32 AM Alex  wrote:

> Okay, more interesting things are happening.
> At the end of 'make' I get a bunch of things like
>
> .. nbnxn_cuda.cu(373): warning: variable "dim_grid" was
> declared but never referenced
> -bash: syntax error near unexpected token `373'
>
> More errors during 'make check' right after " Building NVCC (Device) object
>
> src/gromacs/CMakeFiles/libgromacs.dir/mdlib/nbnxn_cuda/libgromacs_generated_nbnxn_cuda.cu.o"
> then continues as if nothing happened.
>
> Finally, fails test #42 with what appears to be sarcasm:
> "...
> Thanx for Using GROMACS - Have a Nice Day
>
> Mdrun cannot use the requested (or automatic) number of ranks, retrying
> with 8.
>
> Abnormal return value for ' gmx mdrun-nb cpu   -notunepme >mdrun.out
> 2>&1' was 1
> Retrying mdrun with better settings...
>
> Abnormal return value for ' gmx mdrun -ntmpi 1  -notunepme >mdrun.out
> 2>&1' was -1
> FAILED. Check mdrun.out, md.log file(s) in distance_restraints for
> distance_restraints
>
> Abnormal return value for ' gmx mdrun -ntmpi 6  -notunepme >mdrun.out
> 2>&1' was 1
> Retrying mdrun with better settings...
>
> Abnormal return value for ' gmx mdrun   -notunepme >mdrun.out 2>&1' was
> -1
> FAILED. Check mdrun.out, md.log file(s) in octahedron for octahedron
> Re-running orientation-restraints using CPU-based PME
>
> Abnormal return value for ' gmx mdrun -ntmpi 1  -pme cpu-notunepme
> >mdrun.out 2>&1' was -1
> FAILED. Check mdrun.out, md.log file(s) in orientation-restraints/pme-cpu
> for orientation-restraints-pme-cpu
> Re-running pull_geometry_angle using CPU-based PME
> Re-running pull_geometry_angle-axis using CPU-based PME
> Re-running pull_geometry_dihedral using CPU-based PME
> 3 out of 55 complex tests FAILED"
>
> Then finally quits for good. Your suggestions will be highly appreciated.
>
> Thank you,
>
> Alex
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] 2019.2 build warnings

2019-04-24 Thread Szilárd Páll
The warnings are harmless; something happened in the build infrastructure
that emits some new warnings which we had not caught before the release.
--
Szilárd


On Wed, Apr 17, 2019 at 1:43 AM Alex  wrote:

> Hi all,
>
> I am building the 2019.2 version, latest CUDA libs (older 2018 version
> works fine). While building, I am getting a ton of warnings as shown below,
> while the build does not terminate. Is this okay, or do we need to do
> anything at this point?
>
> Thanks,
>
> Alex
>
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_grid" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_block" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_grid" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_block" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_grid" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_block" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_grid" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_block" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_grid" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_block" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_grid" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_block" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_grid" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_block" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_grid" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_block" was declared but never referenced
>
> localstuff/src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu(373): warning:
> variable "dim_grid" was declared but never referenced
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Gromacs Benchmarks for NVIDIA GeForce RTX 2080

2019-04-24 Thread Szilárd Páll
The benchmark systems are the ones commonly used in GROMACS performance
evaluation: ADH is a 90k/134k-atom system (dodec/cubic) and RNAse is
19k/24k atoms (dodec/cubic), both set up with standard AMBER force-field
settings (references can be found on this admittedly dated page:
http://www.gromacs.org/GPU_acceleration).

--
Szilárd


On Thu, Apr 18, 2019 at 8:24 PM Soham Sarkar  wrote:

> Could you please tell me.. how big is your systrm.. How many atoms are
> there?
> Soham
>
> On Thu, 18 Apr 2019, 10:51 pm Jason Hogrefe,  >
> wrote:
>
> > Dear Gromacs Users,
> >
> > Exxact corporation has conducted benchmarks for Gromacs using NVIDIA RTX
> > 2080 GPUs. We ran them a few months back, but thought the community would
> > be interested in such numbers.
> >
> > System: Exxact TensorEX Gromacs Certified Workstation<
> > https://www.exxactcorp.com/GROMACS-Certified-GPU-Systems>
> > CPU: Intel Xeon Scalable Family Silver 4114 (Skylake) x2
> > GPU: NVIDIA GeForce RTX 2080 x4
> > CUDA: 9.2
> > Gromacs Version: Gromacs 2018.3
> >
> > ==
> >   # Running ADH Benchmarks #
> >
> >   - ADH cubic PME -
> > Sequential Single GPU Run Performance
> > 40 CPUs + [0] 1 x GPU: 60.385  ns/day
> > 40 CPUs + [1] 1 x GPU: 70.547  ns/day
> > 40 CPUs + [2] 1 x GPU: 60.444  ns/day
> > 40 CPUs + [3] 1 x GPU: 70.753  ns/day
> > Multiple Single GPU Run Performance
> > 10 CPUs + [0] 1 x GPU: 53.474  ns/day
> > 10 CPUs + [1] 1 x GPU: 44.991  ns/day
> > 10 CPUs + [2] 1 x GPU: 45.034  ns/day
> > 10 CPUs + [3] 1 x GPU: 45.853  ns/day
> > Sequential Multi GPU Run Performance
> > 40 CPUs + [0,1] 2 x GPU: 38.128  ns/day
> > 40 CPUs + [0,1,2,3] 4 x GPU: 39.226  ns/day
> >
> >- ADH cubic RF -
> > Sequential Single GPU Run Performance
> > 40 CPUs + [0] 1 x GPU: 74.364  ns/day
> > 40 CPUs + [1] 1 x GPU: 73.903  ns/day
> > 40 CPUs + [2] 1 x GPU: 74.022  ns/day
> > 40 CPUs + [3] 1 x GPU: 74.105  ns/day
> > Multiple Single GPU Run Performance
> > 10 CPUs + [0] 1 x GPU:  10 CPUs + [1] 1 x GPU:  10 CPUs + [2] 1 x GPU:
> 10
> > CPUs + [3] 1 x GPU: Sequential Multi GPU Run Performance
> > 40 CPUs + [0,1] 2 x GPU: 96.189  ns/day
> > 40 CPUs + [0,1,2,3] 4 x GPU: 102.489  ns/day
> >
> >- ADH cubic vsites PME -
> > Sequential Single GPU Run Performance
> > 40 CPUs + [0] 1 x GPU: 132.120  ns/day
> > 40 CPUs + [1] 1 x GPU: 129.414  ns/day
> > 40 CPUs + [2] 1 x GPU: 129.661  ns/day
> > 40 CPUs + [3] 1 x GPU: 133.058  ns/day
> > Multiple Single GPU Run Performance
> > 10 CPUs + [0] 1 x GPU: 108.044  ns/day
> > 10 CPUs + [1] 1 x GPU: 90.935  ns/day
> > 10 CPUs + [2] 1 x GPU: 103.922  ns/day
> > 10 CPUs + [3] 1 x GPU: 95.532  ns/day
> > Sequential Multi GPU Run Performance
> > 40 CPUs + [0,1] 2 x GPU: 75.409  ns/day
> > 40 CPUs + [0,1,2,3] 4 x GPU: 86.649  ns/day
> >
> > - ADH cubic vsites RF -
> > Sequential Single GPU Run Performance
> > 40 CPUs + [0] 1 x GPU: 156.230  ns/day
> > 40 CPUs + [1] 1 x GPU: 155.725  ns/day
> > 40 CPUs + [2] 1 x GPU: 155.798  ns/day
> > 40 CPUs + [3] 1 x GPU: 156.289  ns/day
> > Multiple Single GPU Run Performance
> > 10 CPUs + [0] 1 x GPU:  10 CPUs + [1] 1 x GPU:  10 CPUs + [2] 1 x GPU:
> 10
> > CPUs + [3] 1 x GPU: Sequential Multi GPU Run Performance
> > 40 CPUs + [0,1] 2 x GPU: 194.495  ns/day
> > 40 CPUs + [0,1,2,3] 4 x GPU: 203.785  ns/day
> >
> > - ADH dodec PME -
> > Sequential Single GPU Run Performance
> > 40 CPUs + [0] 1 x GPU: 85.505  ns/day
> > 40 CPUs + [1] 1 x GPU: 84.418  ns/day
> > 40 CPUs + [2] 1 x GPU: 84.560  ns/day
> > 40 CPUs + [3] 1 x GPU: 85.463  ns/day
> > Multiple Single GPU Run Performance
> > 10 CPUs + [0] 1 x GPU: 55.158  ns/day
> > 10 CPUs + [1] 1 x GPU: 54.666  ns/day
> > 10 CPUs + [2] 1 x GPU: 49.706  ns/day
> > 10 CPUs + [3] 1 x GPU: 52.324  ns/day
> > Sequential Multi GPU Run Performance
> > 40 CPUs + [0,1] 2 x GPU: 44.456  ns/day
> > 40 CPUs + [0,1,2,3] 4 x GPU: 39.953  ns/day
> >
> >  - ADH dodec RF -
> > Sequential Single GPU Run Performance
> > 40 CPUs + [0] 1 x GPU: 77.585  ns/day
> > 40 CPUs + [1] 1 x GPU: 77.924  ns/day
> > 40 CPUs + [2] 1 x GPU: 78.122  ns/day
> > 40 CPUs + [3] 1 x GPU: 78.215  ns/day
> > Multiple Single GPU Run Performance
> > 10 CPUs + [0] 1 x GPU:  10 CPUs + [1] 1 x GPU:  10 CPUs + [2] 1 x GPU:
> 10
> > CPUs + [3] 1 x GPU: Sequential Multi GPU Run Performance
> > 40 CPUs + [0,1] 2 x GPU: 102.690  ns/day
> > 40 CPUs + [0,1,2,3] 4 x GPU: 112.896  ns/day
> >
> >  - ADH dodec vsites PME -
> > Sequential Single GPU Run Performance
> > 40 CPUs + [0] 1 x GPU: 149.222  ns/day
> > 40 CPUs + [1] 1 x GPU: 148.763  ns/day
> > 40 CPUs + [2] 1 x GPU: 150.029  ns/day
> > 40 CPUs + [3] 1 x GPU: 149.848  ns/day
> > Multiple Single GPU Run Performance
> > 10 CPUs + [0] 1 x GPU: 124.922  ns/day
> > 10 CPUs + [1] 1 x GPU: 108.062  ns/day
> > 10 CPUs + [2] 1 x GPU: 108.633  ns/day
> > 10 CPUs + [3] 1 x GPU: 110.386  ns/day
> > Sequential Multi GPU Run Performance
> > 40 

Re: [gmx-users] Failed tests, need help in troubleshooting

2019-04-23 Thread Szilárd Páll
Hi Cameron,

I meant any log file from a run with that hardware + software combination.
The log file contains hardware and software detection output that is useful
for identifying issues.

Do the unit tests pass?
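
If in doubt, a sketch of running just those from the build tree (the
UnitTest label is the one CTest prints in the test-time summary; adjust the
path to your own build directory):

cd BUILD_TREE
ctest --output-on-failure -L UnitTest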

--
Szilárd


On Tue, Apr 23, 2019 at 12:38 PM Cameron Fletcher (CF) <
circumf...@disroot.org> wrote:

> Hello Szilárd,
>
> Do you mean log files created in each regression test?
>
> On 23/04/19 3:43 PM, Szilárd Páll wrote:
> > What is the hardware you are running this on? Can you share a log file,
> > please?
> > --
> > Szilárd
> >
> >
> > On Mon, Apr 22, 2019 at 9:24 AM Cameron Fletcher (CF) <
> > circumf...@disroot.org> wrote:
> >
> >> Hello,
> >>
> >> I have installed gromacs 2019.1 on CentOS 7.6 .
> >> While running regressions tests 2019.1 certain tests are failing with
> >> errors.
> >>
> >> I have attached list of some failed tests.
> >>
> >>
> >> Since I am using gcc compilers and openmpi from openhpc repositories
> >> the below command was used for cmake.
> >>
> >> cmake ..
> >> -DCMAKE_C_COMPILER=/opt/ohpc/pub/compiler/gcc/7.3.0/bin/gcc
> >> -DCMAKE_CXX_COMPILER=/opt/ohpc/pub/compiler/gcc/7.3.0/bin/c++
> >> -DGMX_BUILD_OWN_FFTW=ON -DGMX_MPI=on -DGMX_GPU=on -DGMX_USE_OPENCL=on
> >>
> >> cmake version: 3.13.4
> >> gcc version: 7.3.0
> >> openmpi3 version: 3.1.0
> >>
> >>
> >> What should I be doing further for more troubleshooting.
> >>
> >>
> >> --
> >> CF
> >>
> >> --
> >> Gromacs Users mailing list
> >>
> >> * Please search the archive at
> >> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> >> posting!
> >>
> >> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >>
> >> * For (un)subscribe requests visit
> >> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> >> send a mail to gmx-users-requ...@gromacs.org.
>
>
> --
> CF
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] Failed tests, need help in troubleshooting

2019-04-23 Thread Szilárd Páll
What is the hardware you are running this on? Can you share a log file,
please?
--
Szilárd


On Mon, Apr 22, 2019 at 9:24 AM Cameron Fletcher (CF) <
circumf...@disroot.org> wrote:

> Hello,
>
> I have installed gromacs 2019.1 on CentOS 7.6 .
> While running regressions tests 2019.1 certain tests are failing with
> errors.
>
> I have attached list of some failed tests.
>
>
> Since I am using gcc compilers and openmpi from openhpc repositories
> the below command was used for cmake.
>
> cmake ..
> -DCMAKE_C_COMPILER=/opt/ohpc/pub/compiler/gcc/7.3.0/bin/gcc
> -DCMAKE_CXX_COMPILER=/opt/ohpc/pub/compiler/gcc/7.3.0/bin/c++
> -DGMX_BUILD_OWN_FFTW=ON -DGMX_MPI=on -DGMX_GPU=on -DGMX_USE_OPENCL=on
>
> cmake version: 3.13.4
> gcc version: 7.3.0
> openmpi3 version: 3.1.0
>
>
> What should I be doing further for more troubleshooting.
>
>
> --
> CF
>
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] WG: WG: Issue with CUDA and gromacs

2019-04-10 Thread Szilárd Páll
Hi,
On Wed, Apr 10, 2019 at 4:19 PM Tafelmeier, Stefanie <
stefanie.tafelme...@zae-bayern.de> wrote:

> Dear Szilárd and Jon,
>
> many thanks for your support.
>
> The system was Ubuntu 18.04 LTS, gcc 7.3 and CUDA 9.2.
> We upgraded now gcc (to 8.2) and CUDA (to 10.1).
>
> Now the regressiontests all pass.
> Also the tests Szilárd ask before are all running. Even just using mdrun
> -nt 80 works.
>

Great, this confirms that there was indeed a strange compatibility issue as
Jon suggested.

Many thanks! It seems that this was the origin of the problem.
>
> Just to be sure, I would like to have a look at the short range value of
> the complex test. As before some passed even without having the right
> values.
>

What do you mean by that?


> Is there a way to compare or a list with the correct outcome?
>

When the regressiontests are executed, the output by default lists all
commands that do the test runs as well as those that verify the outputs,
e.g.

$ perl gmxtest.pl complex
[...]
Testing acetonitrilRF . . . gmx grompp -f ./grompp.mdp -c ./conf -r ./conf
-p ./topol -maxwarn 10  >grompp.out 2>grompp.err
gmx check -s1 ./reference_s.tpr -s2 topol.tpr -tol 0.0001 -abstol 0.001
>checktpr.out 2>checktpr.err
 gmx mdrun-nb cpu   -notunepme >mdrun.out 2>&1
gmx check -e ./reference_s.edr -e2 ener.edr -tol 0.001 -abstol 0.05
-lastener Potential >checkpot.out 2>checkpot.err
gmx check -f ./reference_s.trr -f2 traj.trr -tol 0.001 -abstol 0.05
>checkforce.out 2>checkforce.err
PASSED but check mdp file differences

The gmx check commands do the checking, using the reference_s|d files to
compare against.
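
For the short-range values specifically, one option is to re-run gmx check
in the test directory of interest and stop the comparison at a short-range
energy term; a sketch only (the test directory and the exact term name --
"LJ (SR)" here -- are examples, and the term name has to match what the
energy file actually contains):

cd nbnxn-free-energy
gmx check -e ./reference_s.edr -e2 ener.edr -tol 0.001 -abstol 0.05 -lastener 'LJ (SR)'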

--
Szilárd


> Anyway, here is the link to the tar-ball of the complex folder in case
> there is interest:
> https://it-service.zae-bayern.de/Team/index.php/s/mMyt3MPEfRrn8Ge
>
> Many thanks again for your help.
>
> Best wishes,
> Steffi
>
>
>
>
> -Ursprüngliche Nachricht-
> Von: gromacs.org_gmx-users-boun...@maillist.sys.kth.se [mailto:
> gromacs.org_gmx-users-boun...@maillist.sys.kth.se] Im Auftrag von
> Jonathan Vincent
> Gesendet: Dienstag, 9. April 2019 22:13
> An: gmx-us...@gromacs.org
> Betreff: Re: [gmx-users] WG: WG: Issue with CUDA and gromacs
>
> Hi,
>
> Which operating system are you running on? We have seen some strange
> behavior with large number of threads, gcc 7.3 and a newish version of
> glibc. Specifically the default combination that comes with Ubuntu 18.04
> LTS, but it might be more generic than that.
>
> My suggestion would be to update to gcc 8.3 and CUDA 10.1 (which is
> required for CUDA support of gcc 8), which seemed to fix the problem in
> that case.
>
> If you still have problems we can look at this some more.
>
> Jon
>
> -Original Message-
> From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se <
> gromacs.org_gmx-users-boun...@maillist.sys.kth.se> On Behalf Of Szilárd
> Páll
> Sent: 09 April 2019 20:08
> To: Discussion list for GROMACS users 
> Subject: Re: [gmx-users] WG: WG: Issue with CUDA and gromacs
>
> Hi,
>
> One more test I realized it may be relevant considering that we had a
> similar report earlier this year on similar CPU hardware:
> can you please compile with -DGMX_SIMD=AVX2_256 and rerun the tests?
>
> --
> Szilárd
>
>
> On Tue, Apr 9, 2019 at 8:35 PM Szilárd Páll 
> wrote:
>
> > Dear Stefanie,
> >
> > On Fri, Apr 5, 2019 at 11:48 AM Tafelmeier, Stefanie <
> > stefanie.tafelme...@zae-bayern.de> wrote:
> >
> >> Hi Szilárd,
> >>
> >> thanks for your advices.
> >> I performed the tests.
> >> Both performed without errors.
> >>
> >
> > OK, that excludes simple and obvious issues.
> > Wild guess, but can you run those again, but this time prefix the
> > command with "taskset -c 22-32"
> > ? This makes the tests use cores 22-32 just to check if using a
> > specific set of cores may somehow trigger an error.
> >
> > What CUDA version did you use to compiler the memtest tool -- was it
> > the same (CUDA 9.2) as the one used for building GROMACS?
> >
> > Just to get it right; I have to ask in more detail, because the
> > connection
> >> between is the CPU/GPU and calculation distribution is still a bit
> >> blurry to me:
> >>
> >> If the output of the regressiontests show that the test crashes after
> >> 1-2 steps, this means there is an issue between the transfer between
> >> the CPU and GPU?
> >> As far as I got the short range calculation part is normally split
> >> into nonbonded -> GPU and bonded -> CPU?
> >>
> >
> > The -nb/

Re: [gmx-users] WG: WG: Issue with CUDA and gromacs

2019-04-09 Thread Szilárd Páll
Hi,

One more test that I realized may be relevant, considering that we had a
similar report earlier this year on similar CPU hardware:
can you please compile with -DGMX_SIMD=AVX2_256 and rerun the tests?
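
A minimal reconfigure-and-rebuild sequence could look like the following
(the build directory and the -j value are placeholders):

cd BUILD_TREE
cmake . -DGMX_SIMD=AVX2_256
make -j 16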

--
Szilárd


On Tue, Apr 9, 2019 at 8:35 PM Szilárd Páll  wrote:

> Dear Stefanie,
>
> On Fri, Apr 5, 2019 at 11:48 AM Tafelmeier, Stefanie <
> stefanie.tafelme...@zae-bayern.de> wrote:
>
>> Hi Szilárd,
>>
>> thanks for your advices.
>> I performed the tests.
>> Both performed without errors.
>>
>
> OK, that excludes simple and obvious issues.
> Wild guess, but can you run those again, but this time prefix the command
> with
> "taskset -c 22-32"
> ? This makes the tests use cores 22-32 just to check if using a specific
> set of cores may somehow trigger an error.
>
> What CUDA version did you use to compiler the memtest tool -- was it the
> same (CUDA 9.2) as the one used for building GROMACS?
>
> Just to get it right; I have to ask in more detail, because the connection
>> between is the CPU/GPU and calculation distribution is still a bit blurry
>> to me:
>>
>> If the output of the regressiontests show that the test crashes after 1-2
>> steps, this means there is an issue between the transfer between the CPU
>> and GPU?
>> As far as I got the short range calculation part is normally split into
>> nonbonded -> GPU and bonded -> CPU?
>>
>
> The -nb/-pme/-bonded flags control which tasks executes where (if not
> specified defaults control this); the output contains a report which
> summarizes where the major force tasks are executed, e.g. this is from one
> of your log files which tells that PP (i.e. particle tasks like short-range
> nonbonded) and the full PME tasks are offloaded to a GPU with ID 0 (and to
> check which GPU is that you can look at the "Hardware detection" section of
> the log):
>
> 1 GPU selected for this run.
> Mapping of GPU IDs to the 2 GPU tasks in the 1 rank on this node:
>   PP:0,PME:0
> PP tasks will do (non-perturbed) short-ranged interactions on the GPU
> PME tasks will do all aspects on the GPU
>
> For more details, please see
> http://manual.gromacs.org/documentation/2019.1/user-guide/mdrun-performance.html#running-mdrun-with-gpus
>
> We have seen two types of errors so far:
> - "Asynchronous H2D copy failed: invalid argument" which is still
> mysterious to me and has showed up both in your repeated manual runs as
> well as the regressiontest; as this aborts the run
> - Failing regressiontests with either invalid results or crashes (below
> above abort): to be honest I do not know what causes these but given that
> results
>
> The latter errors indicate incorrect results, in your last "complex" tests
> tarball I saw some tests failing with LINCS errors (and indicating NaN
> values) and a good fraction of tests failing with a GPU-side assertions --
> both of which suggest that things do go wrong on the GPU.
>
> And does this mean that maybe also the calculation I do, have wrong
>> energies? Can I trust my results?
>>
>
> At this point I can unfortunately not recommend running production
> simulations on this machine.
>
> Will try to continue exploring the possible errors and I hope you can help
> out with some test:
>
> - Please run the complex regressiontests (using the RelWithAssert binary)
> by setting the CUDA_LAUNCH_BLOCKING environment variable. This may allow us
> to reason better about the source of the errors. Also you can reconfigure
> with cmake -DGMX_OPENMP_MAX_THREADS=128 to avoid the 88 OpenMP thread
> errors in tests that you encountered yourself.
>
> - Can you please update compiler GROMACS with CUDA 10 and check if either
> of two kinds of errors does reproduce. (If it does, if you can upgrade the
> driver I suggest upgrading to CUDA 10.1).
>
>
>
>>
>> Many thanks again for your support.
>> Best wishes,
>> Steffi
>>
>>
> --
> Szilárd
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] WG: WG: Issue with CUDA and gromacs

2019-04-09 Thread Szilárd Páll
Dear Stefanie,

On Fri, Apr 5, 2019 at 11:48 AM Tafelmeier, Stefanie <
stefanie.tafelme...@zae-bayern.de> wrote:

> Hi Szilárd,
>
> thanks for your advices.
> I performed the tests.
> Both performed without errors.
>

OK, that excludes simple and obvious issues.
Wild guess, but can you run those again, this time prefixing the command
with "taskset -c 22-32"? This makes the tests use cores 22-32, just to
check whether using a specific set of cores somehow triggers an error.
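
Concretely, for the two tools from my earlier mail this would be something
like the following (a sketch only; the binary names follow the build steps
given there):

taskset -c 22-32 ./cuda_memtest
taskset -c 22-32 ./gpu-burn 300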

What CUDA version did you use to compile the memtest tool -- was it the
same one (CUDA 9.2) as used for building GROMACS?

Just to get it right; I have to ask in more detail, because the connection
> between is the CPU/GPU and calculation distribution is still a bit blurry
> to me:
>
> If the output of the regressiontests show that the test crashes after 1-2
> steps, this means there is an issue between the transfer between the CPU
> and GPU?
> As far as I got the short range calculation part is normally split into
> nonbonded -> GPU and bonded -> CPU?
>

The -nb/-pme/-bonded flags control which tasks executes where (if not
specified defaults control this); the output contains a report which
summarizes where the major force tasks are executed, e.g. this is from one
of your log files which tells that PP (i.e. particle tasks like short-range
nonbonded) and the full PME tasks are offloaded to a GPU with ID 0 (and to
check which GPU is that you can look at the "Hardware detection" section of
the log):

1 GPU selected for this run.
Mapping of GPU IDs to the 2 GPU tasks in the 1 rank on this node:
  PP:0,PME:0
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PME tasks will do all aspects on the GPU

For more details, please see
http://manual.gromacs.org/documentation/2019.1/user-guide/mdrun-performance.html#running-mdrun-with-gpus
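
As an illustration only (the -deffnm name is a placeholder for your own run
files), explicitly requesting exactly that mapping would look something like:

gmx mdrun -deffnm md -nb gpu -pme gpu -bonded cpu -gpu_id 0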

We have seen two types of errors so far:
- "Asynchronous H2D copy failed: invalid argument", which is still
mysterious to me and has shown up both in your repeated manual runs and in
the regressiontests; this one aborts the run.
- Failing regressiontests with either invalid results or crashes (beyond
the abort above): to be honest, I do not know what causes these.

The latter errors indicate incorrect results, in your last "complex" tests
tarball I saw some tests failing with LINCS errors (and indicating NaN
values) and a good fraction of tests failing with a GPU-side assertions --
both of which suggest that things do go wrong on the GPU.

And does this mean that maybe also the calculation I do, have wrong
> energies? Can I trust my results?
>

At this point I can unfortunately not recommend running production
simulations on this machine.

Will try to continue exploring the possible errors and I hope you can help
out with some test:

- Please run the complex regressiontests (using the RelWithAssert binary)
with the CUDA_LAUNCH_BLOCKING environment variable set. This may allow us
to reason better about the source of the errors. You can also reconfigure
with cmake -DGMX_OPENMP_MAX_THREADS=128 to avoid the errors about 88 OpenMP
threads that you encountered yourself (a command sketch follows after these
two points).

- Can you please recompile GROMACS with CUDA 10 and check whether either of
the two kinds of errors reproduces? (If it does, and you can upgrade the
driver, I suggest upgrading to CUDA 10.1.)
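
For the first point, a possible sequence is sketched below (paths, the -j
value and the way gmx is put on the PATH are placeholders for your local
setup):

cd BUILD_TREE
cmake . -DCMAKE_BUILD_TYPE=RelWithAssert -DGMX_OPENMP_MAX_THREADS=128
make -j 16
cd tests/regressiontests-release-2019-[SUFFIX]
CUDA_LAUNCH_BLOCKING=1 perl gmxtest.pl complex   # with the freshly built gmx in PATH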



>
> Many thanks again for your support.
> Best wishes,
> Steffi
>
>
--
Szilárd


>
>
>
> -Ursprüngliche Nachricht-
> Von: gromacs.org_gmx-users-boun...@maillist.sys.kth.se [mailto:
> gromacs.org_gmx-users-boun...@maillist.sys.kth.se] Im Auftrag von Szilárd
> Páll
> Gesendet: Freitag, 29. März 2019 01:24
> An: Discussion list for GROMACS users
> Betreff: Re: [gmx-users] WG: WG: Issue with CUDA and gromacs
>
> Hi,
>
> The standard output of the first set of runs is also something I was
> interested in, but I've found the equivalent in the
> complex/TESTDIR/mdrun.out files. What I see in the regresiontests output is
> that the forces/energies results are simply not correct; some tests simply
> crash after 1-2 steps, but others do complete (like the nbnxn-free-energy/)
> and the short-range energies a clearly far off.
>
> I suggest to try to check if there may be hardware issue:
>
> - run this memory testing tool:
> git clone
> https://github.com/ComputationalRadiationPhysics/cuda_memtest.git
> cd cuda_memtest
> make cuda_memtest CFLAGS='-arch sm_30 -DSM_20 -O3 -DENABLE_NVML=0'
> ./cuda_memtest
>
> - compile and run the gpu-burn tool:
> git clone https://github.com/wilicc/gpu-burn
> cd gpu-burn
> make
> then run
> gpu-burn 300
> to test for 5 minutes.
>
> --
> Szilárd
>
>
> On Thu, Mar 28, 2019 at 3:46 PM Tafelmeier, Stefanie <
> stefanie.tafelme...@zae-bayern.de

Re: [gmx-users] Installation with CUDA on Debian / gcc 6+

2019-04-01 Thread Szilárd Páll
On Mon, Apr 1, 2019 at 5:08 PM Jochen Hub  wrote:

> Hi Åke,
>
> ah, thanks, we had indeed a CUDA 8.0 on our Debian. So we'll try to
> install CUA 10.1.
>
> But as a side question: Doesn't the supported gcc version strongly
> depend on the Linux distribution, see here:
>
> https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html


On paper, yes; in practice, not so much.

The officially listed "qualified" combinations are not strict (hard)
requirements; as long as the CUDA dkms module compiles for your kernel and
nvcc works with the gcc compiler you provide, things will generally work.
Kernels or compilers shipped by a distro can deviate enough from others
that issues may arise, but those cases are not overly common (as far as I
know, though admittedly I don't maintain diverse infrastructure).
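
If you need to point nvcc at a specific (older or manually installed) gcc,
a sketch of doing so through the GROMACS build would be the following; the
compiler path is a placeholder, and CUDA_HOST_COMPILER is the CMake cache
variable I assume you would use for this:

cmake .. -DGMX_GPU=ON -DCUDA_HOST_COMPILER=/usr/local/gcc-5/bin/gcc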

By the way, your distro is not "qualified" at all ;)

--
Szilárd

Thanks,
> Jochen
>
>
> Am 01.04.19 um 16:52 schrieb Åke Sandgren:
> > Use a newer version of CUDA?
> >
> > CUDA 10.1 supports GCC 8.
> >
> > On 4/1/19 4:33 PM, Jochen Hub wrote:
> >> Hi all,
> >>
> >> we try to install Gromacs with CUDA support on a Debian system. Cuda
> >> complains about the gcc 6.30 naively installed on Debian, since Cuda
> >> supports gcc only until gcc 5.
> >>
> >> The problem is that Debian removed packages for gcc-5, so installing an
> >> older gcc is more tedious.
> >>
> >> We understand that CUDA support for gcc strongly depends on the Linux
> >> Distribution, see
> >>
> >> https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
> >>
> >> Therefore: Is there any workaround to compile Gromacs with CUDA under
> >> Debian with a gcc 6+ ?
> >>
> >> Thanks a lot,
> >> Jochen
> >>
> >>
> >
>
> --
> ---
> Dr. Jochen Hub
> Computational Molecular Biophysics Group
> Institute for Microbiology and Genetics
> Georg-August-University of Göttingen
> Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany.
> Phone: +49-551-39-14189
> http://cmb.bio.uni-goettingen.de/
> ---
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.

Re: [gmx-users] WG: WG: Issue with CUDA and gromacs

2019-03-29 Thread Szilárd Páll
Hi,

The standard output of the first set of runs is also something I was
interested in, but I've found the equivalent in the
complex/TESTDIR/mdrun.out files. What I see in the regressiontests output is
that the forces/energies results are simply not correct; some tests simply
crash after 1-2 steps, but others do complete (like nbnxn-free-energy/),
and the short-range energies are clearly far off.

I suggest checking whether there may be a hardware issue:

- run this memory testing tool:
git clone https://github.com/ComputationalRadiationPhysics/cuda_memtest.git
cd cuda_memtest
make cuda_memtest CFLAGS='-arch sm_30 -DSM_20 -O3 -DENABLE_NVML=0'
./cuda_memtest

- compile and run the gpu-burn tool:
git clone https://github.com/wilicc/gpu-burn
cd gpu-burn
make
then run
gpu-burn 300
to test for 5 minutes.

--
Szilárd


On Thu, Mar 28, 2019 at 3:46 PM Tafelmeier, Stefanie <
stefanie.tafelme...@zae-bayern.de> wrote:

> Hi Szilárd,
>
> Thanks again!
>
> Regarding the test:
>   -ntmpi 1 -ntomp 22 -pin on -pinstride 1:  2 out of 5 run
> https://it-service.zae-bayern.de/Team/index.php/s/XEQrYqq4pikGmMy  /
> https://it-service.zae-bayern.de/Team/index.php/s/YBdKKJ9c7zQpEg9
> Including:
>   -nsteps 0 -nb gpu -pme cpu -bonded cpu:   0 run
> https://it-service.zae-bayern.de/Team/index.php/s/YiByc7iXW5AW9ZX
>   -nsteps 0 -nb gpu -pme gpu -bonded cpu:   2 out of 5 run
> https://it-service.zae-bayern.de/Team/index.php/s/JNPXQnEgYtTAxGj   /
> https://it-service.zae-bayern.de/Team/index.php/s/6aq6BQwwbBELqWe
>   -nsteps 0 -nb gpu -pme gpu -bonded gpu:   0 run
> https://it-service.zae-bayern.de/Team/index.php/s/yj4RAqPMFsDNgTc
>
> Including:
>   -ntmpi 1 -ntomp 22 -pin on -pinstride 2:  1 out of 5 run
> https://it-service.zae-bayern.de/Team/index.php/s/q5jHbdJ2EygtDaQ  /
> https://it-service.zae-bayern.de/Team/index.php/s/sRPccwHRxojW9J8
>   -nsteps 0 -nb gpu -pme cpu -bonded cpu:   0 run
> https://it-service.zae-bayern.de/Team/index.php/s/GdKk5N68CY7BGxJ
>   -nsteps 0 -nb gpu -pme gpu -bonded cpu:   1 out of 5 run
> https://it-service.zae-bayern.de/Team/index.php/s/orwzKJMampWwDo5  /
> https://it-service.zae-bayern.de/Team/index.php/s/JXApT4tFtxQWxG6
>   -nsteps 0 -nb gpu -pme gpu -bonded gpu:   0 run
> https://it-service.zae-bayern.de/Team/index.php/s/8YKK7Zxax22RfGQ
>
> Including:
>   -ntmpi 1 -ntomp 22 -pin on -pinstride 4:  1 out of 5 run
> https://it-service.zae-bayern.de/Team/index.php/s/szZjzaxmwfimrgB  /
> https://it-service.zae-bayern.de/Team/index.php/s/QdTd2an9dbE9BSt
>   -nsteps 0 -nb gpu -pme cpu -bonded cpu:   3 out of 5 run
> https://it-service.zae-bayern.de/Team/index.php/s/DPoqKrgcWfF5PKM  /
> https://it-service.zae-bayern.de/Team/index.php/s/3NbsGHtCPsf7zFS
>   -nsteps 0 -nb gpu -pme gpu -bonded cpu:   3 out of 5 run
> https://it-service.zae-bayern.de/Team/index.php/s/WqP4tXjrR8i3455  /
> https://it-service.zae-bayern.de/Team/index.php/s/DACGc86xxKR6pWs
>   -nsteps 0 -nb gpu -pme gpu -bonded gpu:   0 run
> https://it-service.zae-bayern.de/Team/index.php/s/3nKdwA28KySLEdB
>
>
> Regarding the regressiontest:
> Here is the link to the tarball:
> https://it-service.zae-bayern.de/Team/index.php/s/mMyt3MPEfRrn8Ge
>
>
> Thanks again for all your support and fingers crossed!
>
> Best wishes,
> Steffi
>
>
>
>
>
> -----Ursprüngliche Nachricht-
> Von: gromacs.org_gmx-users-boun...@maillist.sys.kth.se [mailto:
> gromacs.org_gmx-users-boun...@maillist.sys.kth.se] Im Auftrag von Szilárd
> Páll
> Gesendet: Mittwoch, 27. März 2019 20:27
> An: Discussion list for GROMACS users
> Betreff: Re: [gmx-users] WG: WG: Issue with CUDA and gromacs
>
> Hi Steffi,
>
> On Wed, Mar 27, 2019 at 1:08 PM Tafelmeier, Stefanie <
> stefanie.tafelme...@zae-bayern.de> wrote:
>
> > Hi Szilárd,
> >
> > thanks again!
> > Here are the links for the log files, that didn't run:
> > Old patch:
> >  -ntmpi 1 -ntomp 22 -pin on -pinstride 1:none ran*
> > https://it-service.zae-bayern.de/Team/index.php/s/b4AYiMCoHeNgJH3
> >  -ntmpi 1 -ntomp 22 -pin on -pinstride 2:none ran*
> > https://it-service.zae-bayern.de/Team/index.php/s/JEP2iwFFZCebZLF
> >  -ntmpi 1 -ntomp 22 -pin on -pinstride 4:one out of 5 ran
> > https://it-service.zae-bayern.de/Team/index.php/s/apra2zS7FHdqDQy
> >
> > New patch:
> >  -ntmpi 1 -ntomp 22 -pin on -pinstride 1:none ran*
> > https://it-service.zae-bayern.de/Team/index.php/s/jAD52jBgNddrS3w
> >  -ntmpi 1 -ntomp 22 -pin on -pinstride 2:none ran*
> > https://it-service.zae-bayern.de/Team/index.php/s/bcRjtz7r9NekzKB
> >  -ntmpi 1 -ntomp 22 -pin on -pinstride 4:none ran*
> > https://it-service.zae-bayern.

Re: [gmx-users] WG: WG: Issue with CUDA and gromacs

2019-03-27 Thread Szilárd Páll
Hi Steffi,

On Wed, Mar 27, 2019 at 1:08 PM Tafelmeier, Stefanie <
stefanie.tafelme...@zae-bayern.de> wrote:

> Hi Szilárd,
>
> thanks again!
> Here are the links for the log files, that didn't run:
> Old patch:
>  -ntmpi 1 -ntomp 22 -pin on -pinstride 1:none ran*
> https://it-service.zae-bayern.de/Team/index.php/s/b4AYiMCoHeNgJH3
>  -ntmpi 1 -ntomp 22 -pin on -pinstride 2:none ran*
> https://it-service.zae-bayern.de/Team/index.php/s/JEP2iwFFZCebZLF
>  -ntmpi 1 -ntomp 22 -pin on -pinstride 4:one out of 5 ran
> https://it-service.zae-bayern.de/Team/index.php/s/apra2zS7FHdqDQy
>
> New patch:
>  -ntmpi 1 -ntomp 22 -pin on -pinstride 1:none ran*
> https://it-service.zae-bayern.de/Team/index.php/s/jAD52jBgNddrS3w
>  -ntmpi 1 -ntomp 22 -pin on -pinstride 2:none ran*
> https://it-service.zae-bayern.de/Team/index.php/s/bcRjtz7r9NekzKB
>  -ntmpi 1 -ntomp 22 -pin on -pinstride 4:none ran*
> https://it-service.zae-bayern.de/Team/index.php/s/b3zp8DNztjE6ssF
>

This still doesn't tell us much more, unfortunately.

A few more things to try (they can be combined):
- please rebuild after first reconfiguring with
cmake . -DCMAKE_BUILD_TYPE=RelWithAssert
as this may give us some extra debugging information during runs;
- please use this patch now -- it will print some additional stuff to the
standard error output, so please grab that and share it:
https://termbin.com/zq4q
(you can redirect the output e.g. by gmx mdrun > mdrun.out 2>&1);
- try running (with the above binary build + patch) the above failing case
repeated a few times (see the sketch after this list):
  -nsteps 0 -nb gpu -pme cpu -bonded cpu
  -nsteps 0 -nb gpu -pme gpu -bonded cpu
  -nsteps 0 -nb gpu -pme gpu -bonded gpu
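
A minimal sketch of such a repeated run (the log/output file names are
placeholders; swap in the other -pme/-bonded combinations listed above):

for i in 1 2 3 4 5; do
    gmx mdrun -nsteps 0 -nb gpu -pme gpu -bonded cpu -g run_${i}.log > run_${i}.out 2>&1
done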



> Regarding the Regressiontest:
>
> Sorry I didn't get it at the first time.
> If the md.log files are enough here is a folder for the failed parts of
> the complex regression test:
> https://it-service.zae-bayern.de/Team/index.php/s/64KAQBgNoPm4rJ2
>
> If you need any other files or the full directories please let me know.
>

Hmmm, it looks like there are more issues here: some log files look
truncated, others indicate termination by LINCS errors. Yes, the mdrun.out
and checkpot* files would be useful. How about just making a tarball of the
whole complex directory and sharing that?

Hopefully these tests will shed some light on what the issue is.

Cheers,
--
Szilard

> Again, a lot of thanks for your support.


> Best wishes,
> Steffi
>
>
>
>
>
>
>
>
>
>
> -Ursprüngliche Nachricht-
> Von: gromacs.org_gmx-users-boun...@maillist.sys.kth.se [mailto:
> gromacs.org_gmx-users-boun...@maillist.sys.kth.se] Im Auftrag von Szilárd
> Páll
> Gesendet: Dienstag, 26. März 2019 16:57
> An: Discussion list for GROMACS users
> Betreff: Re: [gmx-users] WG: WG: Issue with CUDA and gromacs
>
> Hi Steffi,
>
> Thanks for running the tests; yes, the patch file was meant to be applied
> to the unchanged GROMACS 2019 code.
>
> Please also share the log files from thr failed runs, not just the
> copy-paste of the fatal error -- as a result of the additional check there
> might have been a note printed which I was after.
>
> Regarding the regression tests, what I would like to have is the actual
> directories of the tests that failed, i.e. as your log indicates a few of
> the complex tests at least.
>
> Cheers,
> --
> Szilárd
>
> On Tue, Mar 26, 2019 at 1:44 PM Tafelmeier, Stefanie <
> stefanie.tafelme...@zae-bayern.de> wrote:
>
> > Hi Szilárd,
> >
> > thanks again for your answer.
> > Regarding the tests:
> > without the new patch:
> >
> > -ntmpi 1 -ntomp 11 -pin on -pinstride 1:all ran
> > -ntmpi 1 -ntomp 11 -pin on -pinstride 2:all ran
> > -ntmpi 1 -ntomp 11 -pin on -pinstride 4:all ran
> > -ntmpi 1 -ntomp 11 -pin on -pinstride 8:all ran
> > and
> > -ntmpi 1 -ntomp 22 -pin on -pinstride 1:none ran*
> > -ntmpi 1 -ntomp 22 -pin on -pinstride 2:none ran*
> > -ntmpi 1 -ntomp 22 -pin on -pinstride 4:one out of 5 ran
> >
> >
> > With the new patch (devicebuffer.cuh had to be the original, right? The
> > already patched didn't work as the lines didn't fit, as far as I
> > understood.):
> >
> > -ntmpi 1 -ntomp 11 -pin on -pinstride 1:all ran
> > -ntmpi 1 -ntomp 11 -pin on -pinstride 2:all ran
> > -ntmpi 1 -ntomp 11 -pin on -pinstride 4:all ran
> > -ntmpi 1 -ntomp 11 -pin on -pinstride 8:all ran
> > and
> > -ntmpi 1 -ntomp 22 -pin on -pinstride 1:none ran*
> > -ntmpi 1 -ntomp 22 -pin on -pinstride 2:none ran*
> > -ntmpi 1 -ntomp 22 -pin on -pinstride 4:none ran*
> >
> 

Re: [gmx-users] WG: WG: Issue with CUDA and gromacs

2019-03-26 Thread Szilárd Páll
Hi Steffi,

Thanks for running the tests; yes, the patch file was meant to be applied
to the unchanged GROMACS 2019 code.

Please also share the log files from the failed runs, not just the
copy-paste of the fatal error -- as a result of the additional check there
might have been a note printed which I was after.

Regarding the regression tests, what I would like to have is the actual
directories of the tests that failed, i.e., as your log indicates, at least
a few of the complex tests.

Cheers,
--
Szilárd

On Tue, Mar 26, 2019 at 1:44 PM Tafelmeier, Stefanie <
stefanie.tafelme...@zae-bayern.de> wrote:

> Hi Szilárd,
>
> thanks again for your answer.
> Regarding the tests:
> without the new patch:
>
> -ntmpi 1 -ntomp 11 -pin on -pinstride 1:all ran
> -ntmpi 1 -ntomp 11 -pin on -pinstride 2:all ran
> -ntmpi 1 -ntomp 11 -pin on -pinstride 4:all ran
> -ntmpi 1 -ntomp 11 -pin on -pinstride 8:all ran
> and
> -ntmpi 1 -ntomp 22 -pin on -pinstride 1:none ran*
> -ntmpi 1 -ntomp 22 -pin on -pinstride 2:none ran*
> -ntmpi 1 -ntomp 22 -pin on -pinstride 4:one out of 5 ran
>
>
> With the new patch (devicebuffer.cuh had to be the original, right? The
> already patched didn't work as the lines didn't fit, as far as I
> understood.):
>
> -ntmpi 1 -ntomp 11 -pin on -pinstride 1:all ran
> -ntmpi 1 -ntomp 11 -pin on -pinstride 2:all ran
> -ntmpi 1 -ntomp 11 -pin on -pinstride 4:all ran
> -ntmpi 1 -ntomp 11 -pin on -pinstride 8:all ran
> and
> -ntmpi 1 -ntomp 22 -pin on -pinstride 1:none ran*
> -ntmpi 1 -ntomp 22 -pin on -pinstride 2:none ran*
> -ntmpi 1 -ntomp 22 -pin on -pinstride 4:none ran*
>
> * Fatal error:
> Asynchronous H2D copy failed: invalid argument
>
>
> Regarding the regressiontest:
> The LastTest.log is available here:
> https://it-service.zae-bayern.de/Team/index.php/s/3sdki7Cf2x2CEQi
> this was not given in the log:
> The following tests FAILED:
>  42 - regressiontests/complex (Timeout)
>  46 - regressiontests/essentialdynamics (Failed)
> Errors while running CTest
> CMakeFiles/run-ctest-nophys.dir/build.make:57: recipe for target
> 'CMakeFiles/run-ctest-nophys' failed
> make[3]: *** [CMakeFiles/run-ctest-nophys] Error 8
> CMakeFiles/Makefile2:1397: recipe for target
> 'CMakeFiles/run-ctest-nophys.dir/all'failed
> make[2]: *** [CMakeFiles/run-ctest-nophys.dir/all] Error 2
> CMakeFiles/Makefile2:1177: recipe for target
> 'CMakeFiles/check.dir/rule' failed
> make[1]: *** [CMakeFiles/check.dir/rule] Error 2
> Makefile:626: recipe for target 'check' failed
> make: *** [check] Error 2
>
> Many thanks again.
> Best wishes,
> Steffi
>
>
>
>
>
> -Ursprüngliche Nachricht-
> Von: gromacs.org_gmx-users-boun...@maillist.sys.kth.se [mailto:
> gromacs.org_gmx-users-boun...@maillist.sys.kth.se] Im Auftrag von Szilárd
> Páll
> Gesendet: Montag, 25. März 2019 20:13
> An: Discussion list for GROMACS users
> Betreff: Re: [gmx-users] WG: WG: Issue with CUDA and gromacs
>
> Hi,
>
>
>
> --
> Szilárd
>
>
> On Mon, Mar 18, 2019 at 2:34 PM Tafelmeier, Stefanie <
> stefanie.tafelme...@zae-bayern.de> wrote:
>
> > Hi,
> >
> > Many thanks again.
> >
> > Regarding the tests:
> > - ntmpi 1 -ntomp 22 -pin on
> > >OK, so this suggests that your previously successful 22-thread runs did
> > not
> > turn on pinning, I assume?
> > It seems so, yet it does not run successfully each time. But if done with
> > 20-threads, which works usually without error, it does not look like the
> > pinning is turned on.
> >
>
> Pinning is only turned on if mdrun can safely assume that the cores of the
> node are not shared by multiple applications. This assumption can only be
> made if all hardware threads of the entire node are used the run itself
> (i.e. in your case 2x22 cores with HyperThreadince hence 2 threads each =
> 88 threads).
>
> -ntmpi 1 -ntomp 1 -pin on; runs
> > -ntmpi 1 -ntomp 2 -pin on; runs
> >
> > - ntmpi 24 -ntomp 1 -pinstride 1 -pin on; runs
> > - ntmpi 24 -ntomp 1 -pinstride 2 -pin on; runs
> >
> > After patch supplied:
> > - ntmpi 1 -ntomp 22 -pin on; sometime runs - sometimes doesn't*   ->
> > md_run.log at :
> > https://it-service.zae-bayern.de/Team/index.php/s/ezXWnQ2pGNeFx6T
> >
> >  md_norun.log at:
> > https://it-service.zae-bayern.de/Team/index.php/s/wYPY7dWEJdwmqJi
> > - ntmpi 1 -ntomp 22 -pin off; sometime runs - sometimes doesn't*   (ran
> > before)
> > - ntmpi 1 -ntomp
