Re: mdrun on 8-core AMD + GTX TITAN (was: Re: [gmx-users] Re: Gromacs-4.6 on two Titans GPUs)

2013-11-12 Thread Szilárd Páll
As Mark said, please share the *entire* log file. Among other important things, the result of PP-PME tuning is not included above. However, I suspect that in this case scaling is strongly affected by the small size of the system you are simulating. -- Szilárd On Sun, Nov 10, 2013 at 5:28 AM,

mdrun on 8-core AMD + GTX TITAN (was: Re: [gmx-users] Re: Gromacs-4.6 on two Titans GPUs)

2013-11-07 Thread Szilárd Páll
Let's not hijack James' thread as your hardware is different from his. On Tue, Nov 5, 2013 at 11:00 PM, Dwey Kauffman mpi...@gmail.com wrote: Hi Szilard, Thanks for your suggestions. I am indeed aware of this page. In an 8-core AMD with 1 GPU, I am very happy about its performance. See

Re: [gmx-users] Re: Gromacs-4.6 on two Titans GPUs

2013-11-07 Thread Szilárd Páll
On Thu, Nov 7, 2013 at 6:34 AM, James Starlight jmsstarli...@gmail.com wrote: I've come to the conclusion that simulation with 1 or 2 GPUs simultaneously gave me the same performance: mdrun -ntmpi 2 -ntomp 6 -gpu_id 01 -v -deffnm md_CaM_test, mdrun -ntmpi 2 -ntomp 6 -gpu_id 0 -v -deffnm

Re: [gmx-users] Re: Hardware for best gromacs performance?

2013-11-05 Thread Szilárd Páll
Timo, Have you used the default settings, that is one rank/GPU? If that is the case, you may want to try using multiple ranks per GPU; this can often help when you have 4-6 cores/GPU. Separate PME ranks are not switched on by default with GPUs; have you tried using any? Cheers, -- Szilárd Páll
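
As a rough sketch of what these suggestions translate to on a single node with, say, 16 cores and 2 GPUs (the counts are assumptions, not Timo's actual hardware), repeating device ids maps several ranks to each GPU, and -npme adds a dedicated PME rank:

  mdrun -ntmpi 4 -ntomp 4 -gpu_id 0011
  mdrun -ntmpi 4 -ntomp 4 -npme 1 -gpu_id 001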

Re: [gmx-users] Re: Gromacs-4.6 on two Titans GPUs

2013-11-05 Thread Szilárd Páll
detected: #0: NVIDIA GeForce GTX TITAN, compute cap.: 3.5, ECC: no, stat: compatible #1: NVIDIA GeForce GTX TITAN, compute cap.: 3.5, ECC: no, stat: compatible James 2013/11/4 Szilárd Páll pall.szil...@gmail.com You can use the -march=native flag with gcc to optimize

Re: [gmx-users] Re: Hardware for best gromacs performance?

2013-11-05 Thread Szilárd Páll
On Tue, Nov 5, 2013 at 9:55 PM, Dwey Kauffman mpi...@gmail.com wrote: Hi Timo, Can you provide a benchmark with 1 Xeon E5-2680 with 1 Nvidia k20x GPGPU on the same test of 29420 atoms ? Are these two GPU cards (within the same node) connected by a SLI (Scalable Link Interface) ?

Re: [gmx-users] Gromacs-4.6 on two Titans GPUs

2013-11-04 Thread Szilárd Páll
That should be enough. You may want to use the -march (or equivalent) compiler flag for CPU optimization. Cheers, -- Szilárd Páll On Sun, Nov 3, 2013 at 10:01 AM, James Starlight jmsstarli...@gmail.com wrote: Dear Gromacs Users! I'd like to compile the latest 4.6 Gromacs with native GPU

Re: [gmx-users] Re: Hardware for best gromacs performance?

2013-11-04 Thread Szilárd Páll
Brad, These numbers seem rather low for a standard simulation setup! Did you use a particularly long cut-off or short time-step? Cheers, -- Szilárd Páll On Fri, Nov 1, 2013 at 6:30 PM, Brad Van Oosten bv0...@brocku.ca wrote: I'm not sure on the prices of these systems any more

Re: [gmx-users] Hardware for best gromacs performance?

2013-11-04 Thread Szilárd Páll
before buying. (Note that I have never tried it myself, so I can't provide more details or vouch for it in any way.) Cheers, -- Szilárd Páll On Fri, Nov 1, 2013 at 3:08 AM, David Chalmers david.chalm...@monash.edu wrote: Hi All, I am considering setting up a small cluster to run Gromacs jobs

Re: [gmx-users] Gromacs-4.6 on two Titans GPUs

2013-11-04 Thread Szilárd Páll
You can use the -march=native flag with gcc to optimize for the CPU you are building on, or e.g. -march=corei7-avx-i for Intel Ivy Bridge CPUs. -- Szilárd Páll On Mon, Nov 4, 2013 at 12:37 PM, James Starlight jmsstarli...@gmail.com wrote: Szilárd, thanks for the suggestion! What kind of CPU
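
For instance, the flag can be passed through the CMake compiler-flag variables at configure time (a sketch only; any other options on the configure line depend on your build):

  cmake .. -DCMAKE_C_FLAGS=-march=native -DCMAKE_CXX_FLAGS=-march=native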

Re: [gmx-users] Output pinning for mdrun

2013-10-24 Thread Szilárd Páll
Hi Carsten, On Thu, Oct 24, 2013 at 4:52 PM, Carsten Kutzner ckut...@gwdg.de wrote: On Oct 24, 2013, at 4:25 PM, Mark Abraham mark.j.abra...@gmail.com wrote: Hi, No. mdrun reports the stride with which it moves over the logical cores reported by the OS, setting the affinity of GROMACS

Re: [gmx-users] Gromacs on Stampede

2013-10-10 Thread Szilárd Páll
there are a few analysis tools that support OpenMP and even with those I/O will be a severe bottleneck if you were considering using the Phi-s for analysis. So for now, I would stick to using only the CPUs in the system. Cheers, -- Szilárd Páll On Thu, Oct 10, 2013 at 12:58 PM, Arun Sharma arunsharma_

Re: [gmx-users] confusion about implicint solvent

2013-09-23 Thread Szilárd Páll
Hi, Admittedly, both the documentation on these features and the communication on the known issues with these aspects of GROMACS has been lacking. Here's a brief summary/explanation: - GROMACS 4.5: implicit solvent simulations possible using mdrun-gpu which is essentially mdrun + OpenMM, hence

Re: [gmx-users] Cross compiling GROMACS 4.6.3 for native Xeon Phi, thread-mpi problem

2013-09-16 Thread Szilárd Páll
On Mon, Sep 16, 2013 at 7:04 PM, PaulC paul.cah...@uk.fujitsu.com wrote: Hi, I'm attempting to build GROMACS 4.6.3 to run entirely within a single Xeon Phi (i.e. native) with either/both Intel MPI/OpenMP for parallelisation within the single Xeon Phi. I followed these instructions from

Re: [gmx-users] segfault on Gromacs 4.6.3 (cuda)

2013-09-15 Thread Szilárd Páll
, 2013 at 4:35 PM, Szilárd Páll szilard.p...@cbr.su.se wrote: Hi, First of all, icc 11 is not well tested and there have been reports about it compiling broken code. This could explain the crash, but you'd need to do a bit more testing to confirm. Regarding the GPU detection error, if you use

Re: [gmx-users] Installation of gromacs 4.5.4 on windows using cygwin

2013-09-15 Thread Szilárd Páll
Looks like you are compiling 4.5.1. You should try compiling the latest version in the 4.5 series, 4.5.7. -- Szilárd On Sun, Sep 15, 2013 at 6:39 PM, Muthukumaran R kuma...@bicpu.edu.in wrote: hello, I am trying to install gromacs in cygwin but after issuing make, installation stops with the

Re: [gmx-users] Re: Gromacs: GPU detection

2013-09-13 Thread Szilárd Páll
FYI, I've filed a bug report which you can track if interested: http://redmine.gromacs.org/issues/1334 -- Szilárd On Sun, Sep 1, 2013 at 9:49 PM, Szilárd Páll szilard.p...@cbr.su.se wrote: I may have just come across this issue as well. I have no time to investigate, but my guess is that it's

Re: [gmx-users] segfault on Gromacs 4.6.3 (cuda)

2013-09-09 Thread Szilárd Páll
Hi, First of all, icc 11 is not well tested and there have been reports about it compiling broken code. This could explain the crash, but you'd need to do a bit more testing to confirm. Regarding the GPU detection error, if you use a driver which is incompatible with the CUDA runtime (at least as

Re: [gmx-users] gromacs 4.6.3 and Intel compiler 11.x

2013-09-03 Thread Szilárd Páll
On Tue, Sep 3, 2013 at 9:50 PM, Guanglei Cui amber.mail.arch...@gmail.com wrote: Hi Mark, I agree with you and Justin, but let's just say there are things that are out of my control ;-) I just tried SSE2 and NONE. Both failed the regression check. That's alarming, with

Re: [gmx-users] Long range Lennard Jones

2013-09-02 Thread Szilárd Páll
On Thu, Aug 29, 2013 at 7:18 AM, Gianluca Interlandi gianl...@u.washington.edu wrote: Justin, I respect your opinion on this. However, in the paper indicated below by BR Brooks they used a cutoff of 10 A on LJ when testing IPS in CHARMM: Title: Pressure-based long-range correction for

Re: [gmx-users] Re: Gromacs: GPU detection

2013-09-01 Thread Szilárd Páll
I may have just come across this issue as well. I have no time to investigate, but my guess is that it's related to some thread-safety issue with thread-MPI. Could one of you please file a bug report on redmine.gromacs.org? Cheers, -- Szilárd On Thu, Aug 8, 2013 at 5:52 PM, Brad Van Oosten

Re: [gmx-users] Gromacs: GPU detection

2013-08-07 Thread Szilárd Páll
That should never happen. If mdrun is compiled with GPU support and GPUs are detected, the detection stats should always get printed. Can you reliably reproduce the issue? -- Szilárd On Fri, Aug 2, 2013 at 9:50 AM, Jernej Zidar jernej.zi...@gmail.com wrote: Hi there. Lately I've been

Re: [gmx-users] Re: CUDA with QUADRO GPUs

2013-08-01 Thread Szilárd Páll
Dear Ramon, Thanks for the kind words! On Tue, Jun 18, 2013 at 10:22 AM, Ramon Crehuet Simon ramon.creh...@iqac.csic.es wrote: Dear Szilard, Thanks for your message. Your help is priceless and helps advance science more than many publications. I extend that to many experts who kindly and

Re: [gmx-users] Intel vs gcc compilers

2013-08-01 Thread Szilárd Páll
Hi, The Intel compilers are only recommended for pre-Bulldozer AMD processors (K10: Magny-Cours, Istanbul, Barcelona, etc.). On these, PME non-bonded kernels (not the RF or plain cut-off!) are 10-30% slower with gcc than with icc. The icc-gcc difference is the smallest with gcc 4.7, typically

Re: Re: Re: [gmx-users] GPU-based workstation

2013-08-01 Thread Szilárd Páll
, 28 May 2013 at 19:50 *From:* Szilárd Páll szilard.p...@cbr.su.se *To:* Discussion list for GROMACS users gmx-users@gromacs.org *Subject:* Re: Re: [gmx-users] GPU-based workstation Dear all, As far as I understand, the OP is interested in hardware for *running* GROMACS 4.6 rather than

Re: [gmx-users] GROMACS 4.6.3 Static Linking

2013-07-31 Thread Szilárd Páll
On Thu, Jul 25, 2013 at 5:55 PM, Mark Abraham mark.j.abra...@gmail.com wrote: That combo is supposed to generate a CMake warning. I also get a warning during linking that some shared library will have to provide some function (getpwuid?) at run time, but the binary is static. That warning

Re: [gmx-users] Problems with REMD in Gromacs 4.6.3

2013-07-31 Thread Szilárd Páll
On Fri, Jul 19, 2013 at 6:59 PM, gigo g...@ibb.waw.pl wrote: Hi! On 2013-07-17 21:08, Mark Abraham wrote: You tried ppn3 (with and without --loadbalance)? I was testing on 8-replicas simulation. 1) Without --loadbalance and -np 8. Excerpts from the script: #PBS -l nodes=8:ppn=3

Re: [gmx-users] Running GROMACS on mini GPU cluster

2013-07-29 Thread Szilárd Páll
The message is perfectly normal. When you do not use all available cores/hardware threads (seen as CPUs by the OS), to avoid potential clashes, mdrun does not pin threads (i.e. it lets the OS migrate threads). On NUMA systems (most multi-CPU machines), this will cause performance degradation as

Re: [gmx-users] Multi-level parallelization: MPI + OpenMP

2013-07-19 Thread Szilárd Páll
Depending on the level of parallelization (number of nodes and number of particles/core) you may want to try: - 2 ranks/node: 8 cores + 1 GPU, no separate PME (default): mpirun -np 2*Nnodes mdrun_mpi [-gpu_id 01 -npme 0] - 4 ranks per node: 4 cores + 1 GPU (shared between two ranks), no
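
Spelled out for the first variant on, e.g., 4 nodes (the node count and the explicit thread count are just an illustration):

  mpirun -np 8 mdrun_mpi -ntomp 8 -gpu_id 01 -npme 0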

Re: [gmx-users] 4.6.3 and MKL

2013-07-11 Thread Szilárd Páll
FYI: The MKL FFT has been shown to be up to 30%+ slower than FFTW 3.3. -- Szilárd On Thu, Jul 11, 2013 at 1:17 AM, Éric Germaneau german...@zoho.com wrote: I have the same feeling too but I'm not in charge of it unfortunately. Thank you, I appreciate. On 07/11/2013 07:15 AM, Mark Abraham

Re: [gmx-users] Problem with running REMD in Gromacs 4.6.3

2013-07-10 Thread Szilárd Páll
Hi, Is affinity setting (pinning) on? What compiler are you using? There are some known issues with Intel OpenMP getting in the way of the internal affinity setting. To verify whether this is causing a problem, try turning off pinning (-pin off). Cheers, -- Szilárd On Tue, Jul 9, 2013 at 5:29
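
For example (the -deffnm value is a placeholder):

  mdrun -deffnm md -pin off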

Re: [gmx-users] FW: Inconsistent results between 3.3.3 and 4.6 with various set-up options

2013-07-10 Thread Szilárd Páll
Just a note regarding the performance issues mentioned. You are using reaction-field electrostatics, a case in which by default there is very little force workload left for the CPU (only the bondeds) and therefore the CPU idles most of the time. To improve performance, use -nb gpu_cpu with multiple
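
As an illustration of the switch itself (the deffnm and any further options are placeholders):

  mdrun -deffnm md -nb gpu_cpu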

Re: [gmx-users] cuda problem

2013-07-09 Thread Szilárd Páll
PS: the error message is referring to the *driver* version, not the CUDA toolkit/runtime version. -- Szilárd On Tue, Jul 9, 2013 at 11:15 AM, Szilárd Páll szilard.p...@cbr.su.se wrote: Tesla C1060 is not compatible - which should be shown in the log and standard output. Cheers, -- Szilárd

Re: [gmx-users] cuda problem

2013-07-09 Thread Szilárd Páll
Tesla C1060 is not compatible - which should be shown in the log and standard output. Cheers, -- Szilárd On Tue, Jul 9, 2013 at 10:54 AM, Albert mailmd2...@gmail.com wrote: Dear: I've installed a gromacs-4.6.3 in a GPU cluster, and I obtained the following information for testing: NOTE:

Re: [gmx-users] cuda problem

2013-07-09 Thread Szilárd Páll
On Tue, Jul 9, 2013 at 11:20 AM, Albert mailmd2...@gmail.com wrote: On 07/09/2013 11:15 AM, Szilárd Páll wrote: Tesla C1060 is not compatible - which should be shown in the log and standard output. Cheers, -- Szilárd Thanks for the kind comments. Do you mean the C1060 is not compatible with cuda

Re: [gmx-users] fftw compile error for 4.6.2

2013-07-04 Thread Szilárd Páll
FYI: 4.6.2 contains a bug related to thread affinity setting which will lead to a considerable performance loss (I've seen 35%) as well as often inconsistent performance - especially with GPUs (a case in which one would run many OpenMP threads/rank). My advice is that you either use the code from

Re: [gmx-users] Installation on Ubuntu 12.04LTS

2013-07-04 Thread Szilárd Páll
. From: Szilárd Páll szilard.p...@cbr.su.se To: Mare Libero marelibe...@yahoo.com; Discussion list for GROMACS users gmx-users@gromacs.org Sent: Thursday, June 27, 2013 10:47 AM Subject: Re: [gmx-users] Installation on Ubuntu 12.04LTS On Thu, Jun 27, 2013 at 12:57 PM

Re: [gmx-users] Gromacs GPU system question

2013-07-04 Thread Szilárd Páll
On Mon, Jun 24, 2013 at 4:43 PM, Szilárd Páll szilard.p...@cbr.su.se wrote: On Sat, Jun 22, 2013 at 5:55 PM, Mirco Wahab mirco.wa...@chemie.tu-freiberg.de wrote: On 22.06.2013 17:31, Mare Libero wrote: I am assembling a GPU workstation to run MD simulations, and I was wondering if anyone has

Re: [gmx-users] Installation on Ubuntu 12.04LTS

2013-06-27 Thread Szilárd Páll
On Thu, Jun 27, 2013 at 12:57 PM, Mare Libero marelibe...@yahoo.com wrote: Hello everybody, Does anyone have any recommendation regarding the installation of gromacs 4.6 on Ubuntu 12.04? I have the nvidia-cuda-toolkit that comes in synaptic (4.0.17-3ubuntu0.1 installed in

Re: [gmx-users] Gromacs GPU system question

2013-06-26 Thread Szilárd Páll
Thanks Mirco, good info, your numbers look quite consistent. The only complicating factor is that your CPUs are overclocked by different amounts, which changes the relative performances somewhat compared to non-overclocked parts. However, let me list some prices to show that the top-of-the line

Re: [gmx-users] Gromacs GPU system question

2013-06-24 Thread Szilárd Páll
I strongly suggest that you consider the single-chip GTX cards instead of a dual-chip one; from the point of view of price/performance you'll probably get the most from a 680 or 780. You could ask why, so here are the reasons: - The current parallelization scheme requires domain-decomposition to

Re: [gmx-users] Gromacs GPU system question

2013-06-24 Thread Szilárd Páll
On Sat, Jun 22, 2013 at 5:55 PM, Mirco Wahab mirco.wa...@chemie.tu-freiberg.de wrote: On 22.06.2013 17:31, Mare Libero wrote: I am assembling a GPU workstation to run MD simulations, and I was wondering if anyone has any recommendation regarding the GPU/CPU combination. From what I can see,

Re: [gmx-users] TPI Results differ in v4.5.7 and v4.6.1

2013-06-24 Thread Szilárd Páll
If you have a solid example that reproduced the problem, feel free to file an issue on redmine.gromacs.org ASAP. Briefly documenting your experiments and verification process on the issue report page can help developers in giving you faster feedback as well as with accepting the report as a

Re: [gmx-users] CUDA with QUADRO GPUs?

2013-06-17 Thread Szilárd Páll
Dear Ramon, Compute capability does not reflect the performance of a card, but it is an indicator of what functionalities does the GPU provide - more like a generation number or feature set version. Quadro cards are typically quite close in performance/$ to Teslas with roughly 5-8x *lower*

Re: [gmx-users] Re: mdrun segmentation fault for new build of gromacs 4.6.1

2013-06-11 Thread Szilárd Páll
-funroll-all-loops -fexcess-precision=fast -O3 -DNDEBUG All the regressiontests failed. So it appears that, at least for my system, I need to include the directives to not use the external BLAS/LAPACK. Amil On Jun 10, 2013, at 12:12 PM, Szilárd Páll [via GROMACS] wrote: Amil, It looks like

Re: [gmx-users] Re: mdrun segmentation fault for new build of gromacs 4.6.1

2013-06-10 Thread Szilárd Páll
Amil, It looks like there is a mixup in your software configuration and mdrun is linked against libguide.so, the OpenMP library part of the Intel compiler v11 which gets loaded early and is probably causing the crash. This library was probably pulled in implicitly by MKL which the build system

Re: [gmx-users] GPU ECC question

2013-06-09 Thread Szilárd Páll
On Sat, Jun 8, 2013 at 9:21 PM, Albert mailmd2...@gmail.com wrote: Hello: Recently I found a strange issue with Gromacs-4.6.2 on a GPU workstation. In my GTX690 machine, when I run md production I found that the ECC is on. However, in another GTX590 machine, I found the ECC was off: 4

Re: [gmx-users] Running gmx-4.6.x over multiple homogeneous nodes with GPU acceleration

2013-06-09 Thread Szilárd Páll
On Wed, Jun 5, 2013 at 4:35 PM, João Henriques joao.henriques.32...@gmail.com wrote: Just to wrap up this thread, it does work when the mpirun is properly configured. I knew it had to be my fault :) Something like this works like a charm: mpirun -npernode 2 mdrun_mpi -ntomp 8 -gpu_id 01

Re: [gmx-users] Running gmx-4.6.x over multiple homogeneous nodes with GPU acceleration

2013-06-04 Thread Szilárd Páll
mdrun is not blind, it's just that the current design does not report the hardware of all compute nodes used. Whatever CPU/GPU hardware mdrun reports in the log/std output is *only* what rank 0, i.e. the first MPI process, detects. If you have a heterogeneous hardware configuration, in most cases you should be

Re: [gmx-users] GPU problem

2013-06-04 Thread Szilárd Páll
-nt is mostly a backward compatibility option and sets the total number of threads (per rank). Instead, you should set both -ntmpi (or -np with MPI) and -ntomp. However, note that unless a single mdrun uses *all* cores/hardware threads on a node, it won't pin the threads to cores. Failing to pin
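
A sketch on a hypothetical 16-core node with two GPUs (the counts are assumptions), with pinning requested explicitly:

  mdrun -ntmpi 2 -ntomp 8 -gpu_id 01 -pin on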

Re: [gmx-users] problems with GROMACS 4.6.2

2013-06-04 Thread Szilárd Páll
Just a few minor details: - You can set the affinities yourself through the job scheduler which should give nearly identical results compared to the mdrun internal affinity if you simply assign cores to mdrun threads in a sequential order (or with an #physical cores stride if you want to use

Re: [gmx-users] gmx 4.6.2 segementation fault (core dump)

2013-06-03 Thread Szilárd Páll
Thanks for reporting this. The best would be a redmine bug with a tpr, command line invocation for reproduction, as well as log output to see what software and hardware configuration you are using. Cheers, -- Szilárd On Mon, Jun 3, 2013 at 2:46 PM, Johannes Wagner johannes.wag...@h-its.org wrote:

Re: [gmx-users] gmx 4.6.2 segementation fault (core dump)

2013-06-03 Thread Szilárd Páll
/ HRB 337446 Managing Directors: Dr. h.c. Klaus Tschira Prof. Dr.-Ing. Andreas Reuter On 03.06.2013, at 16:01, Szilárd Páll szilard.p...@cbr.su.se wrote: Thanks for reporting this. The best would be a redmine bug with a tpr, command line invocation for reproduction, as well as log output to see

Re: [gmx-users] How to compile/run Gromacs on native Infiniband?

2013-06-03 Thread Szilárd Páll
There's no ibverbs support, so pick your favorite/best MPI implementation; more than that you can't do. -- Szilárd On Mon, Jun 3, 2013 at 2:54 PM, Bert bert.u...@gmail.com wrote: Dear all, My cluster has a FDR (56 Gb/s) Infiniband network. It is well known that there is a big difference

Re: [gmx-users] About Compilation error in gromacs 4.6

2013-05-28 Thread Szilárd Páll
10.04 comes with gcc 4.3 and 4.4 which should both work (we even test them with Jenkins). Still, you should really get a newer gcc, especially if you have an 8-core AMD CPU (= either Bulldozer or Piledriver) both of which are fully supported only by gcc 4.7 and later. Additionally, AFAIK the

Re: Re: [gmx-users] GPU-based workstation

2013-05-28 Thread Szilárd Páll
Dear all, As far as I understand, the OP is interested in hardware for *running* GROMACS 4.6 rather than developing code or running LINPACK. To get the best performance it is important to use a machine with hardware balanced for GROMACS' workloads. Too little GPU resources will result in CPU

Re: Aw: Re: [gmx-users] GPU-based workstation

2013-05-28 Thread Szilárd Páll
On Sat, May 25, 2013 at 2:16 PM, Broadbent, Richard richard.broadben...@imperial.ac.uk wrote: I've been running on my Universities GPU nodes these are one E5-xeon (6-cores 12 threads) and have 4 Nvidia 690gtx's. My system is 93 000 atoms of DMF under NVE. The performance has been a little

Re: [gmx-users] Re: GPU-based workstation

2013-05-28 Thread Szilárd Páll
On Tue, May 28, 2013 at 10:14 AM, James Starlight jmsstarli...@gmail.com wrote: I've found a GTX Titan with 6 GB of RAM and a 384-bit bus. The price of such a card is equal to the price of the latest Tesla cards. Nope! Titan: $1000 Tesla K10: $2750 Tesla K20(c): $3000 TITAN is cheaper than any Tesla and

Re: [gmx-users] Re: Have your ever got a real NVE simulation (good energy conservation) in gromacs?

2013-05-25 Thread Szilárd Páll
With the verlet cutoff scheme (new in 4.6) you get much better control over the drift caused by (missed) short range interactions; you just set a maximum allowed target drift and the buffer will be calculated accordingly. Additionally, with the verlet scheme you are free to tweak the neighbor
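
In the .mdp file that looks roughly like the following (0.005 is the documented default target drift in kJ/mol/ps per atom; the nstlist value is just an example):

  cutoff-scheme       = Verlet
  verlet-buffer-drift = 0.005
  nstlist             = 20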

Re: [gmx-users] compile Gromacs using Cray compilers

2013-05-20 Thread Szilárd Páll
The thread-MPI library provides the thread affinity setting functionality to mdrun, hence certain parts of it will always be compiled in, even with GMX_MPI=ON. Apparently, the Cray compiler does not like some of the thread-MPI headers. Feel free to file a bug report on redmine.gromacs.org, but

Re: [gmx-users] Comparing Gromacs versions

2013-05-17 Thread Szilárd Páll
The answer is in the log files, in particular the performance summary should indicate where is the performance difference. If you post your log files somewhere we can probably give further tips on optimizing your run configurations. Note that with such a small system the scaling with the group

Re: [gmx-users] Comparing Gromacs versions

2013-05-17 Thread Szilárd Páll
On Fri, May 17, 2013 at 2:48 PM, Djurre de Jong-Bruinink djurredej...@yahoo.com wrote: The answer is in the log files, in particular the performance summary should indicate where is the performance difference. If you post your log files somewhere we can probably give further tips on optimizing

Re: [gmx-users] Performance (GMX4.6.1): MPI vs Threads

2013-05-16 Thread Szilárd Páll
I'm not sure what you mean by threads. In GROMACS this can refer to either thread-MPI or OpenMP multi-threading. To run within a single compute node a default GROMACS installation using either of the two aforementioned parallelization methods (or a combination of the two) can be used. -- Szilárd

Re: [gmx-users] Performance (GMX4.6.1): MPI vs Threads

2013-05-16 Thread Szilárd Páll
PS: if your compute-nodes are Intel of some recent architecture OpenMP-only parallelization can be considerably more efficient. For more details see http://www.gromacs.org/Documentation/Acceleration_and_parallelization -- Szilárd On Thu, May 16, 2013 at 7:26 PM, Szilárd Páll szilard.p

Re: [gmx-users] cudaStreamSynchronize failed

2013-05-10 Thread Szilárd Páll
Hi, Such an issue typically indicates a GPU kernel crash. This can be caused by a large variety of factors from program bug to GPU hardware problem. To do a simple check for the former please run with the CUDA memory checker, e.g: /usr/local/cuda/bin/cuda-memcheck mdrun [...] Additionally, as
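
For instance, a short check run could look like this (the run options are placeholders; limiting -nsteps keeps the check brief):

  /usr/local/cuda/bin/cuda-memcheck mdrun -deffnm md -nsteps 1000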

Re: [gmx-users] Re: Illegal instruction (core dumped) - trjconv

2013-04-29 Thread Szilárd Páll
This error means that your binaries contain machine instructions that the processor you run them on does not support. The most probable cause is that you compiled the binaries on a machine with different architecture than the one you are running on. Cheers, -- Szilárd On Mon, Apr 29, 2013 at

Re: [gmx-users] GPU job often stopped

2013-04-29 Thread Szilárd Páll
Have you tried running on CPUs only just to see if the issue persists? Unless the issue does not occur with the same binary on the same hardware running on CPUs only, I doubt it's a problem in the code. Do you have ECC on? -- Szilárd On Sun, Apr 28, 2013 at 5:27 PM, Albert mailmd2...@gmail.com

Re: [gmx-users] GPU job often stopped

2013-04-29 Thread Szilárd Páll
On Mon, Apr 29, 2013 at 2:41 PM, Albert mailmd2...@gmail.com wrote: On 04/28/2013 05:45 PM, Justin Lemkul wrote: Frequent failures suggest instability in the simulated system. Check your .log file or stderr for informative Gromacs diagnostic information. -Justin my log file didn't have

Re: [gmx-users] GPU job often stopped

2013-04-29 Thread Szilárd Páll
while mdrun was running? Cheers, -- Szilárd On Mon, Apr 29, 2013 at 3:32 PM, Albert mailmd2...@gmail.com wrote: On 04/29/2013 03:31 PM, Szilárd Páll wrote: The segv indicates that mdrun crashed and not that the machine was restarted. The GPU detection output (both on stderr and log) should

Re: [gmx-users] GPU job often stopped

2013-04-29 Thread Szilárd Páll
On Mon, Apr 29, 2013 at 3:51 PM, Albert mailmd2...@gmail.com wrote: On 04/29/2013 03:47 PM, Szilárd Páll wrote: In that case, while it isn't very likely, the issue could be caused by some implementation detail which aims to avoid performance loss caused by an issue in the NVIDIA drivers

Re: [gmx-users] compile error

2013-04-26 Thread Szilárd Páll
You got a warning at configure-time that the nvcc host compiler can't be set because the MPI compiler wrappers are used. Because of this, nvcc is using gcc to compile CPU code which chokes on the icc flags. You can: - set CUDA_HOST_COMPILER to the mpicc backend, i.e. icc, or - let cmake detect MPI
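
A sketch of the first option, assuming the MPI wrappers are backed by icc (the wrapper names and the icc path are illustrative):

  cmake .. -DGMX_GPU=ON -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx -DCUDA_HOST_COMPILER=$(which icc)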

Re: [gmx-users] How to use multiple nodes, each with 2 CPUs and 3 GPUs

2013-04-25 Thread Szilárd Páll
Hi, You should really check out the documentation on how to use mdrun 4.6: http://www.gromacs.org/Documentation/Acceleration_and_parallelization#Running_simulations Brief summary: when running on GPUs every domain is assigned to a set of CPU cores and a GPU, hence you need to start as many PP
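
On a single node, that rule of thumb might look like this for the 2-CPU, 3-GPU machines in the subject (the OpenMP thread count is an assumption):

  mdrun -ntmpi 3 -ntomp 4 -gpu_id 012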

Re: [gmx-users] GROMACS 4.6 with GPU acceleration (double presion)

2013-04-22 Thread Szilárd Páll
On Tue, Apr 9, 2013 at 6:52 PM, David van der Spoel sp...@xray.bmc.uu.se wrote: On 2013-04-09 18:06, Mikhail Stukan wrote: Dear experts, I have the following question. I am trying to compile GROMACS 4.6.1 with GPU acceleration and have the following diagnostics: # cmake .. -DGMX_DOUBLE=ON

Re: [gmx-users] GROMACS 4.6 with GPU acceleration (double

2013-04-22 Thread Szilárd Páll
On Mon, Apr 22, 2013 at 8:49 AM, Albert mailmd2...@gmail.com wrote: On 04/22/2013 08:40 AM, Mikhail Stukan wrote: Could you explain which hardware do you mean? As far as I know, K20X supports double precision, so I would assume that double precision GROMACS should be realizable on it.

Re: [gmx-users] Error in make install no valid ELF RPATH. Cray XE6m

2013-04-20 Thread Szilárd Páll
Hi, Your problem will likely be solved by not writing the rpath to the binaries, which can be accomplished by setting -DCMAKE_SKIP_RPATH=ON. This will mean that you will have to make sure that the library path is set for mdrun to work. If that does not fully solve the problem, you might have to
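
A sketch of what that could look like at configure time and at run time (the library path is a placeholder):

  cmake .. -DCMAKE_SKIP_RPATH=ON
  export LD_LIBRARY_PATH=/path/to/gromacs/libs:$LD_LIBRARY_PATH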

Re: [gmx-users] Building Single and Double Precision in 4.6.1?

2013-04-18 Thread Szilárd Páll
On Thu, Apr 18, 2013 at 6:17 PM, Mike Hanby mha...@uab.edu wrote: Thanks for the reply, so the next question, after I finish building single precision non parallel, is there an efficient way to kick off the double precision build, then the single precision mpi and so on? Or do I need to

Re: [gmx-users] Re: cygwin_mpi_gmx installation

2013-04-13 Thread Szilárd Páll
On Sat, Apr 13, 2013 at 3:30 PM, Mirco Wahab mirco.wa...@chemie.tu-freiberg.de wrote: On 12.04.2013 20:20, Szilárd Páll wrote: On Fri, Apr 12, 2013 at 3:45 PM, 라지브간디 ra...@kaist.ac.kr wrote: Can cygwin recognize the CUDA installed in win 7? if so, how do i link them ? Good question, I've

Re: [gmx-users] Re: cygwin_mpi_gmx installation

2013-04-13 Thread Szilárd Páll
On Sat, Apr 13, 2013 at 5:27 PM, Szilárd Páll szilard.p...@cbr.su.se wrote: On Sat, Apr 13, 2013 at 3:30 PM, Mirco Wahab mirco.wa...@chemie.tu-freiberg.de wrote: On 12.04.2013 20:20, Szilárd Páll wrote: On Fri, Apr 12, 2013 at 3:45 PM, 라지브간디 ra...@kaist.ac.kr wrote: Can cygwin recognize

Re: [gmx-users] cygwin_mpi_gmx installation

2013-04-12 Thread Szilárd Páll
Indeed it's strange. In fact, it seems that CUDA detection did not even run, there should be a message whether it found the toolkit or not just before the Enabling native GPU acceleration - and the enabling should not even happen without CUDA detected. Unrelated, but do you really need MPI with

Re: [gmx-users] Re: cygwin_mpi_gmx installation

2013-04-12 Thread Szilárd Páll
On Fri, Apr 12, 2013 at 3:45 PM, 라지브간디 ra...@kaist.ac.kr wrote: Thanks for your answers. I have uninstalled the mpi, have also reinstalled the CUDA and got the same issue. As you have mentioned before I noticed that it struggle to detect the CUDA. Do you mean that you reconfigured without

Re: [gmx-users] K20 test

2013-04-11 Thread Szilárd Páll
Hi, No, it just means that *your simulation* does not scale. The question is very vague, hence impossible to answer without more details. However, assuming that you are not running a, say, 5000 atom system over 6 nodes, the most probable reason is that you have 6 Sandy Bridge nodes with 12-16

Re: [gmx-users] GPU performance

2013-04-10 Thread Szilárd Páll
On Wed, Apr 10, 2013 at 3:34 AM, Benjamin Bobay bgbo...@ncsu.edu wrote: Szilárd - First, many thanks for the reply. Second, I am glad that I am not crazy. Ok so based on your suggestions, I think I know what the problem is/was. There was a sander process running on 1 of the CPUs. Clearly

Re: [gmx-users] General conceptual question about advantage of GPUs

2013-04-10 Thread Szilárd Páll
Hi Andrew, As others have said, 40x speedup with GPUs is certainly possible, but more often than not comparisons leading to such numbers are not entirely fair - at least from a computational perspective. The most common case is when people compare legacy, poorly (SIMD)-optimized codes with some

Re: [gmx-users] About 4.6.1

2013-04-10 Thread Szilárd Páll
On Wed, Apr 10, 2013 at 4:48 PM, 陈照云 chenzhaoyu...@gmail.com wrote: I have tested gromacs-4.6.1 with a K20, but when I run mdrun I meet some problems. 1. Does the GPU only support float (single-precision) acceleration? Yes. 2. Configure options are -DGMX_MPI, -DGMX_DOUBLE. But if I run in parallel with mpirun, it

Re: [gmx-users] help: load imbalance

2013-04-10 Thread Szilárd Páll
On Wed, Apr 10, 2013 at 4:50 PM, 申昊 shen...@mail.bnu.edu.cn wrote: Hello, I want to ask some questions about load imbalance. 1. Here are the messages resulting from grompp -f md.mdp -p topol.top -c npt.gro -o md.tpr: NOTE 1 [file md.mdp]: The optimal PME mesh load for parallel

Re: [gmx-users] General conceptual question about advantage of GPUs

2013-04-10 Thread Szilárd Páll
On Wed, Apr 10, 2013 at 4:24 PM, Szilárd Páll szilard.p...@cbr.su.se wrote: Hi Andrew, As others have said, 40x speedup with GPUs is certainly possible, but more often than not comparisons leading to such numbers are not entirely fair - at least from a computational perspective. The most

Re: [gmx-users] GPU performance

2013-04-09 Thread Szilárd Páll
Hi Ben, That performance is not reasonable at all - neither for CPU only run on your quad-core Sandy Bridge, nor for the CPU+GPU run. For the latter you should be getting more like 50 ns/day or so. What's strange about your run is that the CPU-GPU load balancing is picking a *very* long cut-off

Re: [gmx-users] GROMACS 4.6v - Myrinet2000

2013-04-08 Thread Szilárd Páll
On Mon, Apr 8, 2013 at 1:37 PM, Justin Lemkul jalem...@vt.edu wrote: On Mon, Apr 8, 2013 at 2:28 AM, Hrachya Astsatryan hr...@sci.am wrote: Dear all, We have installed the latest version of Gromacs (version 4.6) on our cluster by the following step: * cmake .. -DGMX_MPI=ON

Re: [gmx-users] gmx 4.6 mpi installation through openmpi?

2013-04-05 Thread Szilárd Páll
Hi, As the error message states, the reason for the failed configuration is that CMake can't auto-detect MPI which is needed when you are not providing the MPI compiler wrapper as compiler. If you want to build with MPI you can either let CMake auto-detect MPI and just compile with the C
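
The two routes could look roughly like this (the wrapper names are typical examples, not requirements):

  cmake .. -DGMX_MPI=ON
  cmake .. -DGMX_MPI=ON -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx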

Re: [gmx-users] About the configuration of Gromacs on multiple nodes with GPU

2013-03-30 Thread Szilárd Páll
Hi, You can certainly use your hardware setup. I assume you've been looking at the log/console output based on which it might seem that mdrun is only using the GPUs in the first (=master) node. However, that is not the case, it's just that the current hardware and launch configuration reporting

Re: [gmx-users] no CUDA-capable device is detected

2013-03-28 Thread Szilárd Páll
Hi, If mdrun says that it could not detect GPUs it simply means that the GPU enumeration found no GPUs, otherwise it would have printed what was found. This is rather strange because mdrun uses the same mechanism the deviceQuery SDK example. I really don't have a good idea what could be the

Re: [gmx-users] no CUDA-capable device is detected

2013-03-28 Thread Szilárd Páll
On Thu, Mar 28, 2013 at 4:26 PM, Chandan Choudhury iitd...@gmail.com wrote: On Thu, Mar 28, 2013 at 4:09 PM, Szilárd Páll szilard.p...@cbr.su.se wrote: Hi, If mdrun says that it could not detect GPUs it simply means that the GPU enumeration found no GPUs, otherwise it would have printed

Re: [gmx-users] Mismatching number of PP MPI processes and GPUs per node

2013-03-22 Thread Szilárd Páll
Hi, Actually, if you don't want to run across the network, with those Westmere processors you should be fine with running OpenMP across the two sockets, i.e mdrun -ntomp 24 or to run without HyperThreading (which can be sometimes faster) just use mdrun -ntomp 12 -pin on Now, when it comes to GPU

Re: [gmx-users] Installing GROMACS4.6.1 on Intel MIC

2013-03-21 Thread Szilárd Páll
FYI: As much as Intel likes to say that you can just run MPI/MPI+OpenMP code on MIC, you will probably not be impressed with the performance (it will be *much* slower than a Xeon CPU). If you want to know why and what/when are we doing something about it, please read my earlier comments on MIC

Re: [gmx-users] cuda gpu status on mdrun

2013-03-21 Thread Szilárd Páll
Hi Quentin, That's just a way of saying that something is wrong with either of the following (in order of possibility of the event): - your GPU driver is too old, hence incompatible with your CUDA version; - your GPU driver installation is broken; - your GPU is behaving in an unexpected/strange

Re: [gmx-users] Mismatching number of PP MPI processes and GPUs per node

2013-03-21 Thread Szilárd Páll
FYI: On your machine, running OpenMP across two sockets will probably not be very efficient. Depending on the input and on how high a parallelization you are running, you could be better off running multiple MPI ranks per GPU. This is a bit of an unexplained feature due to it being complicated to
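
As an illustration (rank, thread, and GPU counts are assumed), several ranks can share a GPU by repeating its id in the mapping string:

  mdrun -ntmpi 4 -ntomp 6 -gpu_id 0011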

Re: [gmx-users] Gromacs with Intel Xeon Phi coprocessors ?

2013-03-12 Thread Szilárd Páll
Hi Chris, You should be able to run on MIC/Xeon Phi as these accelerators, when used in symmetric mode, behave just like a compute node. However, for two main reasons the performance will be quite bad: - no SIMD accelerated kernels for MIC; - no accelerator-specific parallelization implemented

Re: [gmx-users] Performance of 4.6.1 vs. 4.5.5

2013-03-09 Thread Szilárd Páll
As Mark said, we need concrete details to answer the question: - log files (all four of them: 1/2 nodes, 4.5/4.6) - hardware (CPUs, network) - compilers The 4.6 log files contain much of the second and third point except the network. Note that you can compare the performance summary table's

Re: [gmx-users] Thread affinity setting failed

2013-03-08 Thread Szilárd Páll
that it's because it's almost entirely water and hence probably benefits from the Group scheme optimizations for water described on the Gromacs website. Thanks again for the explanation, Reid On Mon, Mar 4, 2013 at 3:45 PM, Szilárd Páll szilard.p...@cbr.su.se wrote: Hi

Re: [gmx-users] Re: [gmx-developers] Gromacs 4.6.1 ATLAS/CUDA detection problems...

2013-03-07 Thread Szilárd Páll
On Thu, Mar 7, 2013 at 2:02 PM, Berk Hess g...@hotmail.com wrote: Hi, This was only a note, not a fix. I was just trying to say that what linear algebra library you use for Gromacs is irrelevant in more than 99% of the cases. But having said that, the choice of library should not
