Hi Szilárd and Roland,

Thanks for the clear explanation!

I will compile release-4-6 (instead of the nbnxn_hybrid_acc branch) and
do some further testing in a few weeks, since I'm currently using the
machine for production runs with gmx-4.5.5.

Thanks for your time and effort!

regards,

raf

On Thu, 2012-11-22 at 00:23 +0100, Szilárd Páll wrote:
> Roland,
>
> He explicitly stated that he is using 20da718 which is also from the
> nbnxn_hybrid_acc branch.
>
> Raf, as Roland said, get the release-4-6 and try again!
>
> There's an important thing to mention: your hardware configuration is
> probably quite imbalanced and the default settings are certainly not
> the best to run with: two MPI processes/threads with 24 OpenMP threads
> + a GPU each. GROMACS works best with a balanced hardware configuration
> and yours is certainly not balanced; the GPUs will not be able to keep
> up with 64 CPU cores.
>
> Regarding the run configuration: most importantly, in most cases you
> should avoid running a group of OpenMP threads across sockets (except
> on Intel, <=12-16 threads). On these Opterons running OpenMP at most
> on a half CPU is recommended (the CPUs are in reality two CPU dies
> bolted together) and in fact you might be better off with even fewer
> threads per MPI process/thread. This means that multiple processes
> will have to share a GPU, which is not optimal and works only with MPI
> in the current version.
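
A note on the -gpu_id strings in the commands below: as the comments
there describe, the string carries one digit per MPI process, and each
digit is the ID of the GPU that process uses. What follows is only a
minimal shell sketch of how such a string could be generated for a given
process count. The helper name gpu_id_string is made up for illustration
(it is not part of GROMACS), and the sketch assumes that the process
count divides evenly by the number of GPUs and that GPU IDs are single
digits.

```shell
# Hypothetical helper, not part of GROMACS: build a -gpu_id string with
# one digit per MPI process, giving each GPU a consecutive block of
# processes. Assumes nproc is a multiple of ngpu and ngpu <= 10.
gpu_id_string() {
    nproc=$1
    ngpu=$2
    per_gpu=$(( nproc / ngpu ))
    ids=""
    gpu=0
    while [ "$gpu" -lt "$ngpu" ]; do
        i=0
        while [ "$i" -lt "$per_gpu" ]; do
            ids="${ids}${gpu}"      # one digit for each process on this GPU
            i=$(( i + 1 ))
        done
        gpu=$(( gpu + 1 ))
    done
    printf '%s\n' "$ids"
}

# Print (not run) the command lines for 4, 8 and 16 processes on 2 GPUs.
for n in 4 8 16; do
    echo "mpirun -np $n mdrun_mpi -gpu_id $(gpu_id_string "$n" 2)"
done
```

The loop only echoes the command lines, so the length of each string can
be checked against the process count before anything is launched.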
>
> So to conclude, to get the best performance you should try a few
> combinations:
>
> # process 0,1 will use GPU0, process 2,3 GPU1
> # this avoids running across sockets, but for aforementioned reasons
> # it will still be suboptimal
> mpirun -np 4 mdrun_mpi -gpu_id 0011
>
> # process 0,1,2,3 will use GPU0, process 4,5,6,7 GPU1
> # this config will probably still be slower than the next one
> mpirun -np 8 mdrun_mpi -gpu_id 00001111
>
> # process 0,1,2,3,4,5,6,7 will use GPU0, process 8,9,10,11,12,13,14,15
> # GPU1
> # this config will probably still be slower than the next one
> mpirun -np 16 mdrun_mpi -gpu_id 0000000011111111
>
> You should go ahead and try with 32 and 64 processes as well, I
> suspect that 2 or 3 threads/process will be the fastest. Depending on
> what system you are simulating, this could lead to load imbalance, but
> that you'll have to see.
>
> If it turns out that the "Wait for GPU" time is more than a few
> percent (which will probably be the case), it means that a GTX 580 is
> not fast enough for two of these Opterons. What you can try is to run
> using the "hybrid" mode with "-nb gpu_cpu" which might help.
>
> Cheers,
>
> --
> Szilárd
>
> On Sat, Nov 17, 2012 at 3:11 AM, Roland Schulz <[email protected]> wrote:
> Hi Raf,
>
> which version of Gromacs did you use? If you used branch
> nbnxn_hybrid_acc please use branch release-4-6 instead and see whether
> that fixes your issue. If not please open a bug and upload your log
> file and your tpr.
>
> Roland
>
> On Thu, Nov 15, 2012 at 5:13 PM, Raf Ponsaerts <[email protected]> wrote:
> > Hi Szilárd,
> >
> > I assume I get the same segmentation fault error as Sebastian (don't
> > shoot if not so). I have 2 NVIDIA GTX580 cards (and 4x12-core amd64
> > opteron 6174).
> >
> > in brief :
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7fffc07f8700 (LWP 32035)]
> > 0x00007ffff61de301 in nbnxn_make_pairlist.omp_fn.2 ()
> > from /usr/local/gromacs/bin/../lib/libmd.so.6
> >
> > Also -nb cpu with Verlet cutoff-scheme results in this error...
> >
> > gcc 4.4.5 (Debian 4.4.5-8), Linux kernel 3.1.1
> > CMake 2.8.7
> >
> > If I attach the mdrun.debug output file to this mail, the mail to the
> > list gets bounced by the mailserver (because mdrun.debug > 50 Kb).
> >
> > Hoping this might help,
> >
> > regards,
> >
> > raf
> > ===========
> > compiled code :
> > commit 20da7188b18722adcd53088ec30e5f256af62f20
> > Author: Szilard Pall <[email protected]>
> > Date: Tue Oct 2 00:29:33 2012 +0200
> >
> > ===========
> > (gdb) exec mdrun
> > (gdb) run -debug 1 -v -s test.tpr
> >
> > Reading file test.tpr, VERSION 4.6-dev-20121002-20da718 (single precision)
> > [New Thread 0x7ffff3844700 (LWP 31986)]
> > [Thread 0x7ffff3844700 (LWP 31986) exited]
> > [New Thread 0x7ffff3844700 (LWP 31987)]
> > [Thread 0x7ffff3844700 (LWP 31987) exited]
> > Changing nstlist from 10 to 50, rlist from 2 to 2.156
> >
> > Starting 2 tMPI threads
> > [New Thread 0x7ffff3844700 (LWP 31992)]
> > Using 2 MPI threads
> > Using 24 OpenMP threads per tMPI thread
> >
> > 2 GPUs detected:
> > #0: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
> > #1: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
> >
> > 2 GPUs auto-selected to be used for this run: #0, #1
> >
> > Back Off! I just backed up ctab14.xvg to ./#ctab14.xvg.1#
> > Initialized GPU ID #1: GeForce GTX 580
> > [New Thread 0x7ffff3043700 (LWP 31993)]
> >
> > Back Off! I just backed up dtab14.xvg to ./#dtab14.xvg.1#
> >
> > Back Off!
I just backed up rtab14.xvg to ./#rtab14.xvg.1# > > [New Thread 0x7ffff1b3c700 (LWP 31995)] > > [New Thread 0x7ffff133b700 (LWP 31996)] > > [New Thread 0x7ffff0b3a700 (LWP 31997)] > > [New Thread 0x7fffebfff700 (LWP 31998)] > > [New Thread 0x7fffeb7fe700 (LWP 31999)] > > [New Thread 0x7fffeaffd700 (LWP 32000)] > > [New Thread 0x7fffea7fc700 (LWP 32001)] > > [New Thread 0x7fffe9ffb700 (LWP 32002)] > > [New Thread 0x7fffe97fa700 (LWP 32003)] > > [New Thread 0x7fffe8ff9700 (LWP 32004)] > > [New Thread 0x7fffe87f8700 (LWP 32005)] > > [New Thread 0x7fffe7ff7700 (LWP 32006)] > > [New Thread 0x7fffe77f6700 (LWP 32007)] > > [New Thread 0x7fffe6ff5700 (LWP 32008)] > > [New Thread 0x7fffe67f4700 (LWP 32009)] > > [New Thread 0x7fffe5ff3700 (LWP 32010)] > > [New Thread 0x7fffe57f2700 (LWP 32011)] > > [New Thread 0x7fffe4ff1700 (LWP 32012)] > > [New Thread 0x7fffe47f0700 (LWP 32013)] > > [New Thread 0x7fffe3fef700 (LWP 32014)] > > [New Thread 0x7fffe37ee700 (LWP 32015)] > > [New Thread 0x7fffe2fed700 (LWP 32016)] > > [New Thread 0x7fffe27ec700 (LWP 32017)] > > Initialized GPU ID #0: GeForce GTX 580 > > > Using CUDA 8x8x8 non-bonded kernels > > > [New Thread 0x7fffe1feb700 (LWP 32018)] > > [New Thread 0x7fffe0ae4700 (LWP 32019)] > > [New Thread 0x7fffcbfff700 (LWP 32020)] > > [New Thread 0x7fffcb7fe700 (LWP 32021)] > > [New Thread 0x7fffcaffd700 (LWP 32022)] > > [New Thread 0x7fffca7fc700 (LWP 32023)] > > [New Thread 0x7fffc9ffb700 (LWP 32024)] > > [New Thread 0x7fffc97fa700 (LWP 32025)] > > [New Thread 0x7fffc8ff9700 (LWP 32026)] > > [New Thread 0x7fffc3fff700 (LWP 32027)] > > [New Thread 0x7fffc37fe700 (LWP 32028)] > > [New Thread 0x7fffc2ffd700 (LWP 32029)] > > [New Thread 0x7fffc27fc700 (LWP 32031)] > > [New Thread 0x7fffc1ffb700 (LWP 32032)] > > [New Thread 0x7fffc17fa700 (LWP 32033)] > > [New Thread 0x7fffc0ff9700 (LWP 32034)] > > [New Thread 0x7fffc07f8700 (LWP 32035)] > > [New Thread 0x7fffbfff7700 (LWP 32036)] > > [New Thread 0x7fffbf7f6700 (LWP 32037)] > > [New Thread 
0x7fffbeff5700 (LWP 32038)]
> > [New Thread 0x7fffbe7f4700 (LWP 32039)]
> > [New Thread 0x7fffbdff3700 (LWP 32040)]
> > [New Thread 0x7fffbd7f2700 (LWP 32042)]
> > [New Thread 0x7fffbcff1700 (LWP 32043)]
> > Making 1D domain decomposition 2 x 1 x 1
> >
> > * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
> > We have just committed the new CPU detection code in this branch,
> > and will commit new SSE/AVX kernels in a few days. However, this
> > means that currently only the NxN kernels are accelerated!
> > In the mean time, you might want to avoid production runs in 4.6.
> >
> > Back Off! I just backed up traj.trr to ./#traj.trr.1#
> >
> > Back Off! I just backed up traj.xtc to ./#traj.xtc.1#
> >
> > Back Off! I just backed up ener.edr to ./#ener.edr.1#
> > starting mdrun 'Protein in water'
> > 100000 steps, 200.0 ps.
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7fffc07f8700 (LWP 32035)]
> > 0x00007ffff61de301 in nbnxn_make_pairlist.omp_fn.2 ()
> > from /usr/local/gromacs/bin/../lib/libmd.so.6
> > (gdb)
> >
> > ============================================
> > Verlet, nb by cpu only:
> >
> > (gdb) run -debug 1 -nb cpu -v -s test.tpr
> >
> > Reading file test.tpr, VERSION 4.6-dev-20121002-20da718 (single precision)
> > [New Thread 0x7ffff3844700 (LWP 32050)]
> > [Thread 0x7ffff3844700 (LWP 32050) exited]
> > [New Thread 0x7ffff3844700 (LWP 32051)]
> > [Thread 0x7ffff3844700 (LWP 32051) exited]
> > Starting 48 tMPI threads
> > [New Thread 0x7ffff3844700 (LWP 32058)]
> > [New Thread 0x7ffff3043700 (LWP 32059)]
> > [New Thread 0x7ffff2842700 (LWP 32060)]
> > [New Thread 0x7ffff2041700 (LWP 32061)]
> > [New Thread 0x7ffff1840700 (LWP 32062)]
> > [New Thread 0x7ffff103f700 (LWP 32063)]
> > [New Thread 0x7ffff083e700 (LWP 32064)]
> > [New Thread 0x7fffe3fff700 (LWP 32065)]
> > [New Thread 0x7fffe37fe700 (LWP 32066)]
> > [New Thread 0x7fffe2ffd700 (LWP 32067)]
> > [New Thread 0x7fffe27fc700
(LWP 32068)] > > [New Thread 0x7fffe1ffb700 (LWP 32069)] > > [New Thread 0x7fffe17fa700 (LWP 32070)] > > [New Thread 0x7fffe0ff9700 (LWP 32071)] > > [New Thread 0x7fffdbfff700 (LWP 32072)] > > [New Thread 0x7fffdb7fe700 (LWP 32073)] > > [New Thread 0x7fffdaffd700 (LWP 32074)] > > [New Thread 0x7fffda7fc700 (LWP 32075)] > > [New Thread 0x7fffd9ffb700 (LWP 32076)] > > [New Thread 0x7fffd97fa700 (LWP 32077)] > > [New Thread 0x7fffd8ff9700 (LWP 32078)] > > [New Thread 0x7fffd3fff700 (LWP 32079)] > > [New Thread 0x7fffd37fe700 (LWP 32080)] > > [New Thread 0x7fffd2ffd700 (LWP 32081)] > > [New Thread 0x7fffd27fc700 (LWP 32082)] > > [New Thread 0x7fffd1ffb700 (LWP 32083)] > > [New Thread 0x7fffd17fa700 (LWP 32084)] > > [New Thread 0x7fffd0ff9700 (LWP 32085)] > > [New Thread 0x7fffd07f8700 (LWP 32086)] > > [New Thread 0x7fffcfff7700 (LWP 32087)] > > [New Thread 0x7fffcf7f6700 (LWP 32088)] > > [New Thread 0x7fffceff5700 (LWP 32089)] > > [New Thread 0x7fffce7f4700 (LWP 32090)] > > [New Thread 0x7fffcdff3700 (LWP 32091)] > > [New Thread 0x7fffcd7f2700 (LWP 32092)] > > [New Thread 0x7fffccff1700 (LWP 32093)] > > [New Thread 0x7fffcc7f0700 (LWP 32094)] > > [New Thread 0x7fffcbfef700 (LWP 32095)] > > [New Thread 0x7fffcb7ee700 (LWP 32096)] > > [New Thread 0x7fffcafed700 (LWP 32097)] > > [New Thread 0x7fffca7ec700 (LWP 32098)] > > [New Thread 0x7fffc9feb700 (LWP 32099)] > > [New Thread 0x7fffc97ea700 (LWP 32100)] > > [New Thread 0x7fffc8fe9700 (LWP 32101)] > > [New Thread 0x7fffc87e8700 (LWP 32102)] > > [New Thread 0x7fffc7fe7700 (LWP 32103)] > > [New Thread 0x7fffc77e6700 (LWP 32104)] > > > > Will use 45 particle-particle and 3 PME only nodes > > This is a guess, check the performance at the end of the log > file > > Using 48 MPI threads > > > Using 1 OpenMP thread per tMPI thread > > > > 2 GPUs detected: > > #0: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, > stat: > > compatible > > #1: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, > stat: > > compatible > > > > > > 
Back Off! I just backed up ctab14.xvg to ./#ctab14.xvg.2#
> >
> > Back Off! I just backed up dtab14.xvg to ./#dtab14.xvg.2#
> >
> > Back Off! I just backed up rtab14.xvg to ./#rtab14.xvg.2#
> > Using SSE2 4x4 non-bonded kernels
> > Making 3D domain decomposition 3 x 5 x 3
> >
> > * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
> > We have just committed the new CPU detection code in this branch,
> > and will commit new SSE/AVX kernels in a few days. However, this
> > means that currently only the NxN kernels are accelerated!
> > In the mean time, you might want to avoid production runs in 4.6.
> >
> > Back Off! I just backed up traj.trr to ./#traj.trr.2#
> >
> > Back Off! I just backed up traj.xtc to ./#traj.xtc.2#
> >
> > Back Off! I just backed up ener.edr to ./#ener.edr.2#
> > starting mdrun 'Protein in water'
> > 100000 steps, 200.0 ps.
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7fffcd7f2700 (LWP 32092)]
> > 0x00007ffff61db499 in nbnxn_make_pairlist.omp_fn.2 ()
> > from /usr/local/gromacs/bin/../lib/libmd.so.6
> > (gdb)
> > =============================================
> >
> > On Mon, 2012-11-12 at 19:37 +0100, Szilárd Páll wrote:
> > > Hi Sebastian,
> > >
> > > That is very likely a bug so I'd appreciate if you could provide a
> > > bit more information, like:
> > >
> > > - OS, compiler
> > >
> > > - results of runs with the following configurations:
> > >   - "mdrun -nb cpu" (to run CPU-only with Verlet scheme)
> > >   - "GMX_EMULATE_GPU=1 mdrun -nb gpu" (to run GPU emulation using
> > >     plain C kernels);
> > >   - "mdrun" without any arguments (which will use 2x(n/2 cores + 1
> > >     GPU))
> > >   - "mdrun -ntmpi 1" without any other arguments (which will use n
> > >     cores + the first GPU)
> > >
> > > - please attach the log files of all failed and a successful run as
> > >   well as the mdrun.debug file from a failed run that you can
> > >   obtain with "mdrun -debug 1"
> > >
> > > Note that a backtrace would be very useful and if you can get one
> > > I'd be grateful, but for now the above should be minimum effort and
> > > I'll provide simple instructions to get a backtrace later (if
> > > needed).
> > >
> > > Thanks,
> > >
> > > --
> > > Szilárd
> > >
> > > On Mon, Nov 12, 2012 at 6:22 PM, sebastian <[email protected]> wrote:
> > > > On 11/12/2012 04:12 PM, sebastian wrote:
> > > > > Dear GROMACS user,
> > > > >
> > > > > I am running into major problems trying to use gromacs 4.6 on my
> > > > > desktop with two GTX 670 GPUs and one i7 cpu. On the system I
> > > > > installed the CUDA 4.2, running fine for many different test
> > > > > programs.
> > > > > Compiling the git version of gromacs 4.6 with hybrid
> > > > > acceleration I get one error message of a missing libxml2 but it
> > > > > compiles with no further complaints. The tools I tested (like
> > > > > g_rdf or grompp etc.) work fine as long as I generate the tpr
> > > > > files with the right gromacs version.
> > > > > Now, if I try to use mdrun (GMX_GPU_ID=1 mdrun -nt 1 -v
> > > > > -deffnm ....) the preparation seems to work fine until it starts
> > > > > the actual run.
> > > > > It stops with a segmentation fault:
> > > > >
> > > > > Reading file pdz_cis_ex_200ns_test.tpr, VERSION
> > > > > 4.6-dev-20121002-20da718-dirty (single precision)
> > > > >
> > > > > Using 1 MPI thread
> > > > >
> > > > > Using 1 OpenMP thread
> > > > >
> > > > > 2 GPUs detected:
> > > > >
> > > > > #0: NVIDIA GeForce GTX 670, compute cap.: 3.0, ECC: no, stat: compatible
> > > > >
> > > > > #1: NVIDIA GeForce GTX 670, compute cap.: 3.0, ECC: no, stat: compatible
> > > > >
> > > > > 1 GPU user-selected to be used for this run: #1
> > > > >
> > > > > Using CUDA 8x8x8 non-bonded kernels
> > > > >
> > > > > * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
> > > > >
> > > > > We have just committed the new CPU detection code in this branch,
> > > > >
> > > > > and will commit new SSE/AVX kernels in a few days. However, this
> > > > >
> > > > > means that currently only the NxN kernels are accelerated!
> > > >
> > > > Since it does run on a pure CPU run (without the Verlet cut-off
> > > > scheme), does it maybe help to change the NxN kernels manually in
> > > > the .mdp file (how can I do so)? Or is there something wrong using
> > > > the CUDA 4.2 version or whatsoever. The libxml2 should not be a
> > > > problem since the pure CPU run works.
> > > >
> > > > > In the mean time, you might want to avoid production runs in 4.6.
> > > > >
> > > > > Back Off! I just backed up pdz_cis_ex_200ns_test.trr to
> > > > > ./#pdz_cis_ex_200ns_test.trr.4#
> > > > >
> > > > > Back Off! I just backed up pdz_cis_ex_200ns_test.xtc to
> > > > > ./#pdz_cis_ex_200ns_test.xtc.4#
> > > > >
> > > > > Back Off! I just backed up pdz_cis_ex_200ns_test.edr to
> > > > > ./#pdz_cis_ex_200ns_test.edr.4#
> > > > >
> > > > > starting mdrun 'Protein in water'
> > > > >
> > > > > 3500000 steps, 7000.0 ps.
> > > > >
> > > > > Segmentation fault
> > > > >
> > > > > Since I have no idea whats going wrong any help is welcomed.
> > > > > Attached you find the log file.
> > > >
> > > > Help is really appreciated since I want to use my new desktop
> > > > including the GPU's
> > > >
> > > > > Thanks a lot
> > > > >
> > > > > Sebastian
> > > >
> > > > --
> > > > gmx-users mailing list [email protected]
> > > > http://lists.gromacs.org/mailman/listinfo/gmx-users
> > > > * Please search the archive at
> > > > http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> > > > * Please don't post (un)subscribe requests to the list. Use the
> > > > www interface or send it to [email protected].
> > > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> --
> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
> 865-241-1537, ORNL PO BOX 2008 MS6309

