On Sat, Jul 26, 2014 at 1:53 PM, Mark Abraham <mark.j.abra...@gmail.com> wrote:
> On Sat, Jul 26, 2014 at 7:35 PM, Seyyed Mohtadin Hashemi <haa...@gmail.com>
> wrote:
>
> > On Jul 26, 2014 4:52 AM, "Mark Abraham" <mark.j.abra...@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > That is indeed very weird - particularly if compiling on the compute nodes
> > > with GPU support enabled gives the same result. Both host and compute nodes
> > > support rdtscp, so that known suspect is OK. I can only guess that there's
> > > something in the CUDA installation process that targets the CPU on the
> > > install host. Configuring with -DCMAKE_BUILD_TYPE=Debug and getting a stack
> > > trace from the crash might help work out where the problem arises.
> >
> > Will get back with the stack trace a bit later; is "gdb bt full" ok?
>
> Probably.
>
> Mark
>
> > Or do you want "thread info" as well?
> >
> > > Doing a CUDA install on a compute node and compiling against that might
> > > help.
> >
> > You mean install CUDA SDK on the worker nodes? If so, this is already done
> > and gives same result. I will ask the admin about the configuration of CUDA
> > on the worker nodes.
> >
> > > Mark
> > >
> > > On Fri, Jul 25, 2014 at 10:00 PM, Seyyed Mohtadin Hashemi
> > > <haa...@gmail.com> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I'm having a very weird problem with GROMACS 4.6.6:
> > > >
> > > > I am currently testing out GPU capabilities and was trying to compile
> > > > GROMACS with CUDA (v6.0). I can not make this work if I compile GROMACS
> > > > with SIMD, no matter what kernel I choose - I have tried everything from
> > > > SSE2 to AVX_256.
> > > >
> > > > The log-in node, where I compile, has AMD Interlagos CPUs (worker nodes
> > > > use Xeon E5-2630 and are equipped with Tesla K20), but I do not think
> > > > this is the problem - I have compiled GROMACS, using the log-in node,
> > > > without CUDA but with AVX_256 SIMD and everything works. As soon as CUDA
> > > > is added to the mix, I get "Illegal Instruction" every time I try to run
> > > > on the worker nodes.
> > > >
> > > > Compiling on worker nodes gives the same result. However, as soon as I
> > > > set SIMD=None everything works and I am able to run simulation using
> > > > GPUs, this is regardless of if I use log-in node or worker node to
> > > > compile.
> > > >
> > > > The cmake string used to configure is:
> > > > ccmake .. -DCMAKE_INSTALL_PREFIX=/work/gromacs4gpu -DGMX_DOUBLE=OFF
> > > > -DGMX_DEFAULT_SUFFIX=OFF -DGMX_BINARY_SUFFIX=_4gpu
> > > > -DGMX_LIBS_SUFFIX=_4gpu -DGMX_GPU=ON -DBUILD_SHARED_LIBS=OFF
> > > > -DGMX_PREFER_STATIC_LIBS=ON -GMX_MPI=OFF -DGMX_CPU_ACCELERATION=AVX_256
> > > >
> > > > CUDA v6.0 and FFTW v3.3.4 (single precision) libs are set globally and
> > > > correctly identified by GROMACS. To remove OpenMPI as a problem I am
> > > > compiling without it (compiling with OpenMPI produced the same behavior
> > > > as without), once I have found the error I will compile with OpenMPI
> > > > v1.6.5.
> > > >
> > > > I get these warnings during the configuration, nothing important:
> > > >
> > > > A BLAS library was not found by CMake in the paths available to it.
> > > > Falling back on the GROMACS internal version of the BLAS library
> > > > instead. This is fine for normal usage.
> > > >
> > > > A LAPACK library was not found by CMake in the paths available to it.
> > > > Falling back on the GROMACS internal version of the LAPACK library
> > > > instead. This is fine for normal usage.
> > > >
> > > > I am currently trying to compile and test GROMACS 5.0 to see if it also
> > > > exhibits the same behavior.
> > > >
> > > > I hope that someone can point me in the direction of a possible
> > > > solution, if not then I will file a bug report.
> > > >
> > > > Regards,
> > > > Mohtadin
> > > > --
> > > > Gromacs Users mailing list
> > > >
> > > > * Please search the archive at
> > > > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > > > posting!
> > > >
> > > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > > >
> > > > * For (un)subscribe requests visit
> > > > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > > > send a mail to gmx-users-requ...@gromacs.org.

A possible, but very strange, solution was found over the weekend.

To find out which step causes the problem, I compiled single-precision GROMACS v5.0 (using the Debug profile) with no GPU support, no SIMD, and no MPI. As expected, everything worked.
I then compiled mdrun only, with GPU support and the Debug profile (but still no SIMD and no MPI); again everything worked. The next step was to compile a new mdrun with GPU and SIMD (still no MPI) - it worked! I tried SSE2, SSE4.1, and AVX_256 - all of them work. As the last step I added MPI, and again everything works!

So I went back and made a new build with all options (i.e. AVX_256, MPI, and GPU), still using the Debug profile - and lo and behold, everything works. However, if I configure and compile using the Release profile, nothing works.

(To be sure that I did not have a corrupt package, I re-downloaded it; the MD5 sum matched the sum on the website.)

I hope this helps narrow down what is wrong. At least I now have a working system that I can run some tests on.
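
For reference, the sequence of builds described above can be sketched as a small dry-run script. This is only a sketch under assumptions: I am assuming the GROMACS 5.0 option name GMX_SIMD (4.6 used GMX_CPU_ACCELERATION), the helper configure_cmd is a made-up name for illustration, the exact install prefix and suffix options are left out, and the script only prints the cmake lines rather than running them.

```shell
#!/bin/sh
# Dry run: print the cmake configure line for each build in the sequence.
# configure_cmd is a hypothetical helper, not part of GROMACS or CMake.
# Assumed GROMACS 5.0 option names: GMX_GPU, GMX_SIMD, GMX_MPI, GMX_DOUBLE.
# Check them against the install guide for the version you are building.

# usage: configure_cmd BUILD_TYPE GPU SIMD MPI
configure_cmd() {
    echo "cmake .. -DCMAKE_BUILD_TYPE=$1 -DGMX_DOUBLE=OFF -DGMX_GPU=$2 -DGMX_SIMD=$3 -DGMX_MPI=$4"
}

configure_cmd Debug   OFF None    OFF   # baseline: worked
configure_cmd Debug   ON  None    OFF   # add GPU: worked
configure_cmd Debug   ON  AVX_256 OFF   # add SIMD: worked
configure_cmd Debug   ON  AVX_256 ON    # add MPI: worked
configure_cmd Release ON  AVX_256 ON    # same options, Release: does not work
```

Only CMAKE_BUILD_TYPE differs between the last two lines.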