Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-13 Thread Alex
Just a quick update here regarding the regression tests. On an old machine
with a single puny GTX 960, the 2018 build passes all tests, with:

Start  9: GpuUtilsUnitTests
 9/39 Test  #9: GpuUtilsUnitTests    Passed    5.64 sec
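
For anyone wanting to reproduce just this test, a minimal sketch from the
build directory (assuming a standard CMake build tree; -R and
--output-on-failure are stock ctest options):

  # run only the GPU utilities unit tests, printing their output on failure
  ctest -R GpuUtilsUnitTests --output-on-failure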

Hope this is useful.

Alex


Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-09 Thread Szilárd Páll
Great to hear!

(Also note that one thing we have explicitly focused on is not only peak
performance, but also getting as close to peak as possible with just a few
CPU cores! You should be able to get >75% of peak performance with just 3-5
Xeon or 2-3 desktop cores, rather than needing a full, fast CPU.)
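
As a rough illustration (a sketch only; topol.tpr is a hypothetical input,
and -ntomp/-nb/-pme are standard mdrun options):

  # offload nonbonded and PME work to the GPU, using just 4 OpenMP threads
  gmx mdrun -s topol.tpr -ntomp 4 -nb gpu -pme gpu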

--
Szilárd

On Thu, Feb 8, 2018 at 8:44 PM, Alex  wrote:

> With -pme gpu, I am reporting 383.032 ns/day vs 270 ns/day with the 2016.4
> version. I _did not_ mistype. The system is close to a cubic box of water
> with some ions.
>
> Incredible.
>
> Alex
>

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex
With -pme gpu, I am reporting 383.032 ns/day vs 270 ns/day with the 2016.4
version. I _did not_ mistype. The system is close to a cubic box of water
with some ions.

Incredible.

Alex

On Thu, Feb 8, 2018 at 12:27 PM, Szilárd Páll 
wrote:

> Note that the actual mdrun performance need not be affected, whether it's
> a driver persistence issue (you'll just see a few seconds' lag at mdrun
> startup) or some other CUDA application startup-related lag (an mdrun run
> does mostly very different kinds of things than this particular set of
> unit tests).
>
> --
> Szilárd
>
>

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Szilárd Páll
Note that the actual mdrun performance need not be affected, whether it's a
driver persistence issue (you'll just see a few seconds' lag at mdrun
startup) or some other CUDA application startup-related lag (an mdrun run
does mostly very different kinds of things than this particular set of unit
tests).

--
Szilárd

On Thu, Feb 8, 2018 at 7:40 PM, Alex  wrote:

>  I keep getting bounce messages from the list, so in case things didn't get
> posted...
>
> 1. We enabled PM -- still times out.
> 2. 3-4 days ago we had very fast runs with GPU (2016.4), so I don't know if
> we miraculously broke everything to the point where our $25K box performs
> worse than Mark's laptop. That in itself might be publishable...
> 3. I will run tests on a system for which I know the performance with
> 2016.4, so we can compare, especially with -pme gpu.
>
> Thanks,
>
> Alex

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex
 I keep getting bounce messages from the list, so in case things didn't get
posted...

1. We enabled PM -- still times out.
2. 3-4 days ago we had very fast runs with GPU (2016.4), so I don't know if
we miraculously broke everything to the point where our $25K box performs
worse than Mark's laptop. That in itself might be publishable...
3. I will run tests on a system for which I know the performance with
2016.4, so we can compare, especially with -pme gpu (a command sketch
follows below).
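
Something along these lines (a sketch only; bench.tpr stands in for the
known system, and -nsteps/-resethway are standard mdrun benchmarking
options):

  # short benchmark run; reset timers halfway to exclude startup costs
  gmx mdrun -s bench.tpr -nsteps 20000 -resethway -pme gpu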

Thanks,

Alex


Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Mark Abraham
On Thu, Feb 8, 2018 at 6:54 PM Szilárd Páll  wrote:

> BTW, do you have persistence mode (PM) set (see the nvidia-smi output)? If
> you do not have PM set, and there is no X server keeping the driver loaded,
> the driver gets loaded every time a CUDA application is started. This could
> be causing the lag, which shows up as long execution times for our rather
> simple unit tests; these should take milliseconds rather than seconds when
> PM is on (or X is running).
>

The earlier report of bin/gpu_utils-test showed that many tests were slow,
not just at startup; see the full output quoted below.

Mark

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex
Got it. Given all the messing around, I am rebuilding GMX, and if the make
check results are the same, I will install it. We have an angry postdoc
here demanding tools.

Thank you gentlemen.

Alex

On Thu, Feb 8, 2018 at 10:50 AM, Szilárd Páll 
wrote:

> On Thu, Feb 8, 2018 at 6:46 PM, Alex  wrote:
>
> > Are you suggesting that I should accept these results and install the
> > 2018 version?
> >
>
> Yes, your GROMACS build seems fine.
>
> make check simply runs the test that I suggested you run manually (and
> which finished successfully). The 30 s timeout on CMake tests interrupts
> this set of unit tests, given your unusually long execution time, which is
> the reason for the failure.
>

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Szilárd Páll
BTW, do you have persistence mode (PM) set (see the nvidia-smi output)? If
you do not have PM set, and there is no X server keeping the driver loaded,
the driver gets loaded every time a CUDA application is started. This could
be causing the lag, which shows up as long execution times for our rather
simple unit tests; these should take milliseconds rather than seconds when
PM is on (or X is running).
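
A minimal sketch of checking and enabling it (assuming root access; -q and
-pm are stock nvidia-smi options):

  # query the current persistence mode of all GPUs
  nvidia-smi -q | grep -i "persistence mode"
  # enable persistence mode (requires root)
  sudo nvidia-smi -pm 1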

--
Szilárd


Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Mark Abraham
Hi,

Assuming the other test binary shows the same behaviour (succeeds when run
manually), the build is working correctly and you could install it for
general use. But I suspect its performance will suffer from whatever is
causing the slowdown (e.g. compare with old numbers). That's not really a
topic for this list, however.
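
For completeness, installing from the build directory is the stock CMake
step, e.g.:

  # install to the prefix chosen at configure time (sudo only if needed)
  sudo make install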

Mark

On Thu, Feb 8, 2018 at 6:47 PM Alex  wrote:

> Are you suggesting that I should accept these results and install the 2018
> version?
>
> Thanks,
>
> Alex
>

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Szilárd Páll
On Thu, Feb 8, 2018 at 6:46 PM, Alex  wrote:

> Are you suggesting that I should accept these results and install the 2018
> version?
>

Yes, your GROMACS build seems fine.

make check simply runs the test that I suggested you run manually (and
which finished successfully). The 30 s timeout on CMake tests interrupts
this set of unit tests, given your unusually long execution time, which is
the reason for the failure.
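
If only the timeout is the problem, two stock workarounds are to run the
binary directly or to raise the per-test timeout (the binary path is the
one used earlier in this thread):

  # run the test binary by hand, with no CTest timeout
  bin/gpu_utils-test
  # or rerun it through ctest with a larger per-test timeout
  ctest -R GpuUtilsUnitTests --timeout 300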



Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex
Are you suggesting that I should accept these results and install the 2018
version?

Thanks,

Alex

On Thu, Feb 8, 2018 at 10:43 AM, Mark Abraham 
wrote:

> Hi,
>
> PATH doesn't matter, only what ldd thinks matters.
>
> I have opened https://redmine.gromacs.org/issues/2405 to address the fact
> that the implementation of these tests is perhaps proving more painful
> than useful (from this thread and others I have seen).
>
> Mark

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Mark Abraham
Hi,

PATH doesn't matter, only what ldd thinks matters.
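
For instance (a sketch; ldd lists the libraries the binary will actually
load, independent of PATH, and the binary path is the one from this
thread):

  # show which shared libraries, e.g. the CUDA runtime, the binary resolves
  ldd bin/gpu_utils-test | grep -i cuda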

I have opened https://redmine.gromacs.org/issues/2405 to address the fact
that the implementation of these tests is perhaps proving more painful than
useful (from this thread and others I have seen).

Mark

On Thu, Feb 8, 2018 at 6:41 PM Alex  wrote:

> That is quite weird. We found that I have PATH values pointing to the old
> gmx installation while running these tests. Do you think that could cause
> issues?
>
> Alex
>

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex
That is quite weird. We found that I have PATH values pointing to the old
gmx installation while running these tests. Do you think that could cause
issues?

Alex

On Thu, Feb 8, 2018 at 10:36 AM, Mark Abraham 
wrote:

> Hi,
>
> Great. The manual run took 74.5 seconds, failing the 30 second timeout. So
> the code is fine.
>
> But you have some crazy large overhead going on - gpu_utils-test runs in 7s
> on my 2013 desktop with CUDA 9.1.
>
> Mark
>

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Mark Abraham
Hi,

Great. The manual run took 74.5 seconds, failing the 30 second timeout. So
the code is fine.

But you have some crazy large overhead going on - gpu_utils-test runs in 7s
on my 2013 desktop with CUDA 9.1.
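
A quick sketch for pinning down that overhead is simply to time the manual
run:

  # wall-clock the test binary and compare against the ~7 s reference above
  time bin/gpu_utils-test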

Mark

On Thu, Feb 8, 2018 at 6:29 PM Alex  wrote:

> uh, no sir.
>
> >  9/39 Test  #9: GpuUtilsUnitTests ***Timeout  30.43 sec
>
>

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex
uh, no sir.

>  9/39 Test  #9: GpuUtilsUnitTests ***Timeout  30.43 sec


On Thu, Feb 8, 2018 at 10:25 AM, Mark Abraham 
wrote:

> Hi,
>
> Those all succeeded. Does make check now also succeed?
>
> Mark
>

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Mark Abraham
Hi,

Those all succeeded. Does make check now also succeed?

Mark

On Thu, Feb 8, 2018 at 6:24 PM Alex  wrote:

> Here you are:
>
> [==] Running 35 tests from 7 test cases.
> [...]

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex
Here you are:

[==] Running 35 tests from 7 test cases.
[--] Global test environment set-up.
[--] 7 tests from HostAllocatorTest/0, where TypeParam = int
[ RUN  ] HostAllocatorTest/0.EmptyMemoryAlwaysWorks
[   OK ] HostAllocatorTest/0.EmptyMemoryAlwaysWorks (5457 ms)
[ RUN  ] HostAllocatorTest/0.VectorsWithDefaultHostAllocatorAlwaysWorks
[   OK ] HostAllocatorTest/0.VectorsWithDefaultHostAllocatorAlwaysWorks
(2861 ms)
[ RUN  ] HostAllocatorTest/0.TransfersWithoutPinningWork
[   OK ] HostAllocatorTest/0.TransfersWithoutPinningWork (3254 ms)
[ RUN  ] HostAllocatorTest/0.FillInputAlsoWorksAfterCallingReserve
[   OK ] HostAllocatorTest/0.FillInputAlsoWorksAfterCallingReserve
(2221 ms)
[ RUN  ] HostAllocatorTest/0.TransfersWithPinningWorkWithCuda
[   OK ] HostAllocatorTest/0.TransfersWithPinningWorkWithCuda (3801 ms)
[ RUN  ] HostAllocatorTest/0.ManualPinningOperationsWorkWithCuda
[   OK ] HostAllocatorTest/0.ManualPinningOperationsWorkWithCuda (2157
ms)
[ RUN  ] HostAllocatorTest/0.StatefulAllocatorUsesMemory
[   OK ] HostAllocatorTest/0.StatefulAllocatorUsesMemory (2179 ms)
[--] 7 tests from HostAllocatorTest/0 (21930 ms total)

[--] 7 tests from HostAllocatorTest/1, where TypeParam = float
[ RUN  ] HostAllocatorTest/1.EmptyMemoryAlwaysWorks
[   OK ] HostAllocatorTest/1.EmptyMemoryAlwaysWorks (2739 ms)
[ RUN  ] HostAllocatorTest/1.VectorsWithDefaultHostAllocatorAlwaysWorks
[   OK ] HostAllocatorTest/1.VectorsWithDefaultHostAllocatorAlwaysWorks
(2731 ms)
[ RUN  ] HostAllocatorTest/1.TransfersWithoutPinningWork
[   OK ] HostAllocatorTest/1.TransfersWithoutPinningWork (3281 ms)
[ RUN  ] HostAllocatorTest/1.FillInputAlsoWorksAfterCallingReserve
[   OK ] HostAllocatorTest/1.FillInputAlsoWorksAfterCallingReserve
(2164 ms)
[ RUN  ] HostAllocatorTest/1.TransfersWithPinningWorkWithCuda
[   OK ] HostAllocatorTest/1.TransfersWithPinningWorkWithCuda (3908 ms)
[ RUN  ] HostAllocatorTest/1.ManualPinningOperationsWorkWithCuda
[   OK ] HostAllocatorTest/1.ManualPinningOperationsWorkWithCuda (2202
ms)
[ RUN  ] HostAllocatorTest/1.StatefulAllocatorUsesMemory
[   OK ] HostAllocatorTest/1.StatefulAllocatorUsesMemory (2261 ms)
[--] 7 tests from HostAllocatorTest/1 (19287 ms total)

[--] 7 tests from HostAllocatorTest/2, where TypeParam =
gmx::BasicVector
[ RUN  ] HostAllocatorTest/2.EmptyMemoryAlwaysWorks
[   OK ] HostAllocatorTest/2.EmptyMemoryAlwaysWorks (2771 ms)
[ RUN  ] HostAllocatorTest/2.VectorsWithDefaultHostAllocatorAlwaysWorks
[   OK ] HostAllocatorTest/2.VectorsWithDefaultHostAllocatorAlwaysWorks
(2846 ms)
[ RUN  ] HostAllocatorTest/2.TransfersWithoutPinningWork
[   OK ] HostAllocatorTest/2.TransfersWithoutPinningWork (3283 ms)
[ RUN  ] HostAllocatorTest/2.FillInputAlsoWorksAfterCallingReserve
[   OK ] HostAllocatorTest/2.FillInputAlsoWorksAfterCallingReserve
(2131 ms)
[ RUN  ] HostAllocatorTest/2.TransfersWithPinningWorkWithCuda
[   OK ] HostAllocatorTest/2.TransfersWithPinningWorkWithCuda (3833 ms)
[ RUN  ] HostAllocatorTest/2.ManualPinningOperationsWorkWithCuda
[   OK ] HostAllocatorTest/2.ManualPinningOperationsWorkWithCuda (2232
ms)
[ RUN  ] HostAllocatorTest/2.StatefulAllocatorUsesMemory
[   OK ] HostAllocatorTest/2.StatefulAllocatorUsesMemory (2164 ms)
[--] 7 tests from HostAllocatorTest/2 (19261 ms total)

[--] 3 tests from AllocatorTest/0, where TypeParam =
gmx::Allocator
[ RUN  ] AllocatorTest/0.AllocatorAlignAllocatesWithAlignment
[   OK ] AllocatorTest/0.AllocatorAlignAllocatesWithAlignment (0 ms)
[ RUN  ] AllocatorTest/0.VectorAllocatesAndResizesWithAlignment
[   OK ] AllocatorTest/0.VectorAllocatesAndResizesWithAlignment (0 ms)
[ RUN  ] AllocatorTest/0.VectorAllocatesAndReservesWithAlignment
[   OK ] AllocatorTest/0.VectorAllocatesAndReservesWithAlignment (0 ms)
[--] 3 tests from AllocatorTest/0 (0 ms total)

[--] 3 tests from AllocatorTest/1, where TypeParam =
gmx::Allocator
[ RUN  ] AllocatorTest/1.AllocatorAlignAllocatesWithAlignment
[   OK ] AllocatorTest/1.AllocatorAlignAllocatesWithAlignment (0 ms)
[ RUN  ] AllocatorTest/1.VectorAllocatesAndResizesWithAlignment
[   OK ] AllocatorTest/1.VectorAllocatesAndResizesWithAlignment (0 ms)
[ RUN  ] AllocatorTest/1.VectorAllocatesAndReservesWithAlignment
[   OK ] AllocatorTest/1.VectorAllocatesAndReservesWithAlignment (0 ms)
[--] 3 tests from AllocatorTest/1 (0 ms total)

[--] 3 tests from AllocatorTest/2, where TypeParam =
gmx::Allocator
[ RUN  ] AllocatorTest/2.AllocatorAlignAllocatesWithAlignment
[   OK ] AllocatorTest/2.AllocatorAlignAllocatesWithAlignment (0 ms)
[ RUN  ] 

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Szilárd Páll
It might help to know which of the unit tests in that group stall. Can
you run the binary manually (bin/gpu_utils-test) and report back the standard
output?
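
Something like this should do it (a sketch, assuming the usual build tree;
--gtest_filter is a standard Google Test option, so these test binaries
accept it):

cd build
./bin/gpu_utils-test 2>&1 | tee gpu_utils.log
# if it stalls, narrow it down suite by suite, e.g.:
./bin/gpu_utils-test --gtest_filter='HostAllocatorTest*'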


--
Szilárd

On Thu, Feb 8, 2018 at 3:56 PM, Alex  wrote:

> Nope, still persists after reboot and no other jobs running:
>  9/39 Test  #9: GpuUtilsUnitTests ***Timeout  30.59 sec
>
> Any additional suggestions?
>

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex
Here's some additional info:


#

# cat /proc/driver/nvidia/version

NVRM version: NVIDIA UNIX x86_64 Kernel Module  390.12  Wed Dec 20 07:19:16
PST 2017

GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.6)



#

# /usr/local/cuda/bin/nvcc -V

nvcc: NVIDIA (R) Cuda compiler driver

Copyright (c) 2005-2017 NVIDIA Corporation

Built on Fri_Nov__3_21:07:56_CDT_2017

Cuda compilation tools, release 9.1, V9.1.85



#

# /usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery

/usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery Starting...

.

.(lots of output specific to each of our 3 devices)

.

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime
Version = 9.1, NumDevs = 3

Result = PASS



#

# /usr/local/cuda/samples/bin/x86_64/linux/release/bandwidthTest

[CUDA Bandwidth Test] - Starting...

Running on...

Device 0: TITAN Xp
Quick Mode

Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     11350.1

Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     12860.4

Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     417429.3

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results
may vary when GPU Boost is enabled.


Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex
Forwarding colleague's email below, any suggestions highly appreciated.

Thanks!

Alex

***

I ran the minimal tests suggested in the CUDA installation guide
(bandwidthTest, deviceQuery) and then individually ran 10 of the samples
provided. However, many of the samples require a graphics interface and
simply don't execute from the command line. If the Gromacs people have a
suggestion for how to do a complete test, I would like to hear it. I
followed the installation guide found here
http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html including
its test suggestions.
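
(A guess from our side, not from the guide: the compute-only samples should
run fine over ssh, so a reasonable headless subset is

cd /usr/local/cuda/samples/bin/x86_64/linux/release
for t in deviceQuery bandwidthTest vectorAdd matrixMul simpleCUFFT; do ./$t; done

simpleCUFFT in particular exercises the same cuFFT library that the failing
plan call comes from.)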

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex
I did hear yesterday that CUDA's own tests passed, but I will update on
that in more detail as soon as people start showing up -- it's 8 am
right now... :)


Thanks Mark,

Alex


On 2/8/2018 7:59 AM, Mark Abraham wrote:

Hi,

OK, but it's not clear to me if you followed the other advice - cleaning out
all the NVIDIA stuff (CUDA, runtime, drivers) - nor if CUDA's own tests work.

Mark

On Thu, Feb 8, 2018 at 3:57 PM Alex  wrote:


Nope, still persists after reboot and no other jobs running:
   9/39 Test  #9: GpuUtilsUnitTests ***Timeout  30.59 sec

Any additional suggestions?


Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Mark Abraham
Hi,

OK, but it's not clear to me if you followed the other advice - cleaning out
all the NVIDIA stuff (CUDA, runtime, drivers) - nor if CUDA's own tests work.
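
If not, a scorched-earth rebuild once the driver stack is clean would look
roughly like this (a sketch; paths and options assume CUDA in /usr/local/cuda
and a build directory inside the source tree):

rm -rf build && mkdir build && cd build
cmake .. -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda \
         -DREGRESSIONTEST_DOWNLOAD=ON
make -j $(nproc) && make check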

Mark

On Thu, Feb 8, 2018 at 3:57 PM Alex  wrote:

> Nope, still persists after reboot and no other jobs running:
>   9/39 Test  #9: GpuUtilsUnitTests ***Timeout  30.59 sec
>
> Any additional suggestions?


Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex

Nope, still persists after reboot and no other jobs running:
 9/39 Test  #9: GpuUtilsUnitTests ***Timeout  30.59 sec

Any additional suggestions?

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex
I am rebooting the box and kicking out all the jobs until we figure this 
out.


Thanks!

Alex


On 2/8/2018 7:27 AM, Szilárd Páll wrote:

BTW, timeouts can be caused by contention from a stupid number of ranks/tMPI
threads hammering a single GPU (especially with 2 threads/core with HT),
but I'm not sure if the tests are ever executed with such a huge rank count.

[...]

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex

Mark, Peter -- thanks. Your comments make sense.


Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Szilárd Páll
BTW, timeouts can be caused by contention from a stupid number of ranks/tMPI
threads hammering a single GPU (especially with 2 threads/core with HT),
but I'm not sure if the tests are ever executed with such a huge rank count.
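
A quick way to rule that out (a sketch; taskset just confines the process to
a few cores):

taskset -c 0-3 ./bin/gpu_utils-test
# or via ctest, running only that test with full output:
ctest -R GpuUtilsUnitTests --output-on-failure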

--
Szilárd

On Thu, Feb 8, 2018 at 2:40 PM, Mark Abraham  wrote:

> Can't say. You observed timeouts, which could be consistent with drivers or
> runtimes getting stuck. [...]

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Mark Abraham
Hi,

On Thu, Feb 8, 2018 at 2:15 PM Alex  wrote:

> Mark and Peter,
>
> Thanks for commenting. I was told that all CUDA tests passed, but I will
> double check on how many of those were actually run. Also, we never
> rebooted the box after CUDA install, and finally we had a bunch of
> gromacs (2016.4) jobs running, because we didn't want to interrupt
> postdoc's work... All of those were with -nb cpu though. Could those
> factors have affected our regression tests?
>

Can't say. You observed timeouts, which could be consistent with drivers or
runtimes getting stuck. However, the other mdrun processes may have by
default set thread affinity, and any process that does that will interfere
with how effectively any others run, such as the tests. Sharing a node is
difficult to do well, and doing anything else with a node running GROMACS
is asking for trouble unless you have manually managed keeping the tasks
apart. Just don't.
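
If you really must share, pin each job to disjoint cores yourself; a sketch
using mdrun's affinity flags (the core counts are illustrative):

# job 1 on cores 0-7, job 2 on cores 8-15, launched in separate shells
gmx mdrun -ntmpi 1 -ntomp 8 -pin on -pinoffset 0 -deffnm job1
gmx mdrun -ntmpi 1 -ntomp 8 -pin on -pinoffset 8 -deffnm job2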

Mark


> It will really suck, if these are hardware-related...
> [...]


Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Peter Kroon
Yup, start with rebooting before trying anything else. There are probably
still old drivers loaded in the kernel.
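
Easy to check after the reboot; the loaded kernel module should match the
installed userspace driver:

cat /proc/driver/nvidia/version   # version of the module actually loaded
lsmod | grep ^nvidia              # which nvidia modules are resident
nvidia-smi                        # what the userspace stack reports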


Peter


On 08-02-18 14:14, Alex wrote:
> Mark and Peter,
>
> Thanks for commenting. I was told that all CUDA tests passed, but I
> will double check on how many of those were actually run. Also, we
> never rebooted the box after CUDA install, and finally we had a bunch
> of gromacs (2016.4) jobs running, because we didn't want to interrupt
> postdoc's work... All of those were with -nb cpu though. Could those
> factors have affected our regression tests?
>
> It will really suck, if these are hardware-related...
>
> Thanks,
>
> Alex
>
>
> [...]

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Alex

Mark and Peter,

Thanks for commenting. I was told that all CUDA tests passed, but I will 
double check on how many of those were actually run. Also, we never 
rebooted the box after CUDA install, and finally we had a bunch of 
gromacs (2016.4) jobs running, because we didn't want to interrupt 
postdoc's work... All of those were with -nb cpu though. Could those 
factors have affected our regression tests?


It will really suck, if these are hardware-related...

Thanks,

Alex


On 2/8/2018 3:03 AM, Mark Abraham wrote:

Hi,

Or leftovers of the drivers that are now mismatching. That has caused
timeouts for us.

Mark

[...]


Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Mark Abraham
Hi,

Or leftovers of the drivers that are now mismatching. That has caused
timeouts for us.

Mark

On Thu, Feb 8, 2018 at 10:55 AM Peter Kroon  wrote:

> Hi,
>
>
> with changing failures like this I would start to suspect the hardware
> as well. Mark's suggestion of looking at simpler test programs than GMX
> is a good one :)
>
>
> Peter
>
>
> [...]


Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-08 Thread Peter Kroon
Hi,


with changing failures like this I would start to suspect the hardware
as well. Mark's suggestion of looking at simpler test programs than GMX
is a good one :)
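
(cuda-memcheck is another cheap probe that sometimes exposes flaky
hardware/driver combinations, e.g. cuda-memcheck ./bin/gpu_utils-test.)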


Peter


On 08-02-18 09:10, Mark Abraham wrote:
> Hi,
>
> That suggests that your new CUDA installation is differently incomplete. Do
> its samples or test programs run?
>
> Mark
>
> On Thu, Feb 8, 2018 at 1:20 AM Alex  wrote:
>
>> Update: we seem to have had a hiccup with an orphan CUDA install and that
>> was causing issues. After wiping everything off and rebuilding, the errors
>> from the initial post disappeared. However, two tests failed during
>> regression. [...]

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-07 Thread Alex
Update: we seem to have had a hiccup with an orphan CUDA install and that
was causing issues. After wiping everything off and rebuilding, the errors
from the initial post disappeared. However, two tests failed during
regression:

95% tests passed, 2 tests failed out of 39

Label Time Summary:
GTest           = 170.83 sec (33 tests)
IntegrationTest = 125.00 sec (3 tests)
MpiTest         =   4.90 sec (3 tests)
UnitTest        =  45.83 sec (30 tests)

Total Test time (real) = 1225.65 sec

The following tests FAILED:
  9 - GpuUtilsUnitTests (Timeout)
32 - MdrunTests (Timeout)
Errors while running CTest
CMakeFiles/run-ctest-nophys.dir/build.make:57: recipe for target
'CMakeFiles/run-ctest-nophys' failed
make[3]: *** [CMakeFiles/run-ctest-nophys] Error 8
CMakeFiles/Makefile2:1160: recipe for target
'CMakeFiles/run-ctest-nophys.dir/all' failed
make[2]: *** [CMakeFiles/run-ctest-nophys.dir/all] Error 2
CMakeFiles/Makefile2:971: recipe for target 'CMakeFiles/check.dir/rule'
failed
make[1]: *** [CMakeFiles/check.dir/rule] Error 2
Makefile:546: recipe for target 'check' failed
make: *** [check] Error 2
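
(If useful, I can rerun just the failing pair with full output via
ctest -R 'GpuUtilsUnitTests|MdrunTests' --output-on-failure.)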

Any ideas? I can post the complete log, if needed.

Thank you,

Alex


Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-07 Thread Alex

Hi Mark,

Nothing has been installed yet, so the commands were issued from
/build/bin; I am not sure about the output of that mdrun-test (let me
know what exact command could make it more informative).


Thank you,

Alex

***

> ./gmx -version

GROMACS version:    2018
Precision:  single
Memory model:   64 bit
MPI library:    thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support:    CUDA
SIMD instructions:  AVX2_256
FFT library:    fftw-3.3.5-fma-sse2-avx-avx2-avx2_128-avx512
RDTSCP usage:   enabled
TNG support:    enabled
Hwloc support:  hwloc-1.11.0
Tracing support:    disabled
Built on:   2018-02-06 19:30:36
Built by:   smolyan@647trc-md1 [CMAKE]
Build OS/arch:  Linux 4.4.0-112-generic x86_64
Build CPU vendor:   Intel
Build CPU brand:    Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
Build CPU family:   6   Model: 79   Stepping: 1
Build CPU features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma hle 
htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse 
rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic

C compiler: /usr/bin/cc GNU 5.4.0
C compiler flags:    -march=core-avx2 -O3 -DNDEBUG 
-funroll-all-loops -fexcess-precision=fast

C++ compiler:   /usr/bin/c++ GNU 5.4.0
C++ compiler flags:  -march=core-avx2    -std=c++11   -O3 -DNDEBUG 
-funroll-all-loops -fexcess-precision=fast
CUDA compiler:  /usr/local/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda 
compiler driver;Copyright (c) 2005-2017 NVIDIA Corporation;Built on 
Fri_Nov__3_21:07:56_CDT_2017;Cuda compilation tools, release 9.1, V9.1.85
CUDA compiler 
flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_70,code=compute_70;-use_fast_math;-D_FORCE_INLINES;; 
;-march=core-avx2;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;

CUDA driver:    9.10
CUDA runtime:   9.10

> ldd -r ./mdrun-test
    linux-vdso.so.1 =>  (0x7ffcfcc3e000)
    libgromacs.so.3 => 
/home/smolyan/scratch/gmx2018_install_temp/gromacs-2018/build/bin/./../lib/libgromacs.so.3 
(0x7faa58f8f000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 
(0x7faa58d72000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 
(0x7faa589f)

    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x7faa586e7000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 
(0x7faa584d1000)

    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7faa58107000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x7faa57f03000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x7faa57cfb000)
    libcufft.so.9.1 => /usr/local/cuda/lib64/libcufft.so.9.1 
(0x7faa5080e000)
    libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 
(0x7faa505d4000)
    libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 
(0x7faa503b2000)

    /lib64/ld-linux-x86-64.so.2 (0x7faa5c1ad000)
    libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 
(0x7faa501a7000)
    libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 
(0x7faa4ff9d000)



On 2/7/2018 5:13 AM, Mark Abraham wrote:

I checked back with the CUDA-facing GROMACS developers. They've run the
code with 9.1 and believe there's no intrinsic problem within GROMACS.

[...]

Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-07 Thread Mark Abraham
Hi,

I checked back with the CUDA-facing GROMACS developers. They've run the
code with 9.1 and believe there's no intrinsic problem within GROMACS.

> So I don't have much to suggest other than rebuilding everything cleanly,
as this is an internal, nondescript cuFFT/driver error that is not supposed
to happen, especially in mdrun-test with its single input system, and it will
prevent him from using -pme gpu.
> The only thing PME could do better is to show more meaningful error
messages (which would have to be hardcoded anyway, as cuFFT doesn't even
have human-readable strings for error codes).

If you could share the output of
* gmx -version
* ldd -r mdrun-test
then perhaps we can find an issue (or at least report to nvidia usefully).
Ensuring you are using the CUDA driver that came with the CUDA runtime is
most likely to work smoothly.
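
For what it's worth, in cufft.h status 5 is CUFFT_INTERNAL_ERROR. A standalone
probe (just a sketch, not GROMACS code) that builds the same kind of R2C plan
as pme-3dfft.cu can tell whether the failure is GROMACS-specific or not:

cat > cufft_probe.cu <<'EOF'
#include <cufft.h>
#include <cstdio>

int main()
{
    // 3-D real-to-complex plan, the same flavour GROMACS' PME FFT creates;
    // the 64^3 grid is arbitrary and only for this probe.
    int n[3] = {64, 64, 64};
    cufftHandle plan;
    cufftResult s = cufftPlanMany(&plan, 3, n,
                                  NULL, 1, 0,  // packed input layout
                                  NULL, 1, 0,  // packed output layout
                                  CUFFT_R2C, 1);
    std::printf("cufftPlanMany -> %d (0 = CUFFT_SUCCESS, 5 = CUFFT_INTERNAL_ERROR)\n",
                (int)s);
    if (s == CUFFT_SUCCESS)
    {
        cufftDestroy(plan);
    }
    return (s == CUFFT_SUCCESS) ? 0 : 1;
}
EOF
nvcc cufft_probe.cu -lcufft -o cufft_probe && ./cufft_probe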

Mark

On Tue, Feb 6, 2018 at 9:24 PM Alex  wrote:

> And this is with:
> > gcc --version
> > gcc (Ubuntu 5.4.0-6ubuntu1~16.04.6) 5.4.0 20160609
> [...]


Re: [gmx-users] GMX 2018 regression tests: cufftPlanMany R2C plan failure (error code 5)

2018-02-06 Thread Alex
And this is with:
> gcc --version
> gcc (Ubuntu 5.4.0-6ubuntu1~16.04.6) 5.4.0 20160609



On Tue, Feb 6, 2018 at 1:18 PM, Alex  wrote:

> Hi all,
>
> I've just built the latest version and regression tests are running. Here
> is one error:
>
> "Program: mdrun-test, version 2018
> Source file: src/gromacs/ewald/pme-3dfft.cu (line 56)
>
> Fatal error:
> cufftPlanMany R2C plan failure (error code 5)"
>
> This is with CUDA 9.1.
>
> Anything to worry about?
>
> Thank you,
>
> Alex
>
-- 
Gromacs Users mailing list

* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
mail to gmx-users-requ...@gromacs.org.