As a general note: Once you sort out the resources issue, it is *very* 
important to retune your block and grid sizes after switching from compute 
capability 2.0 (Tesla C2075) to compute capability 3.x (Tesla K40c).  When I 
first switched my code to the new architecture, I saw almost no improvement, 
or even regressions, in performance.  It wasn't until I re-benchmarked 
different grid configurations that I discovered the problem.

In fact, I now sometimes include an auto-tuning stage in my CUDA programs to 
dynamically select from a range of reasonable block sizes based on runtime 
benchmarks of my important kernels.
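The tuning stage amounts to timing each important kernel at several candidate block sizes and keeping the fastest. A minimal sketch of that loop is below; `fake_kernel` is a stand-in I made up for illustration — in a real program you would launch your PyCUDA kernel with each candidate block size and time it with CUDA events (pycuda.driver.Event) rather than the wall clock:

```python
import time

def autotune_block_size(run_kernel, candidates=(64, 128, 256, 512, 1024),
                        repeats=5):
    """Return the candidate block size with the lowest average runtime.

    run_kernel(block_size) must execute the kernel once.  Wall-clock
    timing is used here for simplicity; with PyCUDA, CUDA events give
    more accurate device-side timings.
    """
    best_size, best_time = None, float("inf")
    for size in candidates:
        run_kernel(size)  # warm-up launch to exclude one-time costs
        start = time.perf_counter()
        for _ in range(repeats):
            run_kernel(size)
        elapsed = (time.perf_counter() - start) / repeats
        if elapsed < best_time:
            best_size, best_time = size, elapsed
    return best_size

# Hypothetical stand-in for a real kernel launch: pretend the kernel
# is fastest at 256 threads per block.
def fake_kernel(block_size):
    time.sleep(abs(block_size - 256) / 1e6)

print(autotune_block_size(fake_kernel))  # picks 256 for this stub
```

In practice I cache the chosen size per (kernel, device) pair so the benchmark only runs once per installation, not on every program start.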

On Apr 2, 2014, at 1:46 AM, Jerome Kieffer <[email protected]> wrote:

> On Wed, 2 Apr 2014 17:41:59 +1300
> Alistair McDougall <[email protected]> wrote:
> 
>> Hi,
>> I have previously been using PyCUDA on a Tesla C2075 as part of my
>> astrophysics research. We recently installed a Tesla K40c and I was hoping
>> to just run the same code on the new card, however I am receiving "pycuda
>> ._driver.LaunchError: cuLaunchKernel failed: launch out of resources"
>> errors.
>> 
>> A quick google search for "PyCUDA Tesla K40c" returned a minimal set of
>> results, which led me to wonder has anyone tried running PyCUDA on this
>> card?
> 
> Hi,
> I ran into similar bugs with our K20 and was scratching my head for a
> while, until people from Nvidia told me that their driver 319 had
> problems with the GK110-based Tesla cards.  Driver 331 has been running
> without glitches for a while now.
> 
> Hope this helps.
> 
> 
> -- 
> Jérôme Kieffer
> tel +33 476 882 445
> 
> _______________________________________________
> PyCUDA mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pycuda

