Received from Ahmed Fasih on Wed, Nov 07, 2012 at 11:07:31PM EST:

(snip)

> Thanks Lev! These gists were really useful in understanding how to use
> these functions, and they work for me too. Nonetheless, I tried and
> succeeded in breaking the second one: see
> https://gist.github.com/4036693
> 
> First, I had to add "assert" in the calls to np.allclose to make sure
> I'd be informed if things weren't all close. Then I extended the
> kernel to work with multiple blocks, and finally I moved the unpinned
> test first. As I increased N from 20 to 22, both tests passed. But at
> N=23 (23 by 23 array), although the unpinned version works, the pinned
> assertion fails and PyCUDA complains that cleanup operations failed.
> 
> I can't find any documented limit on the size of page-locked memory
> allocations, but it ought to be >3kb, right?

I'm not aware of any such limits.

> Ubuntu 11.10, NVIDIA driver 304.51, CUDA 5, PyCUDA 2012.1, Tesla
> C2050. If you or any other kind soul is able to successfully run this
> gist, let me know! https://gist.github.com/4036693
> 
> Thanks again,
> Ahmed

When N*N > 512, the array (np.double().nbytes*N*N bytes, i.e. 8*N*N)
is larger than the default alignment assumed by
pycuda.driver.aligned_empty() (4096), and this mismatch keeps some of
the array elements from being properly updated. That matches what you
saw: N=22 gives 484 elements (3872 bytes), while N=23 gives 529
elements (4232 bytes). If you preallocate a device-mapped array
instead, you don't need to worry about setting the alignment.
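
To make the size arithmetic concrete, here is a minimal sketch that
checks each N against the 4096-byte figure. It uses only the standard
library and does not touch PyCUDA; the names ITEMSIZE, ALIGNMENT and
nbytes() are made up for illustration, and struct.calcsize("d") stands
in for np.double().nbytes.

```python
import struct

# Size of one C double in bytes (8), standing in for np.double().nbytes.
ITEMSIZE = struct.calcsize("d")

# Default alignment cited above for pycuda.driver.aligned_empty().
ALIGNMENT = 4096


def nbytes(n):
    """Return the size in bytes of an n-by-n array of doubles."""
    return ITEMSIZE * n * n


# Print N, element count, byte size, and whether the size exceeds 4096.
for n in (20, 22, 23):
    size = nbytes(n)
    print(n, n * n, size, size > ALIGNMENT)
```

Of the sizes tried, only N=23 crosses the 4096-byte mark, which is
where the pinned test started failing.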

                                                      L.G.


_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
