Re: [PyCUDA] Block and gridsize

Lev Givon Sat, 10 Dec 2011 17:50:04 -0800

Received from Fernando Benites on Sat, Dec 10, 2011 at 07:14:27PM EST:
> Hello!
> 
> I am trying to do very basic operations on two matrices (like minimum and sum).
> The matrices have different sizes, but I must take one row and compare it to
> all the rows of the other matrix. So I divided the problem into
> blocks of size 256 for X and Y, and the grid_size is calculated based
> on how many rows are left.
> I get grid_sizeX=6 and grid_sizeY=32. Unfortunately, pycuda complains:
> Traceback (most recent call last):
>   File "mlARAMv4_cuda.py", line 321, in <module>
>     nn.test(a['testData'].todense(),a['testLabels'])
>   File "mlARAMv4_cuda.py", line 207, in test
>     block = (THREAD_SIZEx ,THREAD_SIZEy, THREAD_SIZEz) , grid=(MATRIX_SIZEx,MATRIX_SIZEy)
>   File "/usr/local/lib/python2.7/dist-packages/pycuda-2011.1.3-py2.7-linux-i686.egg/pycuda/driver.py", line 165, in function_call
>     func._set_block_shape(*block)
> Boost.Python.ArgumentError: Python argument types in
>     Function._set_block_shape(Function, numpy.int16, numpy.int16, numpy.int16)
> did not match C++ signature:
>     _set_block_shape(pycuda::function {lvalue}, int, int, int)
> 
> THREAD_SIZEz=1, THREAD_SIZEx=THREAD_SIZEy=256, which are the
> misleading names for the block size of each dimension.
> My card is a GeForce GTS 250 with 512 MB.


This thread block configuration (256*256*1 = 65536 threads) exceeds the
maximum number of threads per block allowed by your GPU (512).
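
A minimal sketch of how the launch dimensions could be chosen so that the
block stays within the 512-thread limit. The helper name launch_dims, its
parameters, and the default block width of 16 are assumptions made for
illustration; they are not part of PyCUDA or of the original script. The
values are also converted to built-in ints, since the traceback above shows
numpy.int16 arguments being rejected by _set_block_shape.

```python
def launch_dims(n_rows, n_cols, max_threads_per_block=512, block_x=16):
    """Return (block, grid) tuples of plain Python ints.

    Hypothetical helper, not a PyCUDA function. The limit of 512 is the
    value quoted in this thread; on a real device it could be queried with
    pycuda.driver.Device(0).get_attribute(
        pycuda.driver.device_attribute.MAX_THREADS_PER_BLOCK).
    """
    block_x = int(block_x)
    # Use the remaining thread budget for the y dimension.
    block_y = int(max_threads_per_block) // block_x
    if block_x < 1 or block_y < 1:
        raise ValueError("block_x must be between 1 and max_threads_per_block")
    block = (block_x, block_y, 1)

    # Ceiling division: enough blocks to cover every column and row.
    grid_x = (int(n_cols) + block_x - 1) // block_x
    grid_y = (int(n_rows) + block_y - 1) // block_y
    grid = (grid_x, grid_y)
    return block, grid


block, grid = launch_dims(n_rows=1000, n_cols=3000)
print(block, grid)
```

The resulting tuples could then be passed as the block= and grid= keyword
arguments of the kernel call. Because the grid is rounded up, the kernel
itself would need to skip threads whose indices fall outside the matrix.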

                                                        L.G.

