Received from Fernando Benites on Sat, Dec 10, 2011 at 07:14:27PM EST:
> Hello!
>
> I am trying to do very basic operations on two matrices (like minimum and sum).
> The matrices have different sizes, but I must take each row of one and
> compare it to all the rows of the other matrix. So I divided the problem
> into blocks of size 256 for X and Y, and the grid_size is calculated based
> on how many rows are left.
> I get grid_sizeX=6 and grid_sizeY=32. Unfortunately, pycuda complains:
> Traceback (most recent call last):
>   File "mlARAMv4_cuda.py", line 321, in <module>
>     nn.test(a['testData']. todense(),a['testLabels'])
>   File "mlARAMv4_cuda.py", line 207, in test
>     block = (THREAD_SIZEx ,THREAD_SIZEy, THREAD_SIZEz) ,
>     grid=(MATRIX_SIZEx,MATRIX_SIZEy)
>   File "/usr/local/lib/python2.7/dist-packages/pycuda-2011.1.3-py2.7-linux-i686.egg/pycuda/driver.py", line 165, in function_call
>     func._set_block_shape(*block)
> Boost.Python.ArgumentError: Python argument types in
>     Function._set_block_shape(Function, numpy.int16, numpy.int16, numpy.int16)
> did not match C++ signature:
>     _set_block_shape(pycuda::function {lvalue}, int, int, int)
>
> THREAD_SIZEz=1, THREAD_SIZEx=THREAD_SIZEy=256, which are the
> misleading names for the block size of each dimension.
> My card is a GeForce GTS 250 with 512 MB.
This thread block configuration (256*256*1 = 65536 threads) exceeds the
maximum number of threads per block allowed by your GPU (512).
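
A minimal sketch of one way to pick a launch configuration that stays under
that limit. The helper name launch_config, the 16x16 default block, and the
matrix sizes in the usage line are assumptions for illustration, not taken
from your code. Note that the traceback also reports the block dimensions
arriving as numpy.int16 where plain int is expected, so the sketch casts
every value with int() as well.

```python
MAX_THREADS_PER_BLOCK = 512  # per-block limit quoted above for this GPU


def launch_config(n_rows_a, n_rows_b, block_x=16, block_y=16,
                  max_threads=MAX_THREADS_PER_BLOCK):
    """Return (block, grid) as tuples of plain Python ints.

    Hypothetical helper: x indexes rows of the first matrix,
    y indexes rows of the second one.
    """
    # int() turns numpy integer scalars into plain Python ints.
    block = (int(block_x), int(block_y), 1)
    threads = block[0] * block[1] * block[2]
    if threads > max_threads:
        raise ValueError("block %r has %d threads, limit is %d"
                         % (block, threads, max_threads))
    # Ceiling division so that every row pair is covered by some thread.
    grid = (int((n_rows_a + block[0] - 1) // block[0]),
            int((n_rows_b + block[1] - 1) // block[1]))
    return block, grid


# Example with made-up sizes: 1500 rows against 8000 rows.
block, grid = launch_config(1500, 8000)
```

The resulting tuples can be passed as block=block, grid=grid in the kernel
call. Because the grid usually overshoots the matrix sizes, the kernel itself
would need to check its row indices against the real row counts.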
L.G.