Hi Andreas,

On Sat, Jan 28, 2012 at 3:23 AM, Andreas Kloeckner
<[email protected]> wrote:
> Indeed, inserting __syncthreads() after the
> shared array declaration brings the error down to more reasonable values
> for me. Jesse, my recommendation would be to use that as a workaround
> while we figure out a more permanent fix.

Can't we do this:
>> 1. Using "extern __shared__ out_type sdata[]" and setting the size of
>> shared memory when preparing the kernel.
We could pass dtype instead of ctype to
get_reduction_kernel_and_types(), and derive both the ctype and the
element size (needed for the shared memory size) from it inside.
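
A rough sketch of the idea, just to illustrate it. This is not the
actual PyCUDA code: the helper names and the small dtype table below are
made up, and it assumes one shared array element per thread in the block.

```python
import numpy as np

# Hypothetical, deliberately tiny dtype -> C type table for illustration.
# PyCUDA has its own conversion helper, which is not used here.
_CTYPES = {
    np.dtype(np.float32): "float",
    np.dtype(np.float64): "double",
    np.dtype(np.int32): "int",
}


def ctype_and_shared_size(dtype, block_size):
    """Return the C type name and the shared memory size in bytes.

    Assumes the reduction keeps one element of `dtype` per thread.
    """
    dtype = np.dtype(dtype)
    ctype = _CTYPES[dtype]
    shared_bytes = dtype.itemsize * block_size
    return ctype, shared_bytes


def shared_decl(ctype):
    """Build the unsized shared array declaration for the kernel source."""
    return "extern __shared__ %s sdata[];" % ctype


ctype, shared_bytes = ctype_and_shared_size(np.float64, 512)
print(shared_decl(ctype))  # extern __shared__ double sdata[];
print(shared_bytes)        # 4096
```

The byte count would then be given as the dynamic shared memory size
when the kernel is prepared or launched, so the declaration inside the
kernel no longer needs a compile-time array length.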

Best regards,
Bogdan

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
