Hi,

I'm writing a relatively simple function to sum up a set of results from a
simulation I'm running, and want to do it on the GPU, since it involves
summing a large dataset. The function is as follows:


__kernel void k(__global float4 *spins,
                __global uint *result)
{
    ushort lid = 2 * get_local_id(0);
    ushort gid = 2 * get_group_id(1);

    float total = 0;
    local uint l_total;

    if(lid == 0)
    {l_total = 0;}

    barrier(CLK_LOCAL_MEM_FENCE);

    total += spins[lid * Y + gid].w;
    total += spins[lid * Y + gid + 1].w;
    total += spins[(lid + 1) * Y + gid].w;
    total += spins[(lid + 1) * Y + gid + 1].w;

    uint t = (uint) total;

    uint atomic_add(uint l_total, uint t);

    barrier(CLK_LOCAL_MEM_FENCE);

    if(lid == 0)
    {uint atom_add(uint result[0], uint l_total);}
}
"""

Note that this isn't my original code; I've modified it a lot while trying
to get it to work, to no avail. (Y is a uint that I define elsewhere.)
I've also tried adding __OVERLOADABLE__, __local and __global to the two
atomic add lines, but that didn't help either.

The error message I get when compiling is:


pyopencl.RuntimeError: clBuildProgram failed: build program failure -

Build on <pyopencl.Device 'GeForce GTX 560 Ti' on 'NVIDIA CUDA' at 0x29f4d80>:

:36:10: error: overloaded function 'atom_add' must have the 'overloadable' attribute
    uint atomic_add(uint l_total, uint t);
         ^
<built-in>:3942:20: note: instantiated from:
#define atomic_add atom_add
                   ^
<built-in>:3845:24: note: previous overload of function is here
ulong __OVERLOADABLE__ atom_add(__local volatile ulong *ptr, ulong val);
                       ^
:41:11: error: overloaded function 'atom_add' must have the 'overloadable' attribute
    {uint atom_add(uint result[0], uint l_total);}
          ^
:36:10: note: previous overload of function is here
    uint atomic_add(uint l_total, uint t);
         ^
<built-in>:3942:20: note: instantiated from:
#define atomic_add atom_add
                   ^
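
One reading of the messages above is that the compiler is treating the two
atomic lines as new function declarations, because each starts with a return
type and has typed parameters, rather than as calls to the built-in. The
sketch below shows the call form instead, assuming the 32-bit atomic_add
built-in from OpenCL 1.1 is available for both __local and __global pointers
on this device. Y_VALUE and KERNEL_SRC are hypothetical names used only for
illustration, and 64 is a made-up value for Y. Everything other than the two
atomic lines is left as in the kernel above, so this only targets the build
error and says nothing about whether the indexing or the sum is right.

```python
# Sketch only: the two atomic lines written as calls, not declarations.
# Y_VALUE is a hypothetical stand-in for the Y that is defined elsewhere.
Y_VALUE = 64

KERNEL_SRC = """
#define Y %(Y)du

__kernel void k(__global float4 *spins,
                __global uint *result)
{
    ushort lid = 2 * get_local_id(0);
    ushort gid = 2 * get_group_id(1);

    float total = 0;
    __local uint l_total;

    if(lid == 0)
    {l_total = 0;}

    barrier(CLK_LOCAL_MEM_FENCE);

    // Indexing kept exactly as in the kernel above.
    total += spins[lid * Y + gid].w;
    total += spins[lid * Y + gid + 1].w;
    total += spins[(lid + 1) * Y + gid].w;
    total += spins[(lid + 1) * Y + gid + 1].w;

    uint t = (uint) total;

    // A call passes the address of the target and has no type in front.
    atomic_add(&l_total, t);

    barrier(CLK_LOCAL_MEM_FENCE);

    if(lid == 0)
    {atomic_add(&result[0], l_total);}
}
""" % {"Y": Y_VALUE}

# Building needs an OpenCL device, so it is left commented out here:
# import pyopencl as cl
# ctx = cl.create_some_context()
# prg = cl.Program(ctx, KERNEL_SRC).build()
```

The point of the sketch is only the shape of the two calls: a pointer as the
first argument (&l_total, &result[0]) and a plain value as the second, with
no uint keywords inside the argument list.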

Any help would be greatly appreciated.

Thanks,
Justin
_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
