First, the cl_khr_fp16 extension is supported and has been enabled on the
Intel GPU device.


When I run the following code on the device 'Intel(R) HD Graphics', using
16-bit half-precision floats instead of 32-bit floats, the kernel build fails
with the error:
loading directly from pointer to type 'const __global half' is not allowed.
However, the device 'Intel(R) HD Graphics' does support the half data type
for OpenCL.
'''
Build on <pyopencl.Device 'Intel(R) HD Graphics' on 'Intel(R) OpenCL' at 0x1cd1a10>:
1:6:16: error: loading directly from pointer to type 'const __global half' is not allowed
  res_g[gid] = a_g[gid] + b_g[gid];
               ^
(options: -I /usr/lib/python3/dist-packages/pyopencl/cl)
'''
How can I fix the code?
'''
from __future__ import absolute_import, print_function
import numpy as np
import pyopencl as cl

a_np = np.random.rand(50000).astype(np.float16)
b_np = np.random.rand(50000).astype(np.float16)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx) # Create a command queue with your context

mf = cl.mem_flags
a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)

prg = cl.Program(ctx, """
__kernel void sum(
    __global const half *a_g, __global const half *b_g, __global half *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = a_g[gid] + b_g[gid];
}
""").build()

res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)
prg.sum(queue, a_np.shape, None, a_g, b_g, res_g)

res_np = np.empty_like(a_np)
cl.enqueue_copy(queue, res_np, res_g)

print(res_np)
print(res_np - (a_np + b_np))
print(np.linalg.norm(res_np - (a_np + b_np)))
'''

Finally, how can I correct my code?
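
One possible direction, offered only as a sketch: the error message suggests
that the compiler does not treat half as an arithmetic type inside this
kernel. In OpenCL C, cl_khr_fp16 normally has to be enabled in the kernel
source itself with a #pragma, even when the device reports the extension. The
version below assumes that this is the missing piece. The names KERNEL_PRAGMA,
KERNEL_VLOAD and half_sum are hypothetical and only used for illustration.
KERNEL_VLOAD is an alternative that avoids the extension by going through the
built-in vload_half/vstore_half functions and computing in float.

```python
import numpy as np

# Assumption: the build fails because cl_khr_fp16 is not enabled inside the
# kernel source. The pragma must come before the first use of half arithmetic.
KERNEL_PRAGMA = """
#pragma OPENCL EXTENSION cl_khr_fp16 : enable
__kernel void sum(
    __global const half *a_g, __global const half *b_g, __global half *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = a_g[gid] + b_g[gid];
}
"""

# Alternative that does not rely on the extension: half values are only
# stored in memory, and the addition itself is done in float.
KERNEL_VLOAD = """
__kernel void sum(
    __global const half *a_g, __global const half *b_g, __global half *res_g)
{
  int gid = get_global_id(0);
  float s = vload_half(gid, a_g) + vload_half(gid, b_g);
  vstore_half(s, gid, res_g);
}
"""


def half_sum(a_np, b_np, source=KERNEL_PRAGMA):
    """Add two float16 arrays on an OpenCL device (hypothetical helper)."""
    import pyopencl as cl  # imported here so the kernel strings load without it

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)

    mf = cl.mem_flags
    a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
    b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)
    res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)

    prg = cl.Program(ctx, source).build()
    prg.sum(queue, a_np.shape, None, a_g, b_g, res_g)

    # Copy the device result back into a host array of the same dtype/shape.
    res_np = np.empty_like(a_np)
    cl.enqueue_copy(queue, res_np, res_g)
    return res_np
```

With the arrays from the script above, this would be called as
half_sum(a_np, b_np), or as half_sum(a_np, b_np, source=KERNEL_VLOAD) to try
the variant without the pragma. I have not verified either version on the
'Intel(R) HD Graphics' device.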

_______________________________________________
PyOpenCL mailing list
PyOpenCL@tiker.net
https://lists.tiker.net/listinfo/pyopencl
