kejingying <kejingy...@126.com> writes:
> First, the cl_khr_fp16 extension is supported and has been enabled on the
> Intel GPU device.
>
> When I run the following code, which uses 16-bit half-precision floats
> instead of 32-bit floats, on the device 'Intel(R) HD Graphics', I get the
> error "loading directly from pointer to type 'const __global half' is not
> allowed". But the device 'Intel(R) HD Graphics' supports half data types
> for OpenCL.
> '''
> Build on <pyopencl.Device 'Intel(R) HD Graphics' on 'Intel(R) OpenCL' at 0x1cd1a10>:
> 1:6:16: error: loading directly from pointer to type 'const __global half' is not allowed
> res_g[gid] = a_g[gid] + b_g[gid];
> ^
> (options: -I /usr/lib/python3/dist-packages/pyopencl/cl)
> '''
> How can I fix the code?
> '''
> from __future__ import absolute_import, print_function
> import numpy as np
> import pyopencl as cl
>
> a_np = np.random.rand(50000).astype(np.float16)
> b_np = np.random.rand(50000).astype(np.float16)
>
> ctx = cl.create_some_context()
> queue = cl.CommandQueue(ctx)  # Create a command queue with your context
>
> mf = cl.mem_flags
> a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
> b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)
>
> prg = cl.Program(ctx, """
> __kernel void sum(
>     __global const half *a_g, __global const half *b_g, __global half *res_g)
> {
>     int gid = get_global_id(0);
>     res_g[gid] = a_g[gid] + b_g[gid];
> }
> """).build()
>
> res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)
> prg.sum(queue, a_np.shape, None, a_g, b_g, res_g)
>
> res_np = np.empty_like(a_np)
> cl.enqueue_copy(queue, res_np, res_g)
>
> print(res_np)
> print(res_np - (a_np + b_np))
> print(np.linalg.norm(res_np - (a_np + b_np)))
> '''
>
> Finally, how can I correct my code?
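
This compiler message usually means that the kernel source itself has not
enabled the extension. A device that supports cl_khr_fp16 still only allows
plain loads and stores through half pointers once the kernel contains the
matching "#pragma OPENCL EXTENSION" line. Below is a minimal sketch of that
fix, assuming this is the cause here. The names KERNEL_SRC and half_sum are
mine, not part of PyOpenCL, and the early extension check is an addition that
the quoted code does not have.

```python
# Kernel source with the half-precision extension enabled explicitly.
# Assumption: the target device lists cl_khr_fp16 among its extensions.
KERNEL_SRC = """
#pragma OPENCL EXTENSION cl_khr_fp16 : enable

__kernel void sum(
    __global const half *a_g, __global const half *b_g, __global half *res_g)
{
    int gid = get_global_id(0);
    res_g[gid] = a_g[gid] + b_g[gid];
}
"""


def half_sum(a_np, b_np):
    """Add two float16 arrays on an OpenCL device and return the result.

    Hypothetical helper for illustration; both inputs are expected to be
    contiguous np.float16 arrays of the same shape.
    """
    # Imported here so the kernel source can be inspected without
    # NumPy or PyOpenCL installed.
    import numpy as np
    import pyopencl as cl

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)

    # Fail early with a readable message instead of a compiler error.
    dev = ctx.devices[0]
    if "cl_khr_fp16" not in dev.extensions:
        raise RuntimeError("device %s does not report cl_khr_fp16" % dev.name)

    mf = cl.mem_flags
    a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
    b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)
    res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)

    prg = cl.Program(ctx, KERNEL_SRC).build()
    prg.sum(queue, a_np.shape, None, a_g, b_g, res_g)

    # Copy the device result back into a host array of the same dtype.
    res_np = np.empty_like(a_np)
    cl.enqueue_copy(queue, res_np, res_g)
    return res_np
```

With the same inputs as in the quoted code, this would be called as
half_sum(a_np, b_np). If the pragma cannot be used on some device, the other
route I know of is to keep the half pointers but read and write through the
built-ins vload_half and vstore_half, doing the arithmetic in float.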
OP resolved at: https://github.com/inducer/pyopencl/issues/254#issuecomment-436604464

_______________________________________________
PyOpenCL mailing list
PyOpenCL@tiker.net
https://lists.tiker.net/listinfo/pyopencl