kejingying  <kejingy...@126.com> writes:
> First, the cl_khr_fp16 extension is supported and has been enabled on my
> Intel GPU device.
>
>
> When I run the following code, which uses 16-bit half-precision floats
> instead of 32-bit floats, on the device 'Intel(R) HD Graphics', the build
> fails with the error:
> loading directly from pointer to type 'const __global half' is not allowed.
> However, the device 'Intel(R) HD Graphics' supports the half data type in OpenCL.
> '''
> Build on <pyopencl.Device 'Intel(R) HD Graphics' on 'Intel(R) OpenCL' at 
> 0x1cd1a10>:
> 1:6:16: error: loading directly from pointer to type 'const __global half' is 
> not allowed
> res_g[gid] = a_g[gid] + b_g[gid];
> ^
> (options: -I /usr/lib/python3/dist-packages/pyopencl/cl)
> '''
> How can I fix the code?
> '''
> from __future__ import absolute_import, print_function
> import numpy as np
> import pyopencl as cl
>
> a_np = np.random.rand(50000).astype(np.float16)
> b_np = np.random.rand(50000).astype(np.float16)
>
> ctx = cl.create_some_context()
> queue = cl.CommandQueue(ctx) # Create a command queue with your context
>
> mf = cl.mem_flags
> a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
> b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)
>
> prg = cl.Program(ctx, """
> __kernel void sum(
> __global const half *a_g, __global const half *b_g, __global half *res_g)
> {
> int gid = get_global_id(0);
> res_g[gid] = a_g[gid] + b_g[gid];
> }
> """).build()
>
> res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)
> prg.sum(queue, a_np.shape, None, a_g, b_g, res_g)
>
> res_np = np.empty_like(a_np)
> cl.enqueue_copy(queue, res_np, res_g)
>
> print(res_np)
> print(res_np - (a_np + b_np))
> print(np.linalg.norm(res_np - (a_np + b_np)))
> '''
>
> Finally, how can I correct my code?

OP resolved at:
https://github.com/inducer/pyopencl/issues/254#issuecomment-436604464
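
For readers of the archive: the linked comment has the actual resolution.
What follows is only a minimal sketch of two common remedies for this
compiler error, not a restatement of that comment. It assumes that the build
fails because the kernel source never enables cl_khr_fp16 itself; a device
advertising the extension does not turn it on for a given program. Variant 1
adds the enabling pragma to the original kernel. Variant 2 keeps half as a
storage-only type and does the arithmetic in float via vload_half and
vstore_half, which needs no extension. The names KERNEL_FP16, KERNEL_VLOAD,
pick_kernel and run are hypothetical and exist only for this illustration.

```python
# Variant 1: enable cl_khr_fp16 in the kernel source so that direct loads,
# stores and arithmetic on `half` values compile.
KERNEL_FP16 = """
#pragma OPENCL EXTENSION cl_khr_fp16 : enable
__kernel void sum(
    __global const half *a_g, __global const half *b_g, __global half *res_g)
{
    int gid = get_global_id(0);
    res_g[gid] = a_g[gid] + b_g[gid];
}
"""

# Variant 2: no extension required. `half` is used for storage only; values
# are widened to float on load and narrowed back to half on store.
KERNEL_VLOAD = """
__kernel void sum(
    __global const half *a_g, __global const half *b_g, __global half *res_g)
{
    int gid = get_global_id(0);
    float a = vload_half(gid, a_g);
    float b = vload_half(gid, b_g);
    vstore_half(a + b, gid, res_g);
}
"""


def pick_kernel(extensions):
    """Pick a kernel source from a device's space-separated extension string.

    Hypothetical helper: prefers the native-half kernel when the device
    reports cl_khr_fp16 and falls back to the vload_half variant otherwise.
    """
    return KERNEL_FP16 if "cl_khr_fp16" in extensions.split() else KERNEL_VLOAD


def run(n=50000):
    """Add two float16 arrays on the first device of an OpenCL context."""
    # Imported here so the kernel sources above can be inspected on a
    # machine without NumPy or an OpenCL runtime.
    import numpy as np
    import pyopencl as cl

    a_np = np.random.rand(n).astype(np.float16)
    b_np = np.random.rand(n).astype(np.float16)

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    src = pick_kernel(ctx.devices[0].extensions)

    mf = cl.mem_flags
    a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
    b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)
    res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)

    prg = cl.Program(ctx, src).build()
    prg.sum(queue, a_np.shape, None, a_g, b_g, res_g)

    res_np = np.empty_like(a_np)
    cl.enqueue_copy(queue, res_np, res_g)
    return a_np, b_np, res_np
```

Calling run() on a machine with a working OpenCL driver returns the two
inputs and the device result, so res_np - (a_np + b_np) can be inspected as
in the original script. Small differences are possible with either variant,
since rounding to half precision on the device need not match NumPy exactly.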

_______________________________________________
PyOpenCL mailing list
PyOpenCL@tiker.net
https://lists.tiker.net/listinfo/pyopencl
