Hello,

I am just starting with OpenCL. I have a notebook with an Intel HD Broadwell
U-Processor GT2, using the Intel Gen OCL driver on Linux.

The following kernel receives nonsense as input and returns nonsense as
output. Please note the cl_khr_fp64 : enable pragma. The code runs fine on an
NVIDIA Tesla in another machine, and everything works fine with float32. Do
you have any clue why fp64 is broken? Is it a driver issue? How can I dig
further?

Many thanks!
Riccardo


import numpy as np
import pyopencl as cl

a_np = np.array([1,2,3]).astype('float64')
b_np = np.array([4,5,6]).astype('float64')

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

mf = cl.mem_flags
a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)

prg = cl.Program(ctx, r"""
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
__kernel void sum(__global const double *a_g,
                  __global const double *b_g,
                  __global       double *res_g) {
  int gid = get_global_id(0);
  res_g[gid] = a_g[gid] + b_g[gid];
}
""").build()

res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)
prg.sum(queue, a_np.shape, None, a_g, b_g, res_g)

res_np = np.empty_like(a_np)
cl.enqueue_copy(queue, res_np, res_g)
print(res_np)       # result computed on the device
print(a_np + b_np)  # reference result computed on the host
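
One first check I could run, as a rough sketch, is whether the driver
advertises cl_khr_fp64 at all on this device. The helper names has_fp64 and
report_fp64 below are made up for illustration; dev.extensions is the
space-separated extension string that PyOpenCL exposes for each device. A
missing entry would point at the driver, but an advertised extension does not
by itself prove that fp64 works correctly.

```python
def has_fp64(extensions):
    """Return True if a space-separated OpenCL extension string lists cl_khr_fp64."""
    return "cl_khr_fp64" in extensions.split()


def report_fp64():
    """Print, for every visible device, whether cl_khr_fp64 is advertised."""
    # Imported here so that has_fp64 stays usable without an OpenCL runtime.
    import pyopencl as cl

    for platform in cl.get_platforms():
        for dev in platform.get_devices():
            print("%s | %s | cl_khr_fp64 advertised: %s"
                  % (platform.name, dev.name, has_fp64(dev.extensions)))
```

Calling report_fp64() on both the Intel notebook and the Tesla machine would
show whether the two drivers differ in what they claim to support.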




