[snipped code snippet]
OpenCL 1.2 spec, 6.9b):

An image type (image2d_t, image3d_t, image2d_array_t, image1d_t,
image1d_buffer_t or image1d_array_t) can only be used as the type of a
function argument. An image function argument cannot be modified. Elements of an
image can only be accessed using built-in functions described in section 
6.12.14.
An image type cannot be used to declare a variable, a structure or union field,
an array of images, a pointer to an image, or the return type of a function. An
image type cannot be used with the __private, __local and __constant address
space qualifiers. The image3d_t type cannot be used with the __write_only access
qualifier unless the cl_khr_3d_image_writes extension is enabled. An image type
cannot be used with the __read_write access qualifier which is reserved for
future use.
That clears that up. Much obliged for the answer. This leads to a direct follow-up, though: is there a best practice for passing a variable number of images to a kernel?

2) I pass the params_buf as __constant to my kernel. I have some
functions doing arithmetic with DualQuaternions and I have to first copy
all data from my structure before working with them: e.g.

void conjugate(const DualQuaternion * a, DualQuaternion * rv);

DualQuaternion rv;
conjugate(&measurement->w2c, &rv);
Gives this error:
passing 'DualQuaternion __attribute__((address_space(2)))const *' discards 
qualifiers, expected 'DualQuaternion const *'

DualQuaternion temp = measurement->w2c;
conjugate(&temp, &rv);
is working okay.

I think I understand the reason for this: functions need to work in one
address space only. But is there a way to pass my structures to my kernel
so that the explicit copy is not needed?
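
One way to avoid the copy, sketched here under assumptions, is to declare the
input pointer of the helper in the same address space as the buffer. The
DualQuaternion layout below (real and dual parts, each stored as w, x, y, z),
the PARAMS_AS macro and the name conjugate_from_constant are hypothetical and
not taken from the original code, and the conjugate shown (negating the vector
part of both quaternions) is only one of several conventions.

```c
/* Expands to __constant when compiled as OpenCL C, and to nothing in plain C,
   so the same sketch can also be compiled and checked on the host. */
#ifdef __OPENCL_VERSION__
#define PARAMS_AS __constant
#else
#define PARAMS_AS
#endif

/* Hypothetical layout: real and dual quaternion parts, each as w, x, y, z. */
typedef struct {
    float real[4];
    float dual[4];
} DualQuaternion;

/* Reads straight from the __constant buffer; the result lands in private
   memory, so the caller needs no temporary copy of the input. */
void conjugate_from_constant(PARAMS_AS const DualQuaternion *a,
                             DualQuaternion *rv)
{
    rv->real[0] = a->real[0];
    rv->dual[0] = a->dual[0];
    for (int i = 1; i < 4; ++i) {
        rv->real[i] = -a->real[i];
        rv->dual[i] = -a->dual[i];
    }
}
```

Inside the kernel this would be called as
conjugate_from_constant(&measurement->w2c, &rv). The drawback is that the
helper then only accepts __constant inputs, so a second variant is needed for
private data.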

My advice would be to pass the arguments to conjugate() by value and use
a return value. This avoids issues of address space matching
(i.e. declaring __constant args in conjugate()), and any half-way smart
compiler will generate equivalent code anyway.
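
A minimal sketch of the by-value variant suggested above, assuming a
hypothetical DualQuaternion layout (real and dual parts, each stored as
w, x, y, z) and a conjugate that negates the vector part of both quaternions.
The name conjugate_by_value is made up for illustration.

```c
/* Hypothetical layout: real and dual quaternion parts, each as w, x, y, z. */
typedef struct {
    float real[4];
    float dual[4];
} DualQuaternion;

/* Takes its argument by value and returns the result, so no pointer (and
   therefore no address space qualifier) appears in the signature. */
DualQuaternion conjugate_by_value(DualQuaternion a)
{
    for (int i = 1; i < 4; ++i) {
        a.real[i] = -a.real[i];
        a.dual[i] = -a.dual[i];
    }
    return a;
}
```

In the kernel the call would read
DualQuaternion rv = conjugate_by_value(measurement->w2c); the copy out of
__constant memory then happens implicitly when the argument is passed.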
My profiling shows that this is not the case. Passing const DualQuaternion* is roughly 20% faster than passing const DualQuaternion. Maybe I need to enable optimization? Would that be cl.Program.build(["-O2"])?

Thanks for your help, Andreas! I know personally how hard it is to keep up with support requests on projects, and you do it single-handedly for pyopencl and pycuda. It is much appreciated.

Kind Regards,
Holger Rapp


_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl