On Tue, 14 Feb 2012 09:51:03 +0100, Holger Rapp <[email protected]> wrote:
> > [snipped code snippet]
> >OpenCL 1.2 spec, 6.9b):
> >
> >An image type (image2d_t, image3d_t, image2d_array_t, image1d_t,
> >image1d_buffer_t or image1d_array_t) can only be used as the type of a
> >function argument. An image function argument cannot be modified. Elements
> >of an image can only be accessed using built-in functions described in
> >section 6.12.14. An image type cannot be used to declare a variable, a
> >structure or union field, an array of images, a pointer to an image, or the
> >return type of a function. An image type cannot be used with the __private,
> >__local and __constant address space qualifiers. The image3d_t type cannot
> >be used with the __write_only access qualifier unless the
> >cl_khr_3d_image_writes extension is enabled. An image type cannot be used
> >with the __read_write access qualifier which is reserved for future use.
> That clears that up. Much obliged for the answer. This leads to a direct
> followup though: is there a best practice for passing a variable number of
> images to a kernel?

Not sure that's possible. I think that number needs to be known
at compile time. (My response to that would be, well simply compile and
cache the kernel on-demand.)
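
A rough sketch of what I mean by compile-and-cache, assuming a single
context and 2D float images. The kernel name (sum_images), the argument
names (img0, img1, ...) and the kernel body are made up for illustration;
only the idea of generating one kernel per image count comes from the
discussion above.

```python
def make_kernel_source(n_images):
    """Return OpenCL C source for a kernel that takes n_images 2D images."""
    if n_images < 1:
        raise ValueError("need at least one image")
    # One image2d_t argument per image; the count is fixed in the source.
    image_args = ",\n".join(
        f"    __read_only image2d_t img{i}" for i in range(n_images)
    )
    # Hypothetical body: add up the first channel of every image per pixel.
    reads = "\n".join(
        f"    acc += read_imagef(img{i}, smp, pos).x;" for i in range(n_images)
    )
    return (
        "__constant sampler_t smp = CLK_NORMALIZED_COORDS_FALSE\n"
        "    | CLK_ADDRESS_CLAMP_TO_EDGE | CLK_FILTER_NEAREST;\n\n"
        f"__kernel void sum_images(\n{image_args},\n"
        "    __global float *out, const int width)\n"
        "{\n"
        "    int2 pos = (int2)(get_global_id(0), get_global_id(1));\n"
        "    float acc = 0.0f;\n"
        f"{reads}\n"
        "    out[pos.y * width + pos.x] = acc;\n"
        "}\n"
    )


# Built programs, keyed by image count (assumes one context throughout).
_program_cache = {}


def get_program(ctx, n_images):
    """Build the program for n_images on first use, then reuse it."""
    import pyopencl as cl  # imported lazily; only needed for the build

    if n_images not in _program_cache:
        src = make_kernel_source(n_images)
        _program_cache[n_images] = cl.Program(ctx, src).build()
    return _program_cache[n_images]
```

On the host side you would then call get_program(ctx, len(images)) and
pass the images as the leading kernel arguments.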

> >> 2) I pass the params_buf as __constant to my kernel. I have some
> >> functions doing arithmetic with DualQuaternions and I have to first copy
> >> all data from my structure before working with them: e.g.
> >>
> >> void conjugate(const DualQuaternion * a, DualQuaternion * rv);
> >>
> >> DualQuaternion rv;
> >> conjugate(&measurement->w2c, &rv);
> >> Gives this error:
> >> passing 'DualQuaternion __attribute__((address_space(2)))const *' discards 
> >> qualifiers, expected 'DualQuaternion const *'
> >>
> >> DualQuaternion temp = measurement->w2c;
> >> conjugate(&temp, &rv);
> >> is working okay.
> >>
> >> I understand the reason for this, I think: functions need to work in one
> >> address space only. But is there a way to pass my structures to my kernel
> >> so that the explicit copy is not needed?
> >
> >My advice would be to pass the arguments to conjugate() by value and use
> >a return value. This avoids issues of address space matching
> >(i.e. declaring __constant args in conjugate()), and any half-way smart
> >compiler will generate equivalent code anyway.
>
> My profiling shows that this is not the case. Passing const
> DualQuaternion* is roughly 20% faster than passing const
> DualQuaternion. Maybe I need to activate optimization or so? Would
> that be cl.Program.build(["-O2"])? 

Good question. If you're on Nvidia hardware, maybe start by looking at the
PTX (prg.binaries[0]).
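
Something along these lines would write the binaries to disk so you can
compare the two variants. The helper name (dump_binaries) and the file
naming are made up; the only thing it relies on is the .binaries attribute
of a built program, which holds PTX text on Nvidia and is vendor-specific
elsewhere.

```python
def dump_binaries(program, prefix="kernel", suffix=".ptx"):
    """Write each per-device binary of a built program to its own file."""
    paths = []
    for i, binary in enumerate(program.binaries):
        path = f"{prefix}_{i}{suffix}"
        # Binaries are normally bytes; encode defensively if given text.
        data = binary if isinstance(binary, bytes) else binary.encode()
        with open(path, "wb") as f:
            f.write(data)
        paths.append(path)
    return paths


# Usage sketch (assumes an existing context ctx and kernel source src):
#   prg = cl.Program(ctx, src).build(options=["-cl-mad-enable"])
#   print(dump_binaries(prg, prefix="by_pointer"))
```

As far as I know "-O2" is not among the build options the OpenCL spec
defines. Optimizations are on by default, and "-cl-opt-disable" turns them
off, so comparing builds with and without that flag may be more telling.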

Andreas
