Hi,
Just to let you know that the problem was indeed fixed using:
opencl/kernels/vector.hpp::avbv:
opencl_source(template, statements, binding_policy)
in opencl/vector_operations.hpp::avbv
enqueue(template, program, statements, binding_policy)
This is not the safest thing ever, since template, s
Hey,
There are no such generator objects. Actually, the generation of the source
for each kernel is independent. The interface of the generator is now much
clearer and really only requires
generate_opencl_source(template, list of statements corresponding to the
template, throws an exception if the
Hey,
> There is one program of around 1000 kernels. For comparison, the current
> vector operations program takes 2 seconds to compile on my laptop. Since
> the 1.6 seconds include all the flip and reciprocal kernels, but not the x =
> a*x + b*y -like operations, I suspect that this would be better in eve
Hi,
2014-05-28 10:25 GMT+02:00 Karl Rupp:
> Hi,
>
>> One more piece of information. Now that the test passes, I can safely
>> compare the JIT compilation times.
>> Without the optimized kernels: 1.6 seconds
>> With the optimized kernels: 6.5 seconds
>> I fear that this might become intractable w
Hi,
> One more piece of information. Now that the test passes, I can safely
> compare the JIT compilation times.
> Without the optimized kernels: 1.6 seconds
> With the optimized kernels: 6.5 seconds
> I fear that this might become intractable when further operations are
> added. I should probably find
One more piece of information. Now that the test passes, I can safely compare
the JIT compilation times.
Without the optimized kernels: 1.6 seconds
With the optimized kernels: 6.5 seconds
I fear that this might become intractable when further operations are added. I
should probably find a way to disable
Hello,
The integration of the kernel generator has been a nightmare! Anyway, I've
realized that thousands of kernels per scalartype are required, in order to
obtain optimal performance. Why so many?
- flip_a, reciprocal_a, flip_b and reciprocal_b each require their own kernel
- The generator interprets
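
To make the combinatorics concrete, here is a minimal sketch (not the actual generator) of how four independent on/off options multiply the kernel count for a single operation such as avbv. The function name enumerate_avbv_variants and the kernel naming scheme are assumptions made up for illustration; only the four option names come from the message above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical naming scheme, for illustration only: one kernel name per
// combination of the four boolean options listed in the message.
std::vector<std::string> enumerate_avbv_variants()
{
  std::vector<std::string> names;
  // Each of the 4 options is either off or on, so there are 2^4 = 16 combinations.
  for (unsigned mask = 0; mask < 16u; ++mask)
  {
    std::string name = "avbv";
    if (mask & 1u) name += "_flip_a";
    if (mask & 2u) name += "_reciprocal_a";
    if (mask & 4u) name += "_flip_b";
    if (mask & 8u) name += "_reciprocal_b";
    names.push_back(name);
  }
  // Sanity check on the count of generated variants.
  assert(names.size() == 16);
  return names;
}
```

Under these assumptions a single operation already needs 16 kernels, and every further independent option doubles that number again, which is one way the total can reach the thousands per scalar type.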