Hi Karl, the kernel is below. Regarding your second point, I would
love to process all columns in one kernel, but I want to avoid initializing
a second matrix of the same size. Instead, I am trying to initialize only
a vector whose length is the number of rows, which can then be assigned to
t
Hi Charles,
Can you please send us the kernel? Maybe there's something wrong with
the thread assignment there.
Also, rather than looping from 0 to P-1, it would make much more sense
to process all columns in parallel in a single kernel.
Best regards,
Karli
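
To illustrate the suggestion above of handling all columns in a single kernel, here is a minimal host-side C++ sketch that emulates a 2-D launch with one work-item per (row, column) pair. The actual kernel is not shown in this thread, so everything here is an assumption: process_element, kernel_body, and launch_all_columns are hypothetical names, the matrix is assumed to be row-major, and the per-element operation is assumed to depend only on the element itself. Under that last assumption the update can be done in place, so no second matrix is needed. In a real OpenCL kernel, row and col would come from get_global_id(0) and get_global_id(1).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-element operation; the real kernel body is not shown
// in the thread, so doubling the value is only a stand-in.
inline float process_element(float x) { return 2.0f * x; }

// Emulates what a single work-item would do in a 2-D kernel.
// In OpenCL, row and col would be get_global_id(0) and get_global_id(1).
void kernel_body(std::vector<float>& A, std::size_t rows, std::size_t cols,
                 std::size_t row, std::size_t col) {
    // Bounds guard: the global size is often padded beyond the matrix
    // dimensions, so surplus work-items must not touch memory.
    if (row >= rows || col >= cols) return;
    // In-place update; valid only because each element is independent
    // of all others (an assumption of this sketch).
    A[row * cols + col] = process_element(A[row * cols + col]);
}

// Host-side emulation of ONE launch over a padded 2-D index space,
// replacing a loop that would launch a separate kernel per column.
void launch_all_columns(std::vector<float>& A, std::size_t rows,
                        std::size_t cols, std::size_t global_rows,
                        std::size_t global_cols) {
    assert(A.size() == rows * cols);
    assert(global_rows >= rows && global_cols >= cols);
    for (std::size_t r = 0; r < global_rows; ++r)
        for (std::size_t c = 0; c < global_cols; ++c)
            kernel_body(A, rows, cols, r, c);
}
```

If the real operation needs values from other rows or columns, the in-place update is no longer safe and some temporary storage would be required.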
On 12/14/2016 06:01 PM, Charles Det
A quick addition: it also only seems to crash when the number of rows in
the input matrix matches or exceeds 1000 (i.e. it works with the trivial
example with 100 rows).
Charles
On Wed, Dec 14, 2016 at 10:55 AM, Charles Determan
wrote:
I have a function where I use a custom OpenCL kernel. The function is
below. It runs without problem and provides the correct result
the *first time* I call it. However, if I try to call the function
again, it crashes right after the 'initialized' output, where it is trying to
add