Rasmus Diederichsen <rasmusdiederich...@gmail.com> writes:
> Is it possible to use Reduction operations to reduce a 2-d array to a
> 1-d one, by e.g. computing the rowwise sum or some other operations?
> So far I haven't been successful.

No, ReductionKernel is not meant for that. Its role is to do global
reductions when there is *no* other source of concurrency available. In
your situation, you can still parallelize over the non-summed axis,
which will lead to vastly more efficient code. As a downside, there
isn't really canned code to do that. But check out

https://documen.tician.de/loopy/

It can help you write that kernel.

Andreas

_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
https://lists.tiker.net/listinfo/pycuda
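
For illustration, here is a minimal hand-written sketch of what "parallelize over the non-summed axis" could look like for a row-wise sum: one thread per row, each looping over its own columns. This is not loopy-generated code and not taken from the reply above. The names row_sum, row_sum_gpu, grid_size and the block size of 128 are assumptions made for the example, and the input is assumed to be a C-contiguous float32 array.

```python
# Hypothetical sketch: one CUDA thread per row, each summing its own columns.
KERNEL_SOURCE = r"""
__global__ void row_sum(const float *a, float *out, int n_rows, int n_cols)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n_rows)
        return;  // threads past the last row do nothing

    float acc = 0.0f;
    for (int col = 0; col < n_cols; ++col)
        acc += a[row * n_cols + col];  // row-major (C-order) indexing
    out[row] = acc;
}
"""


def grid_size(n_rows, block_size):
    """Number of thread blocks needed to cover n_rows (ceiling division)."""
    return (n_rows + block_size - 1) // block_size


def row_sum_gpu(a, block_size=128):
    """Return the row-wise sum of a 2-d array, computed on the GPU."""
    # Imported lazily so the kernel source can be inspected on a machine
    # without a CUDA device.
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a context on import)
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    a = np.ascontiguousarray(a, dtype=np.float32)
    n_rows, n_cols = a.shape
    out = np.empty(n_rows, dtype=np.float32)

    kernel = SourceModule(KERNEL_SOURCE).get_function("row_sum")
    kernel(
        drv.In(a), drv.Out(out), np.int32(n_rows), np.int32(n_cols),
        block=(block_size, 1, 1),
        grid=(grid_size(n_rows, block_size), 1),
    )
    return out
```

On a machine with a CUDA device, row_sum_gpu(a) should agree with a.sum(axis=1) up to float32 rounding. Note that with row-major storage, neighbouring threads in this sketch read addresses n_cols elements apart, so the loads are not coalesced; a tuned kernel, hand-written or generated, would likely arrange the memory access differently.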