Luke Pfister <[email protected]> writes:
> Is there a suggested way to do the equivalent of np.sum along a particular
> axis for a high-dimensional GPUarray?
>
> I saw that this was discussed in 2009, before GPUarrays carried stride
> information.

Hand-writing a kernel is probably still your best option. Just map the
non-reduction axes to the grid/thread block axes, and write a for loop
to do the summation.
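
As a rough illustration of that index mapping, here is a CPU-side Python
model of what each thread would compute. It is only a sketch, not PyCUDA
code: the function name sum_axis1, the fixed 3-D shape, the choice of
axis 1 as the reduction axis, and the C-contiguous layout are all
assumptions made for the example. Each (i, k) pair stands in for one GPU
thread, and the inner loop over j is the per-thread summation.

```python
def sum_axis1(flat, shape):
    """Sum a C-contiguous 3-D array, passed as a flat list, over axis 1.

    Hypothetical helper for illustration: it models on the CPU the
    indexing that a hand-written kernel would use.
    """
    n0, n1, n2 = shape
    # Element strides of a C-contiguous (n0, n1, n2) array (assumed layout).
    s0, s1, s2 = n1 * n2, n2, 1
    out = [0.0] * (n0 * n2)
    for i in range(n0):      # in a kernel: index from the grid/block y axis
        for k in range(n2):  # in a kernel: index from the grid/block x axis
            acc = 0.0
            for j in range(n1):  # the for loop over the reduction axis
                acc += flat[i * s0 + j * s1 + k * s2]
            out[i * n2 + k] = acc
    return out


# Usage: a (2, 3, 4) array holding 0..23, reduced over axis 1.
data = [float(x) for x in range(24)]
result = sum_axis1(data, (2, 3, 4))
```

In an actual kernel the two outer loops disappear, since every thread
derives its own (i, k) from its block and thread indices. Putting the
stride-1 axis on the x dimension means neighbouring threads read
neighbouring memory locations, which is usually what you want.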

HTH,
Andreas


