On Tue, Aug 25, 2009 at 10:10 PM, Andreas Klöckner <[email protected]> wrote:
> On Tuesday 25 August 2009, James Bergstra wrote:
>> Does PyCuda support broadcasting? How can I add a vector to the rows or
>> columns of a 2 or 3-dimensional GpuArray?
>> Related: does PyCuda support viewing of sub-regions of other
>> GpuArrays? Like, can I operate on just the first few rows or columns
>> of a matrix?
>
> Sub-region views are implemented in 1D, but no feature that requires "true"
> multidimensional arrays is implemented just yet. However, that functionality
> is definitely in the plan. If you need it sooner, patches are welcome.
>
> Andreas

I've been working on this sort of thing in my own corner, and was hoping
today that PyCUDA might already have done some of the optimization of
elementwise functions for different kinds of memory layouts and
broadcasting patterns.

It's not straightforward. It probably requires the expertise of a few
people to get the design right, so I'm reluctant even to try to put a
patch together.

First, it requires some changes to the data container. Some of the
issues that come up are:
- what should be the strides for broadcastable dimensions (I like 0,
  but numpy does it differently)
- should strides be in data-type units or byte units
- should strides and dimensions be stored in host memory, device
  memory, or both (how/when should they be synchronized?)

As the data structure gets more complicated, the kernels become more
complex too. My experience is that all kernels have to have a "general"
version that is pretty slow, and progressively, more and more special
cases get optimized. Kernel code generators get bloated.

How many kinds of kernels are there in PyCUDA right now? (Given that
the same code-generator can produce many elementwise kernels, I mean to
count that as one *kind* of kernel.) How many things would break if
arrays were strided?

James
-- 
http://www-etud.iro.umontreal.ca/~bergstrj

_______________________________________________
PyCUDA mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
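
To make the stride questions above concrete, here is a minimal sketch in plain Python of the "general, slow" elementwise add with 0 strides on broadcast dimensions. It is not PyCUDA code and does not reflect how PyCUDA or numpy store arrays; the helper names (c_strides, broadcast_strides, offset, add_broadcast) are made up for illustration, arrays are flat Python lists, and strides are in data-type (element) units rather than bytes.

```python
from itertools import product


def c_strides(shape):
    """Element-unit strides of a contiguous row-major array (hypothetical helper)."""
    strides = []
    step = 1
    for n in reversed(shape):
        strides.append(step)
        step *= n
    return tuple(reversed(strides))


def broadcast_strides(shape, strides, out_shape):
    """Strides for viewing `shape` as `out_shape`; broadcast dims get stride 0."""
    pad = len(out_shape) - len(shape)
    if pad < 0:
        raise ValueError("cannot broadcast to fewer dimensions")
    # Missing leading dimensions behave like length-1 (broadcast) dimensions.
    shape = (1,) * pad + tuple(shape)
    strides = (0,) * pad + tuple(strides)
    result = []
    for n, s, m in zip(shape, strides, out_shape):
        if n == m:
            result.append(s)   # ordinary dimension: keep its stride
        elif n == 1:
            result.append(0)   # broadcast dimension: every index maps to element 0
        else:
            raise ValueError("shapes are not broadcast-compatible")
    return tuple(result)


def offset(index, strides):
    """Flat position of a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides))


def add_broadcast(a, a_shape, b, b_shape):
    """General elementwise a + b, with b broadcast to a's shape."""
    a_str = c_strides(a_shape)
    b_str = broadcast_strides(b_shape, c_strides(b_shape), a_shape)
    out = [0] * len(a)
    for idx in product(*(range(n) for n in a_shape)):
        pos = offset(idx, a_str)
        out[pos] = a[pos] + b[offset(idx, b_str)]
    return out


matrix = [0, 1, 2, 3, 4, 5]                                 # shape (2, 3)
rows = add_broadcast(matrix, (2, 3), [10, 20, 30], (3,))    # add to every row
cols = add_broadcast(matrix, (2, 3), [100, 200], (2, 1))    # add to every column
```

With 0 strides the inner loop needs no special case for broadcasting: the same index arithmetic serves both operands. Byte-unit strides would only multiply each entry by the item size; the trade-off is one extra multiply or divide somewhere versus being able to describe views whose element spacing is not a multiple of the item size.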
