Hi Christian,

Christian Hacker <[email protected]> writes:
> So my question is this: does referencing a GPUArray from within a numpy
> array of objects entail some kind of ungodly overhead, and is there a
> *good* way to store a "jagged" GPUArray?
FWIW, I use object arrays with GPUArrays in them all the time, and they work just fine.

One thing to note is that a separate kernel is launched to perform arithmetic on each of the sub-arrays. As a result, if a sub-array is small enough that the kernel launch overhead is comparable to the cost of the operation itself, you will start seeing a performance impact. As a rule of thumb, once your sub-arrays reach around 10,000 elements or so, you should be OK. If your sub-arrays are smaller and you care about every last bit of performance, you will likely need to roll a custom solution that stores segment boundaries along with the array.

Hope that helps,
Andreas
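[Editor's note: the "segment boundaries" approach Andreas mentions could look roughly like the CPU-side sketch below, using plain numpy arrays to stand in for GPUArrays. The `JaggedArray` class name and its methods are hypothetical, not part of PyCUDA; the point is that one vectorized operation on the flat buffer replaces a per-sub-array kernel launch.]

```python
import numpy as np

class JaggedArray:
    """Hypothetical jagged-array sketch: all sub-arrays are concatenated
    into one flat buffer, with an offsets array marking segment bounds.
    On the GPU, `flat` would be a single GPUArray, so elementwise work
    costs one kernel launch for all segments combined."""

    def __init__(self, subarrays):
        # offsets[i]:offsets[i+1] delimits segment i in the flat buffer
        lengths = [len(a) for a in subarrays]
        self.offsets = np.concatenate(([0], np.cumsum(lengths))).astype(np.intp)
        self.flat = np.concatenate(subarrays)

    def __len__(self):
        return len(self.offsets) - 1

    def segment(self, i):
        # view of sub-array i, sliced out of the flat buffer
        return self.flat[self.offsets[i]:self.offsets[i + 1]]

    def scale(self, c):
        # one vectorized operation touches every segment at once --
        # the analogue of a single kernel launch instead of one per segment
        self.flat *= c
```

Usage would be along these lines: build it from a list of differently sized arrays, operate on `flat` in bulk, and slice segments back out via the offsets when needed.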
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
