On Sunday, 01 March 2009, you wrote:
> > What's the failure? If it's something non-intuitive, we should catch it
> > in PyCuda and give a nicer warning.
>
> The failure is that the wrong data is transferred to the kernel; it appeared
> to be something like the array transposed (which, needless to say, can be
> very bad, particularly if loop bounds are taken from corrupted memory).
numpy supports arbitrary strides in its arrays, which, among other things, can
make them column- or row-major (i.e. have Fortran or C order). GPUArray
currently has no stride support whatsoever. In the long run, having stride
support in GPUArray would likely be desirable; it would also let us support
indexing in the same way that numpy does.
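To illustrate what stride support would have to handle (a plain-numpy sketch, no PyCuda involved): the same buffer can hold C- or Fortran-ordered data, distinguished only by the strides.

```python
import numpy

# A 3x4 float32 array in C (row-major) order: stepping to the next row
# skips 4 elements * 4 bytes = 16 bytes; the next column skips 4 bytes.
c = numpy.arange(12, dtype=numpy.float32).reshape(3, 4)
print(c.strides)  # (16, 4)

# The same values in Fortran (column-major) order: now the next row is
# 4 bytes away, and the next column is 3 elements * 4 bytes = 12 bytes.
f = numpy.asfortranarray(c)
print(f.strides)  # (4, 12)

# Element-wise the arrays are equal; only the memory layout differs,
# which is exactly the information a GPUArray copy currently discards.
assert (c == f).all()
```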
Further, numpy allows many kinds of funky arrays (non-contiguous ones, for
example). PyCuda currently does very little to support these, but at least it
doesn't behave incorrectly:
>>> import pycuda.autoinit
>>> import pycuda.gpuarray as ga
>>> import numpy
>>> z = numpy.zeros((10,10), dtype=numpy.float32)
>>> ga.to_gpu(z[:,2:3])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kloeckner/src/env/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py", line 401, in to_gpu
    result.set(ary, stream)
  File "/home/kloeckner/src/env/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py", line 91, in set
    drv.memcpy_htod(self.gpudata, ary, stream)
TypeError: expected a single-segment buffer object
This is easy to work around for now--a simple .copy() on the slice makes it contiguous, and the transfer works.
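A minimal numpy-only sketch of why the workaround helps: the column slice is not a single-segment buffer, while its copy is.

```python
import numpy

z = numpy.zeros((10, 10), dtype=numpy.float32)

# The column slice shares z's buffer; consecutive elements sit one full
# row (40 bytes) apart, so it is not a single contiguous segment.
view = z[:, 2:3]
print(view.flags['C_CONTIGUOUS'])  # False

# .copy() materializes the slice into a fresh, packed buffer that a
# memcpy_htod-style transfer can handle.
packed = view.copy()
print(packed.flags['C_CONTIGUOUS'])  # True
```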
> Looks like C_CONTIGUOUS is what we're looking for. The numpy documentation
> mentions this and a possibly applicable function call:
> http://numpy.scipy.org/numpydoc/numpy-13.html#marker-59740
In a sense, PyCuda merely did what it was asked to do, which is to transfer
the numpy array in the exact layout it had on the host. On the one hand, I
intentionally transfer Fortran-layout arrays onto the GPU in some of my code,
and I think that's perfectly fine behavior.
On the other hand, you have a point in that, at present, none of the stride
information in the numpy array is preserved in a GPUArray copy, which means
that gpuarray.to_gpu(a).get() may do many funny things--only for C-contiguous
arrays will you get back out what you put in. This is a bug and needs to be
fixed, but the fix would likely be part of the stride implementation mentioned
above.
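The round-trip effect can be simulated with plain numpy (a hypothetical sketch, no GPU needed): copying an array's raw bytes and reinterpreting them in C order, which is effectively what the stride-unaware round trip does, hands back the transpose of a Fortran-ordered input.

```python
import numpy

a = numpy.asfortranarray(numpy.arange(6, dtype=numpy.float32).reshape(2, 3))

# Mimic a byte-for-byte transfer that discards stride information:
# dump the buffer as laid out in memory (order='A' keeps Fortran layout),
# then reinterpret it as a C-ordered array of the same shape.
raw = a.tobytes(order='A')
back = numpy.frombuffer(raw, dtype=numpy.float32).reshape(a.shape)

# The values come back in transposed order, not as the original array.
print((back == a).all())                     # False
print((back.ravel() == a.T.ravel()).all())   # True
```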
If, in the meantime, you want to phrase a warning for the documentation, I'd
be happy to merge that.
Andreas
_______________________________________________
PyCuda mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
