Hi Bogdan,
On Tue, 5 Jul 2011 19:17:52 +1000, Bogdan Opanchuk wrote:
> I just bumped into a certain problem with copying numpy arrays to
> the GPU. Consider the following code:
>
> ---
> import pycuda.autoinit
> import pycuda.gpuarray as gpuarray
> from pycuda.elementwise import ElementwiseKernel
> import numpy
>
> arr = numpy.random.randn(50, 50).astype(numpy.float32)
> arr_tr = arr.transpose()
> target = numpy.empty_like(arr)
>
> arr_gpu = gpuarray.to_gpu(arr_tr)
> target_gpu = gpuarray.to_gpu(target)
>
> copy = ElementwiseKernel(
>     "float *x, float *y",
>     "x[i] = y[i]")
> copy(target_gpu, arr_gpu)
>
> print numpy.linalg.norm(target_gpu.get() - arr_gpu.get())
> ---
>
> Here the array 'arr' is transposed, copied to the GPU and then copied
> to target_gpu via a simple elementwise kernel. Surprisingly, it turns
> out that there is a non-zero difference between target_gpu and
> arr_gpu.
>
> The explanation is that transpose() does not actually transpose the
> data --- it just swaps the strides of 'arr'. gpuarray.to_gpu() then
> copies the linear memory behind 'arr_tr' (which is unchanged) to
> 'arr_gpu', along with its strides. If we now call arr_gpu.get(), the
> result is equal to 'arr_tr', because the strides are copied back too.
> But the kernel knows nothing about the strides and copies the
> supposedly transposed data linearly. 'target_gpu' has the strides of
> 'arr', not 'arr_tr', so when it is copied back to the CPU, the same
> linear data is read in a different order, which is why norm() is
> non-zero.
>
> Now I must say that this approach to copying is logical in its own
> way --- but if we copy strides to GPUArray, one will have to pass
> them to kernels every time; besides, many kernels are optimized under
> the assumption of a 'normal' data layout. So I think it would be more
> convenient if to_gpu() performed flatten() (or something like this)
> before copying. Otherwise a warning in the documentation is
> necessary.
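For reference, the stride-swapping you describe can be demonstrated with
NumPy alone (no GPU required, so to_gpu() is left out of this sketch):

```python
# NumPy-only demonstration: transpose() swaps strides, not data.
import numpy

arr = numpy.random.randn(50, 50).astype(numpy.float32)
arr_tr = arr.transpose()

# Same buffer, swapped strides: (200, 4) vs (4, 200) for 50x50 float32.
assert arr.strides == (200, 4)
assert arr_tr.strides == (4, 200)
assert not arr_tr.flags['C_CONTIGUOUS']

# Reading arr_tr's buffer in memory order (which is what a raw,
# stride-oblivious copy sees) yields the *untransposed* data:
assert numpy.array_equal(arr_tr.ravel(order='K'), arr.ravel())

# flatten(), by contrast, really rearranges the data into C order:
assert numpy.array_equal(arr_tr.flatten(), arr.T.ravel())
```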
Ok, we should introduce a warning when to_gpu'ing arrays that are not
in C order. And probably also add a function
gpuarray.i_know_about_strides() to turn that warning off.
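A rough sketch of what that check might look like --- the function and
warning text here are placeholders, not existing PyCUDA API:

```python
# Hypothetical sketch of the proposed C-order check for to_gpu().
import warnings
import numpy

_stride_warning_enabled = True

def i_know_about_strides():
    """Silence the non-C-contiguous-array warning."""
    global _stride_warning_enabled
    _stride_warning_enabled = False

def check_c_order(ary):
    """Warn if 'ary' is not C-contiguous, since a raw buffer copy
    will see the data in memory order, not logical order."""
    if _stride_warning_enabled and not ary.flags['C_CONTIGUOUS']:
        warnings.warn("to_gpu() called on a non-C-contiguous array; "
                      "the device copy will see the data in memory order")

check_c_order(numpy.zeros((4, 4)))    # C order: no warning
check_c_order(numpy.zeros((4, 4)).T)  # transposed view: warns
```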
In addition, it's arguably an issue that ElementwiseKernel happily
copies between arrays with different strides. We could warn about
that, too, although it's a bit harder to tell where legitimate use
ends and abuse starts.
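Such a stride-consistency check might look roughly like this (again
just a sketch with a made-up helper name, not PyCUDA code):

```python
# Hypothetical layout check before launching an elementwise kernel.
import warnings
import numpy

def check_same_layout(*arrays):
    """Warn if arrays that a kernel will index linearly do not share
    the same shape and strides, and return whether they do."""
    first = arrays[0]
    for other in arrays[1:]:
        if other.shape != first.shape or other.strides != first.strides:
            warnings.warn("elementwise operation on arrays with "
                          "different layouts; the results will be "
                          "permuted relative to each other")
            return False
    return True

a = numpy.zeros((3, 4), dtype=numpy.float32)
assert check_same_layout(a, numpy.zeros_like(a))  # same layout: fine
assert not check_same_layout(a, a.T)              # mismatch: warns
```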
Can you make a patch for this?
Andreas
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda