Hi Frédéric,

On Fri, 15 Apr 2011 14:31:54 -0400, Frédéric Bastien <[email protected]> wrote:
> In the following case I get unexpected behavior from the
> gpuarray.to_gpu() function:
>
> import numpy
>
> import pycuda.autoinit
> import pycuda.gpuarray
>
> A = numpy.random.rand(3, 3)
> A_GPU = pycuda.gpuarray.to_gpu(A)
> # works as expected
> assert numpy.allclose(A_GPU.get(), A)
>
> AT = A.T
> AT_GPU = pycuda.gpuarray.to_gpu(AT)
> # FAIL!
> assert numpy.allclose(AT_GPU.get(), AT)
>
> The problem is that to_gpu() copies the memory buffer of the numpy
> array without checking its strides. This amounts to assuming that
> every numpy array is C-contiguous, which in the second case is not
> true.
>
> Is there a reason or explanation for this behavior?
>
> I know that gpuarrays don't have strides and as such are just a
> memory buffer with a shape attribute for convenience. I think that
> when the data is not C-contiguous on the CPU, PyCUDA should either
> 1) raise an error or 2) make a contiguous copy and use that for the
> transfer (optimizations are possible, but I won't go into those
> here).
Your test passes now (and has been added to PyOpenCL's and PyCUDA's unit tests). Both packages' device arrays now store (and restore, on the host side) stride information and will no longer willy-nilly copy arrays that are non-contiguous, on either the CPU or the GPU side. This is a start towards proper stride support on the compute device, which I hope both packages will eventually have.

Thanks for the report.

Andreas
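For anyone stuck on an older PyCUDA release without this fix, a minimal host-side workaround is to hand to_gpu() a C-contiguous copy of the array. The sketch below shows only the numpy side of that workaround (the pycuda transfer call is omitted so the snippet runs without a GPU); `AT_host` is a name chosen here for illustration:

```python
import numpy

A = numpy.random.rand(3, 3)

# A transpose is a strided view on the same buffer, not C-contiguous,
# so a raw buffer copy to the device would scramble its elements.
AT = A.T
assert not AT.flags['C_CONTIGUOUS']

# Workaround: materialize a C-contiguous copy before the transfer.
# On an affected PyCUDA, one would then call
# pycuda.gpuarray.to_gpu(AT_host) instead of to_gpu(AT).
AT_host = numpy.ascontiguousarray(AT)
assert AT_host.flags['C_CONTIGUOUS']
assert numpy.array_equal(AT_host, AT)
```

With the fix described above, the explicit copy is no longer needed: non-contiguous inputs are either handled or rejected rather than silently mis-copied.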
_______________________________________________ PyOpenCL mailing list [email protected] http://lists.tiker.net/listinfo/pyopencl
