Note that the to_device function also needs to be changed; sorry for the
haste/carelessness. I'm not sure how you want to do it: checking in the
function itself seems slow (it means going through the array's Python-level
.flags attribute), and I also don't know what the performance/accuracy cost
would be of passing the object itself, rather than the buffer cast, to the
memcpy function.
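For the record, the Python-side check is only a boolean flag lookup, not a
string comparison (the string form of .flags only appears when you print it).
A minimal sketch of such a guard, with ensure_c_contiguous as a hypothetical
helper name (not part of PyCuda):

```python
import numpy

def ensure_c_contiguous(ary):
    # Only numpy arrays carry layout flags; other buffer objects pass
    # through. ndarray.flags supports dict-style access returning a bool,
    # so no string handling is involved.
    if isinstance(ary, numpy.ndarray) and not ary.flags["C_CONTIGUOUS"]:
        raise ValueError("[memcpy] array is not C contiguous; "
                         "use ary.copy() or numpy.ascontiguousarray(ary)")
    return ary

z = numpy.zeros((10, 10), dtype=numpy.float32)
ensure_c_contiguous(z)   # freshly allocated arrays are C contiguous
view = z[:, 2:3]         # a strided view is not
try:
    ensure_c_contiguous(view)
except ValueError as e:
    print(e)
```

numpy.ascontiguousarray is a no-op for arrays that are already contiguous,
so callers who want an automatic fix rather than an error could use it in
place of the raise.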

regards,
Nicholas

On Sun, Mar 1, 2009 at 13:24, Nicholas Tung <[email protected]> wrote:

> It's good that non-linear arrays fail. As for non-contiguous ones, I think
> it would be much better to check (it's just a flag check) instead of relying
> on the user to read all of the documentation (I certainly don't).
>
> Since you probably want to quality control your source (and my sandbox is
> messy), attached is a suggested solution--rename memcpy_htod to something
> like memcpy_htod_unchecked (expose to the user), create the function below,
> and document it appropriately. Attached is a simple python test.
>
> void py_memcpy_htod(CUdeviceptr dst, py::object src, py::object stream_py)
>   {
>     PyArrayObject *arr = (PyArrayObject *)src.ptr();
>     // Only numpy arrays carry layout flags; other buffer objects pass
>     // through unchecked. PyArray_ISCONTIGUOUS tests the C-contiguity
>     // flag by name instead of a magic bitmask.
>     if (PyArray_Check(arr) && !PyArray_ISCONTIGUOUS(arr)) {
>         throw std::runtime_error("[memcpy] Array not C contiguous; "
>             "see \"Device Interface Reference Documentation\"");
>     }
>     py_memcpy_htod_unchecked(dst, src, stream_py);
>   }
>
> regards,
> Nicholas
>
>
> On Sun, Mar 1, 2009 at 12:12, Andreas Klöckner <[email protected]> wrote:
>
>> On Sunday, 01 March 2009, you wrote:
>> > > What's the failure? If it's something non-intuitive, we should catch
>> > > it in PyCuda and give a nicer warning.
>> >
>> > The failure is that the wrong data is transferred to the kernel; it
>> > appeared to be something like the array transposed (which, needless to
>> > say, can be very bad, particularly if loop bounds are taken from
>> > corrupted memory).
>>
>> numpy supports arbitrary strides in its arrays, which, among other
>> things, can make them column- or row-major (i.e. have Fortran or C
>> order). GPUArray currently has no stride support whatsoever. In the
>> long run, having stride support in GPUArray would likely be desirable.
>> Introducing strides would allow us to introduce indexing in the same
>> way that numpy allows.
>>
>> Further, numpy allows many types of funky arrays (non-contiguous, for
>> example). PyCuda currently does very little to support these funky
>> arrays, but at least it doesn't behave incorrectly:
>>
>> >>> import pycuda.autoinit
>> >>> import pycuda.gpuarray as ga
>> >>> import numpy
>> >>> z = numpy.zeros((10,10), dtype=numpy.float32)
>> >>> ga.to_gpu(z[:,2:3])
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/home/kloeckner/src/env/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py", line 401, in to_gpu
>>     result.set(ary, stream)
>>   File "/home/kloeckner/src/env/lib/python2.5/site-packages/pycuda-0.93beta-py2.5-linux-x86_64.egg/pycuda/gpuarray.py", line 91, in set
>>     drv.memcpy_htod(self.gpudata, ary, stream)
>> TypeError: expected a single-segment buffer object
>>
>> This is easy to work around for now--a simple .copy() and things work.
>>
>> > Looks like C_CONTIGUOUS is what we're looking for. The numpy
>> > documentation mentions this and a possibly applicable function call:
>> > http://numpy.scipy.org/numpydoc/numpy-13.html#marker-59740
>>
>> In a sense, PyCuda merely did what it was asked to do, which is to
>> transfer the numpy array in the exact same layout that it had on the
>> host. For one thing, I intentionally transfer Fortran-layout arrays
>> onto the GPU in some of my code, and I think that's perfectly fine
>> behavior.
>>
>> You have a point in that, at present, none of the stride information
>> in the numpy array is preserved in a GPUArray copy, which means that
>> gpuarray.to_gpu(a).get() may result in many funny things; only for
>> C-contiguous arrays will you get back out what you put in. This is a
>> bug and needs to be fixed, but the fix would likely be part of the
>> stride implementation cited above.
>>
>> If, in the meantime, you want to phrase a warning for the
>> documentation, I'd be happy to merge that.
>>
>> Andreas
>>
>>
>
_______________________________________________
PyCuda mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
