Hi,

2010/8/3 Andreas Kloeckner <[email protected]>

> First of all, I don't understand why you're developing a
> stride-supporting GPUArray separately from PyCUDA. (Does it depend on
> all of Theano?) I'm sure lots of people would want that
> functionality--including myself! Would you be willing to contribute this
> code to PyCUDA in some form?
>

Some history first. When we started Theano on the GPU, we wanted to use
CUBLAS, as many of our algorithms are bound by gemm operations. At that time,
it was not possible to mix the C driver API with the high-level API as it is
now. Otherwise we would have started directly with PyCUDA. The second reason
was that PyCUDA is missing a strided GPU array.

Now that Nvidia has fixed the C driver API vs. high-level API problem, we
want to use PyCUDA as our GPU back end. My goal right now is to make a
bridge that allows us to develop new Theano ops in PyCUDA while keeping our
old code base.

We don't have the time to make all the changes at once or in a short
time. So the first step that I see is to make sure our strided GPU array is
contiguous and then pass it to a Theano PyCUDA op. That is what I did, but it
required some changes to PyCUDA. I will redo it another way so that no
change to PyCUDA is needed.
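
A minimal sketch of the contiguity check such a bridge would need before
handing a buffer to code that expects a C-contiguous array. It is pure
Python and works only on shape and byte strides. The function names are
hypothetical and are not part of Theano or PyCUDA.

```python
def c_contiguous_strides(shape, itemsize):
    """Byte strides of a C-contiguous (row-major) array with this shape."""
    strides = []
    step = itemsize
    # The last axis varies fastest, so walk the shape from the end.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def is_c_contiguous(shape, strides, itemsize):
    """True if an array with these byte strides is laid out C-contiguously."""
    if 0 in shape:
        # An empty array has no elements, so any strides are acceptable.
        return True
    expected = c_contiguous_strides(shape, itemsize)
    # Axes of length 1 can carry any stride without changing the layout.
    return all(dim == 1 or s == e
               for dim, s, e in zip(shape, strides, expected))
```

If the check fails, the bridge would first copy the data into a contiguous
buffer and pass that copy to the op instead.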

A good next step would be to have a PyCUDA strided GPU array and start using
it in Theano. I'm happy to see that you would like something like that. What
if we plan it a little before we go there? I can try to put in a few hours
next week for it.

Here is my first guess at what we need to do: GPUArray needs a stride field
to support strides. Then we need to change the generated code to work with
strides too. I think we should make sure this array does not work with
current code that expects a C-contiguous array. That way there will be an
error instead of a bad result being returned. So I think GPUStridedArray
should not inherit from GPUArray. Also, I would rename the member gpudata to
stridedgpudata so that old code does not work. Can you think of a better
way to keep old code from working?
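
A rough sketch of the interface described above, assuming byte strides as
in NumPy. Only the names GPUStridedArray and stridedgpudata come from the
proposal. The constructor arguments, the offset field and the byte_offset
method are hypothetical, and no real device memory is involved.

```python
class GPUStridedArray(object):
    """Hypothetical strided GPU array; deliberately not a GPUArray subclass."""

    def __init__(self, stridedgpudata, shape, strides, itemsize, offset=0):
        # Named stridedgpudata, not gpudata, so that old code which assumes
        # a contiguous buffer fails with AttributeError instead of silently
        # reading the wrong elements.
        self.stridedgpudata = stridedgpudata
        self.shape = tuple(shape)
        self.strides = tuple(strides)  # in bytes, one entry per axis
        self.itemsize = itemsize
        self.offset = offset  # byte offset of element (0, ..., 0)

    def byte_offset(self, index):
        """Byte offset of the element at `index` from the buffer start."""
        if len(index) != len(self.shape):
            raise IndexError("expected %d indices" % len(self.shape))
        for i, dim in zip(index, self.shape):
            if not 0 <= i < dim:
                raise IndexError("index out of bounds")
        return self.offset + sum(i * s for i, s in zip(index, self.strides))


# Every second column of a 3x8 float32 buffer, viewed as a 3x4 array.
# None stands in for a device allocation in this sketch.
view = GPUStridedArray(stridedgpudata=None, shape=(3, 4),
                       strides=(32, 8), itemsize=4)
```

Generated kernels would compute the same per-element offset from the
strides rather than assuming a flat, contiguous index.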

After the interface is done, I could port some of our code generators to it
to give it many of the functions that GPUArray supports. But be warned that
the first version won't be optimal in many cases. We did not properly
optimize many cases, as they are not bottlenecks for us.

Do you have any comments/questions about the GPUStridedArray?

I understand that my patches won't be used by enough people to justify
putting them into the sensitive places they touch. As I said, I will redo my
current bridge without them.

Fred
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
