Hi,

2010/8/3 Andreas Kloeckner <[email protected]>:
> First of all, I don't understand why you're developing a
> stride-supporting GPUArray separately from PyCUDA. (Does it depend on
> all of Theano?) I'm sure lots of people would want that
> functionality--including myself! Would you be willing to contribute this
> code to PyCUDA in some form?
>

Some history. When we started Theano on the GPU, we wanted to use CUBLAS, as many of our algorithms are bound by gemm operations. At that time, it was not possible to mix the C driver API with the high-level API, as it is now. Otherwise we would have started directly with PyCUDA. The second reason was that PyCUDA is missing a strided GPU array.

Now that NVIDIA has fixed the C driver API vs. high-level API problem, we want to use PyCUDA as our GPU back end. My goal right now is to make a bridge that allows us to develop new Theano ops in PyCUDA while keeping our old code base. We don't have the time to make all the changes at once or in a short time. So the first step that I see is to make sure our strided GPU array is contiguous and then pass it to the Theano PyCUDA op. That is what I did, but it required some changes to PyCUDA. I will redo it another way so that no changes to PyCUDA are needed.

A good next step would be to have a PyCUDA strided GPU array and start using it in Theano. I'm happy to see that you would like something like that. What if we plan it a little before we go there? I can try to put in a few hours next week for it.

Here is my first guess at what we need to do: GPUArray needs a stride field to allow it to support strides. Then we need to change the generated code to work with strides too. I think this array should not work with existing code that expects a C-contiguous array. That way there will be an error instead of a wrong result being returned. So I think GPUStridedArray should not inherit from GPUArray. Also, I would rename the member gpudata to stridedgpudata, so that old code does not work with it. Can you think of a better way to keep old code from working with it?
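To make the proposal concrete, here is a minimal sketch in plain Python, with no GPU calls. Only the names GPUStridedArray and stridedgpudata come from the proposal above; the constructor signature, the helper c_contiguous_strides, and the is_c_contiguous property are assumptions for illustration, not existing PyCUDA or Theano API.

```python
def c_contiguous_strides(shape, itemsize):
    """Byte strides of a C-contiguous (row-major) array (hypothetical helper)."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


class GPUStridedArray:
    """Hypothetical strided GPU array; deliberately NOT a GPUArray subclass."""

    def __init__(self, shape, itemsize, stridedgpudata, strides=None):
        self.shape = tuple(shape)
        self.itemsize = itemsize
        # Named `stridedgpudata` rather than `gpudata`, so code written for
        # contiguous arrays fails with AttributeError instead of silently
        # reading memory with the wrong layout.
        self.stridedgpudata = stridedgpudata
        if strides is None:
            strides = c_contiguous_strides(self.shape, itemsize)
        self.strides = tuple(strides)

    @property
    def is_c_contiguous(self):
        # Dimensions of length 1 do not affect addressing, so their
        # strides are ignored in the comparison.
        expected = c_contiguous_strides(self.shape, self.itemsize)
        return all(
            dim == 1 or stride == exp
            for dim, stride, exp in zip(self.shape, self.strides, expected)
        )


# A 4x3 array of 4-byte elements, and a transposed view of the same memory.
a = GPUStridedArray((4, 3), itemsize=4, stridedgpudata=None)
t = GPUStridedArray((3, 4), itemsize=4, stridedgpudata=None, strides=(4, 12))
print(a.strides, a.is_c_contiguous)   # (12, 4) True
print(t.is_c_contiguous)              # False
print(hasattr(t, "gpudata"))          # False
```

A bridge like the one described earlier could check is_c_contiguous and copy to a contiguous buffer only when it is False, before handing the data to code that expects a plain GPUArray.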
After the interface is done, I could port some of our code generators to it, to give it many of the functions that GPUArray supports. But be warned that the first version won't be optimal in many cases. We have not properly optimized many cases, as they are not bottlenecks for us.

Do you have any comments or questions about the GPUStridedArray?

I understand that my patches won't be used by enough people to justify putting them in the sensitive places they touch. As I said, I will redo my current bridge without them.

Fred
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
