I resent it to the pyopencl list, as it failed the first time.

Fred
---------- Forwarded message ----------
From: Frédéric Bastien <[email protected]>
Date: 2011/4/22
Subject: Re: [PyCUDA] gpuarray.to_gpu() strange behavior
To: Andreas Kloeckner <[email protected]>
Cc: [email protected], [email protected]

Hi,

Sorry for the delay: we had a 22h blackout at our department, followed by one of our main switches being broken for 2 days...

Thanks for your modification. Here are some questions/comments:

I'm not sure I understand the idea in the to_gpu() function of putting the same strides on the GpuArray as on the host, while we copy the data to make it C contiguous before we transfer it. That way, the strides in the GpuArray won't represent what is on the gpu. In that case, I think we should set the strides to those of the contiguous version of the data, so that the strides on the GpuArray represent what is on the gpu, not the original strides.

When I talked about having a strided GpuArray, I was thinking about having a new object with a different attribute name for the gpudata attribute. The goal is to make it incompatible with current code unless that code is modified. The reason is that the current code doesn't check strides and will give a wrong answer if it receives a strided GpuArray. I would have preferred that an error be raised. A simple change to make the code work correctly with strided data would be to call a function that makes the data C contiguous and use that. A longer fix would be to use the stride information in the gpu code. For example, your current code generator for elemwise will simply ignore the stride information. I'm sure there is other code in pycuda and outside that needs modification so that it won't silently return a wrong result.

There will be a small workshop at the beginning of May on gpu computing for machine learning. I will try to make a proposal there and to get feedback from people there (Nicolas Pinto will attend it). The stride question is not the only one that needs to be answered.
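
To make the stride point concrete, here is a minimal sketch in plain NumPy, with no GPU involved. It only shows that a C contiguous copy of a strided host array has different strides from the original, and what a check that raises an error could look like. The helper name require_c_contiguous is hypothetical and is not part of pycuda or pyopencl.

```python
import numpy as np

# A non-contiguous host array: every second column of a C-ordered matrix.
host = np.arange(12, dtype=np.float32).reshape(3, 4)[:, ::2]

# Packing it into C contiguous memory (as done before a transfer)
# changes the strides, so the original strides no longer describe the buffer.
packed = np.ascontiguousarray(host)

print(host.strides)    # (16, 8)
print(packed.strides)  # (8, 4)


def require_c_contiguous(a):
    """Hypothetical guard: raise instead of silently misreading strided data."""
    if not a.flags.c_contiguous:
        raise ValueError("array is not C contiguous; strides = %r" % (a.strides,))
    return a
```

Code that cannot handle strides could either call such a guard, or call np.ascontiguousarray() on its input and work on the packed copy.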
When I have something written, I will send it to you, as you seem interested. I think we can compare the current Python gpu base array objects to the situation before numpy existed, i.e. there are many variants that are not directly compatible, but very close. I will just try to make a proposal that includes other variants at the same time, and try to have more people contributing to it.

Thanks

Frédéric Bastien

_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
