Hi Andreas, great to have you back. I should have said: the 100-item example was just there so a visual comparison could be made to confirm that the results were equivalent.
Using 10,000,000 items in the list the timing *swaps* around:

cumath took: 0.179667449951 seconds
ElementwiseKernel took: 0.327829986572 seconds

so whereas cumath was slower before, now it is faster. Up until about 100,000 items I observed the previous pattern; my bad for not extending the list. I don't understand the above result, so I guess I need to break out the profiler.

Could you confirm that my earlier assumptions are correct?

1) When a cumath operation is performed (e.g. cumath.sin()) the result isn't copied back from the GPU to the CPU.
2) If multiple cumath operations are applied in sequence to a piece of data, the data stays on the GPU (i.e. it doesn't have to be copied back to the CPU and then back to the GPU for each subsequent cumath operation).
3) Applying .get() to a gpuarray (e.g. "print sinop" in the code) is the only thing in the example below that causes the GPU memory to be copied back to the CPU.

The above is what I understand from looking at the code, but some of it is a bit beyond me; I just want to confirm that I understand what's happening behind the scenes with the auto-generated code from Python.

Cheers, Ian.

On 6 May 2010 23:21, Andreas Klöckner <li...@informa.tiker.net> wrote:
> On Donnerstag 06 Mai 2010, Ian Ozsvald wrote:
>> I've been speed testing some code to understand the complexity/speed
>> trade-off of various approaches. I want to offer my colleagues the
>> easiest way to use a GPU to get a decent speed-up without forcing
>> anyone to write C-like code if possible.
>
> A few comments:
>
> - If you use 100 floats, you only measure launch overhead. The GPU
> processing time will be entirely negligible. You need a couple million
> entries to generate even a small bit of load.
>
> - You might want to "warm up" each execution path before you take your
> timings, to account for code being compiled or fetched from disk.
>
> HTH,
> Andreas
>
> _______________________________________________
> PyCUDA mailing list
> PyCUDA@tiker.net
> http://lists.tiker.net/listinfo/pycuda

--
Ian Ozsvald (A.I. researcher, screencaster)
i...@ianozsvald.com

http://IanOzsvald.com
http://morconsulting.com/
http://TheScreencastingHandbook.com
http://ProCasts.co.uk/examples.html
http://twitter.com/ianozsvald