Hi Andreas, great to have you back.

I should have said: the 100-item example was only there so a visual
comparison could confirm that the results were equivalent.

With 10,000,000 items in the list the timing *swaps* around:
cumath took: 0.179667449951 seconds
ElementwiseKernel took: 0.327829986572 seconds
so whereas cumath was slower before, now it is faster. Up to about
100,000 items I saw the previous pattern; my bad for not extending
the list further.

I don't understand the above result, so I guess I need to break out the profiler.
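Thinking about your launch-overhead point below, I tried a toy cost model to convince myself the flip makes sense. The numbers here are made up purely for illustration (they are not my measured timings): each approach pays a fixed per-call overhead plus a per-item cost, and the winner depends on N.

```python
# Illustrative cost model, NOT real GPU timings: the overhead and
# per-item figures below are invented to show how the faster approach
# can flip as the item count grows.
def total_time(n_items, launch_overhead, per_item_cost):
    """Simple linear model: time = fixed overhead + n * per-item cost."""
    return launch_overhead + n_items * per_item_cost

# Hypothetical figures: approach A has low overhead but a higher
# per-item cost; approach B has high overhead but a lower per-item cost.
a = lambda n: total_time(n, launch_overhead=1e-4, per_item_cost=5e-8)
b = lambda n: total_time(n, launch_overhead=1e-2, per_item_cost=1e-8)

for n in (100, 100_000, 10_000_000):
    winner = "A" if a(n) < b(n) else "B"
    print("n=%10d: A=%.6fs  B=%.6fs  -> %s faster" % (n, a(n), b(n), winner))
```

With these particular made-up constants the crossover lands somewhere between 100,000 and 10,000,000 items, which at least matches the shape of what I observed.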

Could you confirm that my earlier assumptions are correct?
1) When a cumath operation is performed (e.g. cumath.sin()), the result
isn't copied back from the GPU to the CPU.
2) If multiple cumath operations are applied in sequence to a piece of
data, the data stays on the GPU (i.e. it doesn't have to be copied
back to the CPU and then back to the GPU for each subsequent cumath
operation).
3) Calling .get() on a gpuarray (e.g. the "print sinop" in the code) is
the only thing in the example below that causes GPU memory to be
copied back to the CPU.
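To make sure we're talking about the same thing, here is my mental model of those three points as a commented sketch (this assumes a CUDA-capable device and uses pycuda.autoinit; I haven't verified the comments are right, which is exactly what I'm asking):

```python
# Sketch of where (I believe) the host<->device copies happen.
# Requires a CUDA-capable GPU and a working PyCUDA install.
import numpy
import pycuda.autoinit            # creates a context on the first GPU
import pycuda.gpuarray as gpuarray
import pycuda.cumath as cumath

a = numpy.random.randn(10000000).astype(numpy.float32)

a_gpu = gpuarray.to_gpu(a)        # one explicit copy: CPU -> GPU
b_gpu = cumath.sin(a_gpu)         # result stays on the GPU (point 1)
c_gpu = cumath.exp(b_gpu)         # chained op, still no CPU round trip (point 2)

c = c_gpu.get()                   # the only GPU -> CPU copy (point 3)
print(c[:5])
```

If any of those comments are wrong, that would explain my confusion about the timings.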

The above is what I understand from looking at the code, but some of it
is a bit beyond me; I just want to confirm that I understand what's
happening behind the scenes with the code PyCUDA auto-generates from
Python.

Cheers,
Ian.

On 6 May 2010 23:21, Andreas Klöckner <li...@informa.tiker.net> wrote:
> On Donnerstag 06 Mai 2010, Ian Ozsvald wrote:
>> I've been speed testing some code to understand the complexity/speed
>> trade-off of various approaches. I want to offer my colleagues the
>> easiest way to use a GPU to get a decent speed-up without forcing
>> anyone to write C-like code if possible.
>
> A few comments:
>
> - If you use 100 floats, you only measure launch overhead. The GPU
>  processing time will be entirely negligible. You need a couple million
>  entries to generate even a small bit of load.
>
> - You might want to "warm up" each execution path before you take your
>  timings, to account for code being compiled or fetched from disk.
>
> HTH,
> Andreas
>
>
> _______________________________________________
> PyCUDA mailing list
> PyCUDA@tiker.net
> http://lists.tiker.net/listinfo/pycuda
>
>



-- 
Ian Ozsvald (A.I. researcher, screencaster)
i...@ianozsvald.com

http://IanOzsvald.com
http://morconsulting.com/
http://TheScreencastingHandbook.com
http://ProCasts.co.uk/examples.html
http://twitter.com/ianozsvald
