On Thursday 06 May 2010, Ian Ozsvald wrote:
> I've been speed testing some code to understand the complexity/speed
> trade-off of various approaches. I want to offer my colleagues the
> easiest way to use a GPU to get a decent speed-up without forcing
> anyone to write C-like code if possible.

A few comments:

- If you use 100 floats, you only measure launch overhead. The GPU
  processing time will be entirely negligible. You need a couple million
  entries to generate even a small bit of load.

- You might want to "warm up" each execution path before you take your
  timings, to account for code being compiled or fetched from disk.

HTH,
Andreas
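
The two comments above can be sketched as a small timing harness. This is only an illustration under stated assumptions, not code from the thread: `time_with_warmup` and `make_workload` are hypothetical names, and the workload is a pure-Python stand-in for whichever execution path is being measured. For a GPU path the timed callable would also have to wait for the device to finish before the clock stops, otherwise only the launch is measured.

```python
import time


def time_with_warmup(fn, warmup=1, repeats=5):
    """Return the best wall-clock time in seconds of fn() over `repeats` timed runs.

    Hypothetical helper: fn is first called `warmup` times without timing,
    so one-off costs (compilation, fetching from disk) stay out of the result.
    """
    for _ in range(warmup):
        fn()  # untimed warm-up call
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()  # a GPU path should synchronize with the device inside fn
        best = min(best, time.perf_counter() - start)
    return best


def make_workload(n):
    """Build a stand-in workload (sum of squares) over n floats.

    Assumption: replace this with the real execution path under test.
    """
    data = [float(i) for i in range(n)]

    def run():
        return sum(x * x for x in data)

    return run


if __name__ == "__main__":
    # 100 entries mostly measures call overhead; a couple million
    # entries generates actual load.
    for n in (100, 2_000_000):
        seconds = time_with_warmup(make_workload(n), warmup=1, repeats=3)
        print(f"n={n}: best of 3 = {seconds:.6f} s")
```

Taking the minimum over several runs is one possible design choice; it reduces the influence of unrelated system activity on the reported figure.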
_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda