Why should the overhead be measured separately? For users of these systems, the Python overhead is unavoidable. The time spent on the GPU alone is an important implementation detail for people improving systems like PyCUDA, but users see the overhead exposed in their overall application performance, so I don't see how it can be ignored.
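
To make this concrete, here's a rough sketch of the comparison I have in
mind (untested, and assuming a simple gpuarray doubling as the workload):
time the same operation once with CUDA events, which only see the GPU
timeline, and once with timeit-style wall-clock timing, which sees
everything the user actually pays for:

    import timeit

    import numpy as np
    import pycuda.autoinit  # creates a context on import
    import pycuda.driver as drv
    import pycuda.gpuarray as gpuarray

    for n in (1 << 10, 1 << 14, 1 << 18, 1 << 22):
        x = gpuarray.to_gpu(np.random.randn(n).astype(np.float32))

        # Warm up: the first launch includes elementwise-kernel compilation.
        y = 2 * x
        drv.Context.synchronize()

        # GPU-only time, via CUDA events on the GPU timeline.
        start, end = drv.Event(), drv.Event()
        start.record()
        y = 2 * x
        end.record()
        end.synchronize()
        gpu_ms = start.time_till(end)

        # Wall-clock time as Python sees it:
        # kernel + driver + Python overhead.
        def run():
            y = 2 * x
            drv.Context.synchronize()

        wall_ms = timeit.timeit(run, number=200) / 200 * 1e3

        print("n=%8d   events: %6.3f ms   wall clock: %6.3f ms"
              % (n, gpu_ms, wall_ms))

For small vectors the wall-clock number should be dominated by launch
overhead; as n grows, the two should converge.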
- bryan

On Wed, May 30, 2012 at 9:47 PM, Andreas Kloeckner <kloeck...@cims.nyu.edu> wrote:
> On Wed, 30 May 2012 20:31:40 -0700, Bryan Catanzaro <bcatanz...@acm.org> wrote:
>> Hi Igor -
>> I meant that it's more useful to know the execution time of code
>> running on the GPU from Python's perspective, since Python is the one
>> driving the work, and the execution overheads can be significant.
>> What timings do you get when you use timeit rather than CUDA events?
>> Also, what GPU are you running on?
>
> timeit isn't really the right way to measure this, I think. There's some
> amount of Python overhead, of course, and it should be measured
> separately (and of course reduced, if possible). Once that's done, see
> how long the GPU works on its part of the job for a few vector sizes,
> and then figure out the vector size above which the Python time is as
> long as the GPU time and see where that sits compared to your typical
> data size.
>
> That would be more useful, IMO.
>
> Andreas
>
> --
> Andreas Kloeckner
> Room 1105A (Warren Weaver Hall), Courant Institute, NYU
> http://www.cims.nyu.edu/~kloeckner/
> +1-401-648-0599