The visual profiler shows overlapping memory copies and kernel execution
for Working.py. You are probably sitting at your computer anyway, so if
you are in doubt, try it :D

(and this was one of my original questions ... how do you profile the
code if the profiler is obviously broken?)
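
One profiler-free way to check for overlap is to time the copy alone, the
kernel alone, and both together with CUDA events, then compare. The sketch
below is only an illustration under assumptions: overlap_fraction and
time_ms are hypothetical helper names (not part of PyCUDA), and the
enqueue callables are assumed to issue async copies and kernel launches on
non-default streams inside an already initialized context.

```python
def overlap_fraction(t_copy_ms, t_kernel_ms, t_both_ms):
    """Estimate overlap: 0.0 means fully serialized, 1.0 means perfect overlap.

    Hypothetical helper, not part of PyCUDA. It compares the combined time
    against the serialized sum and the ideal max of the two timings.
    """
    serial = t_copy_ms + t_kernel_ms
    ideal = max(t_copy_ms, t_kernel_ms)
    if serial == ideal:
        # One of the two timings is zero, so there is nothing to overlap.
        return 0.0
    frac = (serial - t_both_ms) / (serial - ideal)
    # Clamp, since measurement noise can push the ratio outside [0, 1].
    return min(1.0, max(0.0, frac))


def time_ms(enqueue):
    """Time the GPU work issued by enqueue() using CUDA events.

    Assumes a CUDA context already exists (for example via
    `import pycuda.autoinit`). PyCUDA is imported lazily so that
    overlap_fraction stays usable on a machine without a GPU.
    """
    import pycuda.driver as drv

    start, end = drv.Event(), drv.Event()
    drv.Context.synchronize()   # make sure nothing is still running
    start.record()
    enqueue()                   # caller issues memcpy_*_async / kernel launches
    drv.Context.synchronize()   # wait for all streams to drain
    end.record()
    end.synchronize()
    return start.time_till(end)  # elapsed time in milliseconds
```

With three measurements t_copy, t_kernel and t_both taken this way, a value
of overlap_fraction(t_copy, t_kernel, t_both) near 1.0 suggests the copy and
the kernel really ran concurrently, and a value near 0.0 suggests they were
serialized. Averaging over several loops should reduce timing noise.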

-Magnus

On Mon, Mar 21, 2011 at 8:04 PM, Andreas Kloeckner
<li...@buster.tiker.net> wrote:
> On Mon, 21 Mar 2011 19:55:31 +0100, Magnus Paulsson <paulsso...@gmail.com> 
> wrote:
>> > Wild theory: Maybe the print statements introduce GPU synchronization?
>> > Does your observation change with multiple loops through the code?
>> >
>> > Also note that the profiler won't help you debug overlap. If it is
>> > active, all GPU activity is synchronous.
>> >
>> > Andreas
>>
>> No. None of the above. The "Working.py" code runs with overlap under
>> the profiler, print statements included.
>
> CUDA 4.0 programming guide, 3.2.5.1:
>
> "When an application is run via a CUDA debugger or profiler (cuda-gdb, CUDA
> Visual Profiler, Parallel Nsight), all launches are synchronous."
>
> (and that sentence has been around for a few versions)
>
> Either you are or that sentence is wrong. :)
>
> Andreas
>
>



-- 

-----------------------------------------------
Magnus Paulsson
Assistant Professor
School of Computer Science, Physics and Mathematics
Linnaeus University
Phone: +46-480-446308
Mobile: +46-70-6942987

