Re: [PyOpenCL] Inconsistency in PyOpenCL profiling tool and Python Wall Clock when measuring kernel execution time

Jerome Kieffer Thu, 12 Jun 2014 11:34:41 -0700

On Thu, 12 Jun 2014 03:45:09 +0530
Abhilash Dighe <[email protected]> wrote:


> Hi,
> 
> I was hoping to get some insight into my observations. I am using PyOpenCL
> version 2 with an NVIDIA Tesla M2090 to run a kernel that applies the SHA1
> algorithm to variably sized data blocks. I'm running the same kernel
> repeatedly and trying to find its execution time, but I get different
> readings when I use PyOpenCL's profiling tool and when I use the standard
> Python time library. My code is structured as:
> 
> 
> hash_start = time.time()
> hash_event = prog.sha1(queue, shape, None, in_buf, out_buf, ..<other buffers>)
> hash_event.wait()
> hash_end = time.time()
> add_hash_CPU_time(hash_end - hash_start)
> add_hash_GPU_time(1e-9 * (hash_event.profile.end - hash_event.profile.start))
> 
> These are the results for a test case of size 3 GB. The kernel gets called
> 64 times and runs 12288 threads each time.
> 
> Total OpenCL profiling time = 1.56s
> Total CPU wall clock time = 13.79s
> 
> I need some help understanding what is causing this inconsistency, or
> whether I'm making a mistake in recording the data.

Is your GPU in persistence mode? (You can check with nvidia-smi.)
If not, loading and unloading the NVIDIA kernel driver can take several
seconds.
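
To see where the extra wall-clock time goes, it may help to look at all four
profiling timestamps of the event rather than only start and end. Below is a
minimal sketch, not a definitive diagnosis tool: the helper name
timing_breakdown and its dictionary keys are my own invention, and it only
does arithmetic on numbers you pass in.

```python
def timing_breakdown(queued_ns, submit_ns, start_ns, end_ns, wall_s):
    """Split one kernel launch into stages, all returned in seconds.

    The four *_ns arguments are OpenCL profiling timestamps in nanoseconds;
    wall_s is the host wall-clock time measured around the launch and wait().
    Hypothetical helper for illustration, not part of PyOpenCL.
    """
    ns = 1e-9
    return {
        # Time the command sat in the host-side queue before submission.
        "host_queue": (submit_ns - queued_ns) * ns,
        # Time between submission to the device and the kernel starting.
        "device_wait": (start_ns - submit_ns) * ns,
        # Pure kernel execution time (what end - start reports).
        "execution": (end_ns - start_ns) * ns,
        # Wall-clock time not covered by the queued-to-end span,
        # e.g. driver or Python overhead outside the profiled command.
        "unaccounted": wall_s - (end_ns - queued_ns) * ns,
    }


if __name__ == "__main__":
    # Made-up numbers, only to show the call.
    for stage, seconds in timing_breakdown(
            0, 1_000_000, 3_000_000, 5_000_000, 0.01).items():
        print(f"{stage:12s} {seconds:.6f} s")
```

With PyOpenCL the inputs would come from hash_event.profile.queued, .submit,
.start and .end, which are only available when the queue was created with
properties=cl.command_queue_properties.PROFILING_ENABLE. A large
"unaccounted" value would suggest that the time is spent outside the profiled
command, which is consistent with driver load/unload overhead but does not
prove it.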

Cheers,

-- 
Jérôme Kieffer
Data analysis unit - ESRF

_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
