On Thu, 12 Jun 2014 03:45:09 +0530 Abhilash Dighe <[email protected]> wrote:
> Hi,
>
> I was hoping to get some insight on my observations. I am using PyOpenCL
> version 2 with an NVIDIA Tesla M2090 to run my kernel, which runs the SHA1
> algorithm over variably sized data blocks. I'm trying to find the
> execution time for my kernel, but I'm getting different readings when I
> use PyOpenCL's profiling tool and when I use the standard Python time
> library. My code is structured as:
>
> hash_start = time.time()
> hash_event = prog.sha1( queue , shape , None , in_buf , out_buf , ..<other buffers> )
> hash_event.wait()
> hash_end = time.time()
> add_hash_CPU_time( hash_end - hash_start )
> add_hash_GPU_time( 1e-9 * ( hash_event.profile.end - hash_event.profile.start ) )
>
> These are the results for a test case of size 3 GB. The kernel gets called
> 64 times and runs 12288 threads each time.
>
> Total OpenCL profiling time = 1.56s
> Total CPU wall clock time = 13.79s
>
> I needed some help understanding what the cause of this inconsistency is,
> or whether I am making a mistake in recording the data.

Is your GPU in persistence mode? (Check with nvidia-smi.) If not, the
loading/unloading of the NVIDIA kernel driver can take several seconds.

Cheers,
-- 
Jérôme Kieffer
Data analysis unit - ESRF

_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
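
To narrow down where the extra wall-clock time goes, it may help to split the event's profiling counters into stages rather than looking only at end minus start. The following is a minimal sketch, assuming the command queue was created with profiling enabled and that the event exposes PyOpenCL's profile.queued, profile.submit, profile.start and profile.end counters (in nanoseconds). The helper name profile_breakdown and the fake event used to exercise it are hypothetical, and the numbers are made up for illustration.

```python
from types import SimpleNamespace


def profile_breakdown(event):
    """Split an OpenCL event's profiling counters into stages, in seconds.

    `event` is expected to look like a pyopencl.Event: event.profile carries
    integer nanosecond counters named queued, submit, start and end.
    """
    p = event.profile
    ns = 1e-9  # profiling counters are reported in nanoseconds
    return {
        "queued_to_submit": ns * (p.submit - p.queued),
        "submit_to_start": ns * (p.start - p.submit),
        "execution": ns * (p.end - p.start),
    }


# Stand-in for a real event so the sketch runs without a GPU (made-up numbers).
fake_event = SimpleNamespace(
    profile=SimpleNamespace(
        queued=0, submit=2_000_000, start=5_000_000, end=29_000_000
    )
)

stages = profile_breakdown(fake_event)
for name, seconds in stages.items():
    print(f"{name}: {seconds:.3f} s")
```

With a real event, any wall-clock time not covered by these three stages is spent outside the device timeline, for example in driver start-up or other host-side work.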
