Hi Andreas,

The important bits that seem to trigger the behaviour are:

import pyopencl as cl
import numpy as np

platform = cl.get_platforms()[0]
devs = platform.get_devices()
device1 = devs[0]

h_data = np.arange(81*9000).reshape(90,90,90).astype(np.float32,order='F')

ctx = cl.Context([device1])
queue = cl.CommandQueue(ctx)
queue2 = cl.CommandQueue(ctx)

h_image_shape=h_data.shape

with open('Kernel.cl', 'r') as f:
    fstr = f.read()
prg = cl.Program(ctx, fstr).build()

d_image = cl.Image(ctx, cl.mem_flags.READ_ONLY, cl.ImageFormat(cl.channel_order.INTENSITY, cl.channel_type.FLOAT), h_image_shape)
wev1 = cl.enqueue_copy(queue, d_image, h_data, is_blocking=False, origin=(0,0,0), region=h_image_shape)
prg.sum(queue2,(h_image_shape[0],),None,d_image,wait_for = [wev1])

The Kernel is doing some simple number crunching on the input image.

I'm measuring with nvvp; one result is attached, where you can clearly see that the kernel launches long before the copy has finished.
I already tested with OoO disabled: same behaviour.
Implementation is 'Tesla K10.G2.8GB' on 'NVIDIA CUDA'.
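
One way to cross-check the nvvp trace from inside PyOpenCL would be to enable profiling on both queues and compare the event timestamps. The following is only a minimal sketch under stated assumptions, not a confirmed fix: the helper names `copy_finished_before_kernel` and `check_ordering` are made up for illustration, the kernel name `sum` is taken from the snippet above, and it assumes that profiling timestamps from two queues on the same device are comparable. The `flush()` call is there because, as far as I understand the OpenCL specification, the queue that owns an event should be flushed before a command in another queue waits on that event.

```python
def copy_finished_before_kernel(copy_end_ns, kernel_start_ns):
    """Return True if the kernel began executing no earlier than the copy ended."""
    return kernel_start_ns >= copy_end_ns


def check_ordering(h_data, kernel_source, kernel_name="sum"):
    """Hypothetical helper: run copy + kernel on two queues and compare timestamps."""
    import pyopencl as cl  # imported lazily; requires an OpenCL platform and device

    ctx = cl.Context([cl.get_platforms()[0].get_devices()[0]])
    props = cl.command_queue_properties.PROFILING_ENABLE
    copy_queue = cl.CommandQueue(ctx, properties=props)
    work_queue = cl.CommandQueue(ctx, properties=props)

    prg = cl.Program(ctx, kernel_source).build()
    kernel = getattr(prg, kernel_name)

    fmt = cl.ImageFormat(cl.channel_order.INTENSITY, cl.channel_type.FLOAT)
    d_image = cl.Image(ctx, cl.mem_flags.READ_ONLY, fmt, h_data.shape)

    # Non-blocking host-to-device copy; h_data must stay alive until it completes.
    copy_evt = cl.enqueue_copy(copy_queue, d_image, h_data, is_blocking=False,
                               origin=(0, 0, 0), region=h_data.shape)
    # Submit the copy so that the other queue can wait on its event.
    copy_queue.flush()

    kernel_evt = kernel(work_queue, (h_data.shape[0],), None, d_image,
                        wait_for=[copy_evt])
    kernel_evt.wait()

    # Event profiling timestamps are reported in nanoseconds.
    return copy_finished_before_kernel(copy_evt.profile.end,
                                       kernel_evt.profile.start)
```

Calling `check_ordering(h_data, fstr)` with the array and kernel source from above should return True when the kernel started only after the copy had ended.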
Greets
Jonathan

On 2016-04-05 15:35, Andreas Kloeckner wrote:
"Schock, Jonathan" <jonathan.sch...@tum.de> writes:

Hi all,
I am not quite getting the function of the event system together with
non-blocking copies.

I want to enqueue a non-blocking copy function which returns an event
for a kernel to wait on:

wev1 = cl.enqueue_copy(IOqueue, device_image, host_image,
                       is_blocking=False, origin=(0,0,0), region=host_image_shape)
program.kernel_name(workqueue, (work_group_shape,), None, device_image,
                    wait_for=[wev1])

Both queues are defined as OoO queues in the same context on the same
device.
In my profiling I can see the start of the kernel, before the copying is
finished. Does that mean, I have to use blocking copies,
or am I doing something else wrong?

How are you measuring? What implementation is this on? (Only Intel CPU
supports OoO queues as of now, as far as I know.) Can you show code to
reproduce? FWIW, your code snippet looks correct to me, in the sense
that the kernel should see all results of the copy.

Andreas
_______________________________________________
PyOpenCL mailing list
PyOpenCL@tiker.net
https://lists.tiker.net/listinfo/pyopencl
