On Sun, 25 Dec 2011 06:28:02 -0800, Lewis Anderson <[email protected]> 
wrote:
> Hello,
> 
> I'm working on a neural network-based project which relies heavily on
> array operations. I have moved from Numpy to PyOpenCL in order to
> speed up these operations, and gotten great results (3.7x speedup on
> my laptop). I'm looking forward to even better results when I move to
> a better graphics card. However, in order to get optimal performance,
> I want to properly handle asynchronous behavior, but I am not sure how
> to do that when using built-in Array functions (+, -, /, abs(), fill(),
> etc).
> 
> I have defined several custom kernels, and used them successfully,
> along with some more primitive operations. These custom kernels all
> return event objects, which I can then use with the wait_for argument
> to synchronize execution. However, it seems that the only way to do
> this with the built-in functions is by using queue.finish(), since
> they do not return event objects. Is there a more sophisticated way
> to do so?

That depends on a few things. First, I would not recommend using
PyOpenCL arrays with out-of-order queues (though I also know of no GPU CL
implementation that actually supports such queues). With an in-order
queue, you can simply call pyopencl.enqueue_marker(queue) to get an
event that will trigger once all previously enqueued array operations
have finished. If you're using 2011.2, enqueue_marker also takes a wait_for
argument (on older versions you'd use the now-deprecated
enqueue_wait_for_events). Does that fit the bill?

Andreas


_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
