On Tuesday 04 August 2009, Ahmed Fasih wrote:
> Andreas, I was also just starting to look at zero-copy memory, and I
> have a quick question.
>
> First, is there a PyCUDA equivalent of (in C)
> cudaSetDeviceFlags(cudaDeviceMapHost)?

http://documen.tician.de/pycuda/driver.html#pycuda.driver.Device.make_context
http://documen.tician.de/pycuda/driver.html#pycuda.driver.ctx_flags

(In particular, this also means you shouldn't use pycuda.autoinit.)
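A minimal sketch of what that could look like, assuming the MAP_HOST member of ctx_flags is the counterpart of cudaDeviceMapHost. The helper name make_mapped_context and the device_index parameter are made up for illustration and are not part of PyCUDA:

```python
def make_mapped_context(device_index=0):
    """Create a context in which page-locked host memory can be mapped.

    Hypothetical helper: it replaces pycuda.autoinit, which would create
    a context with default flags.
    """
    import pycuda.driver as drv  # imported lazily so the sketch loads without a GPU

    drv.init()
    dev = drv.Device(device_index)

    # Not every device supports mapped (zero-copy) host memory.
    if not dev.get_attribute(drv.device_attribute.CAN_MAP_HOST_MEMORY):
        raise RuntimeError("device cannot map host memory")

    # Assumption: MAP_HOST plays the role of cudaDeviceMapHost here.
    flags = drv.ctx_flags.SCHED_AUTO | drv.ctx_flags.MAP_HOST
    return dev.make_context(flags)


# Usage (needs a CUDA-capable GPU):
#   ctx = make_mapped_context()
#   try:
#       ...  # allocate mapped memory, launch kernels
#   finally:
#       ctx.pop()
```

Since nothing cleans up automatically in this style, the context should be popped explicitly once the work is done.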

> Then, after creating an array with
> pycuda.driver.pagelocked_empty(shape, dtype,
> mem_flags=pycuda.driver.host_alloc_flags.DEVICEMAP), how exactly would
> I use pycuda.driver.HostAllocation.get_device_pointer() to get me the
> pointer in host memory to pass into my kernel?
>
> I see in test_driver.py that you use Out() and In(), I tried this with
> the pagelocked array above but my computation time didn't change---of
> course, zero-copy memory might not be beneficial to my application
> (though the suggested conditions are met: read or write only once,
> fully coalesced). But I wonder if Out() and In() are copying the
> arrays between device memory or if they're actually using the host
> pagelocked memory?

In() and Out() always copy. That's not what you want. Instead, you want to 
pass just a pointer-sized integer: numpy.intp(whatever). You can avoid the 
explicit cast entirely if you use the prepared call style.
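As a rough sketch of the numpy.intp variant: the kernel double_it and the helpers blocks_needed and double_in_place_zero_copy are invented for illustration. It assumes a context created with the MAP_HOST flag is already current, and that the array returned by pagelocked_empty exposes its HostAllocation as .base:

```python
def blocks_needed(n, block_size):
    """Number of thread blocks required to cover n elements."""
    return (n + block_size - 1) // block_size


def double_in_place_zero_copy(n=1024, block_size=256):
    """Double n floats in mapped host memory without In()/Out() copies."""
    import numpy as np
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    # Assumption: a context with ctx_flags.MAP_HOST is already current.
    mod = SourceModule("""
    __global__ void double_it(float *a, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            a[i] *= 2.0f;
    }
    """)
    double_it = mod.get_function("double_it")

    # Page-locked host array that the device is allowed to map.
    a = drv.pagelocked_empty(
        n, np.float32, mem_flags=drv.host_alloc_flags.DEVICEMAP)
    a[:] = np.arange(n, dtype=np.float32)

    # Assumption: a.base is the HostAllocation backing the array.
    dev_ptr = a.base.get_device_pointer()

    # Pass the pointer as a pointer-sized integer instead of In()/Out().
    double_it(np.intp(dev_ptr), np.int32(n),
              block=(block_size, 1, 1),
              grid=(blocks_needed(n, block_size), 1))

    # The kernel writes straight into host memory; wait before reading it.
    drv.Context.synchronize()
    return a
```

On a device that supports mapped memory, the returned array should hold the doubled values without any explicit transfer having been issued.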

> Thanks for any hints. Once I get this working, I'll submit a test for
> test_driver.py that verifies zero-copy memory if there's interest.

Absolutely! If you need any more help, please don't hesitate to ask.

Andreas


_______________________________________________
PyCUDA mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
