On Tuesday 04 August 2009, Ahmed Fasih wrote:
> Andreas, I was also just starting to look at zero-copy memory, and I
> have a quick question.
>
> First, is there a PyCUDA equivalent of (in C)
> cudaSetDeviceFlags(cudaDeviceMapHost)?

http://documen.tician.de/pycuda/driver.html#pycuda.driver.Device.make_context
http://documen.tician.de/pycuda/driver.html#pycuda.driver.ctx_flags

(In particular, this also means you shouldn't use pycuda.autoinit.)

> Then, after creating an array with
> pycuda.driver.pagelocked_empty(shape, dtype,
> mem_flags=pycuda.driver.host_alloc_flags.DEVICEMAP), how exactly would
> I use pycuda.driver.HostAllocation.get_device_pointer() to get me the
> pointer in host memory to pass into my kernel?
>
> I see in test_driver.py that you use Out() and In(), I tried this with
> the pagelocked array above but my computation time didn't change---of
> course, zero-copy memory might not be beneficial to my application
> (though the suggested conditions are met: read or write only once,
> fully coalesced). But I wonder if Out() and In() are copying the
> arrays between device memory or if they're actually using the host
> pagelocked memory?

In() and Out() always copy. That's not what you want. Instead, you want to
pass just a pointer-sized integer: numpy.intp(whatever). You can avoid the
explicit cast entirely if you use the prepared call style.

> Thanks for any hints. Once I get this working, I'll submit a test for
> test_driver.py that verifies zero-copy memory if there's interest.

Absolutely! If you need any more help, please don't hesitate to ask.

Andreas
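
For readers following along, here is a minimal sketch of how the pieces mentioned above might fit together: a context created by hand with the MAP_HOST flag (instead of pycuda.autoinit), a DEVICEMAP pagelocked array, and the device pointer passed as numpy.intp rather than through In()/Out(). The kernel, the function name zero_copy_double, and the sizes are made up for illustration. It also assumes the allocation object is reachable through the array's .base attribute, which may differ between PyCUDA versions, so treat this as a starting point rather than a reference implementation.

```python
import numpy as np

# Hypothetical kernel, used only for this illustration.
KERNEL_SOURCE = """
__global__ void double_in_place(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= 2.0f;
}
"""


def zero_copy_double(n=1024):
    """Double n floats in mapped host memory; needs a CUDA-capable GPU."""
    # PyCUDA is imported here so that merely defining this function
    # does not require a GPU or a PyCUDA installation.
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    drv.init()
    dev = drv.Device(0)
    if not dev.get_attribute(drv.device_attribute.CAN_MAP_HOST_MEMORY):
        raise RuntimeError("device cannot map host memory")

    # Create the context by hand so MAP_HOST can be requested;
    # pycuda.autoinit would create one without this flag.
    ctx = dev.make_context(
        flags=drv.ctx_flags.SCHED_AUTO | drv.ctx_flags.MAP_HOST)
    try:
        host = drv.pagelocked_empty(
            n, np.float32, mem_flags=drv.host_alloc_flags.DEVICEMAP)
        host[:] = np.arange(n, dtype=np.float32)

        # Assumption: the array's .base is the host allocation object
        # that provides get_device_pointer().
        dev_ptr = np.intp(host.base.get_device_pointer())

        func = SourceModule(KERNEL_SOURCE).get_function("double_in_place")
        block = 256
        grid = (n + block - 1) // block
        # Pass the raw pointer; In()/Out() would copy instead.
        func(dev_ptr, np.int32(n), block=(block, 1, 1), grid=(grid, 1))

        # Wait for the kernel before reading the mapped memory on the host.
        drv.Context.synchronize()
        result = host.copy()
        del host  # release the mapped allocation while the context is active
        return result
    finally:
        ctx.pop()
```

On a machine with a suitable GPU, calling zero_copy_double(8) should return the input values doubled, with no explicit host-to-device or device-to-host copy in the script.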
