[PyCUDA] General question about CUDA compiler and early returns

2015-10-26 Thread Walter White
Hello,

I have a question and hope that you can help me.
I am trying to find the bottleneck in my code, but I can't get a
grip on it at the moment.

For a while I thought it was the writes to global memory.
At the moment I am using an early "return" statement in my
code to skip parts of it, e.g. a for-loop.

Now I am wondering whether this works at all.
Could it be that the code effectively exits well before
the "return" statement, because the compiler recognizes that
the results computed in a for-loop are neither written to
global memory nor used anywhere else?
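
One way to look into this is to compile two variants of a kernel to PTX
and compare the listings. The sketch below is only an illustration under
assumptions: the kernel "work", its parameters, and the helpers
make_source, compile_to_ptx and report are made up for this example and are
not taken from the code in question. It assumes a working nvcc and that
pycuda.compiler.compile accepts target="ptx". If the compiler drops a loop
whose result is never stored, the variant without the store should produce
noticeably shorter PTX.

```python
# Hypothetical kernel: the loop result reaches global memory only when the
# /*STORE*/ placeholder is replaced by an actual write.
KERNEL_TEMPLATE = r"""
__global__ void work(float *out, int n, int skip)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n || skip)
        return;                      // early return under test

    float acc = 0.0f;
    for (int k = 0; k < 1000; ++k)   // candidate for dead-code elimination
        acc += sinf(acc + k);
    /*STORE*/
}
"""


def make_source(store_result):
    """Return the kernel source with or without the global-memory write."""
    store = "out[i] = acc;" if store_result else ""
    return KERNEL_TEMPLATE.replace("/*STORE*/", store)


def compile_to_ptx(source):
    """Compile CUDA source to PTX text (needs PyCUDA and nvcc)."""
    # Imported lazily so the helpers above work without a CUDA installation.
    from pycuda.compiler import compile as cuda_compile
    return cuda_compile(source, target="ptx").decode()


def report():
    """Print the PTX length of both variants for a rough comparison."""
    for store_result in (True, False):
        ptx = compile_to_ptx(make_source(store_result))
        print("store_result=%s: %d lines of PTX"
              % (store_result, len(ptx.splitlines())))
```

Calling report() on a machine with CUDA prints one line per variant. PTX is
still an intermediate form, so this is a rough indication rather than proof
of what the final machine code does.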

Kind regards,
Joe
___
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda


[PyCUDA] Questions on pinned memory

2015-10-05 Thread Walter White
Hello,

I have a question about pinned memory and hope that you can help me.

I found out that copying data from device to host accounts for
a large part of my runtime, so I read up on the issue
and came across "pinned memory".

There are several examples on the mailing list but I am not
sure if I am doing this the right way.

Do I need to initialize the context with drv.ctx_flags.MAP_HOST,
or is this enabled automatically if one of the
functions below is used?

drv.init()
dev = drv.Device(0)
ctx = dev.make_context(drv.ctx_flags.SCHED_AUTO | drv.ctx_flags.MAP_HOST)


Is drv.mem_host_register_flags.DEVICEMAP also needed if
the context is initialized with drv.ctx_flags.MAP_HOST?

I found several methods that should do this
but none of them seems to work.
Are they all equivalent?

--
x = drv.register_host_memory(x, flags=drv.mem_host_register_flags.DEVICEMAP)
x_gpu_ptr = np.intp(x.base.get_device_pointer())
--
x = drv.pagelocked_empty(shape=x.shape, dtype=np.float32,
                         mem_flags=drv.mem_host_register_flags.DEVICEMAP)
--
from pycuda.tools import PageLockedMemoryPool
pool = PageLockedMemoryPool()
x_ptr = pool.allocate(dest.shape, np.float32)
--
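
For comparison, here is a minimal sketch of how a mapped, page-locked
buffer might be used end to end. It rests on assumptions: the kernel
"twice" and the helpers grid_size and double_in_mapped_memory are
hypothetical, device 0 is assumed to support mapped host memory, and the
flag passed to pagelocked_empty is drv.host_alloc_flags.DEVICEMAP, which
appears to be the enum documented for that function
(mem_host_register_flags belongs to register_host_memory).

```python
def grid_size(n, block):
    """Number of blocks needed to cover n elements (ceiling division)."""
    return (n + block - 1) // block


def double_in_mapped_memory(n=1024, block=256):
    """Double n floats in place through a mapped, page-locked host buffer."""
    # Imported lazily so this file can be loaded on a machine without CUDA.
    import numpy as np
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    drv.init()
    ctx = drv.Device(0).make_context(
        drv.ctx_flags.SCHED_AUTO | drv.ctx_flags.MAP_HOST)
    try:
        # Page-locked host array that is also mapped into the device
        # address space.
        x = drv.pagelocked_empty(n, np.float32,
                                 mem_flags=drv.host_alloc_flags.DEVICEMAP)
        x[:] = np.arange(n, dtype=np.float32)
        x_gpu_ptr = np.intp(x.base.get_device_pointer())

        mod = SourceModule("""
        __global__ void twice(float *x, int n)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n)
                x[i] *= 2.0f;
        }
        """)
        twice = mod.get_function("twice")
        twice(x_gpu_ptr, np.int32(n),
              block=(block, 1, 1), grid=(grid_size(n, block), 1))

        # The kernel wrote straight into host memory, so no memcpy_dtoh is
        # needed here; only wait for the kernel to finish.
        drv.Context.synchronize()
        return x.copy()
    finally:
        ctx.pop()
```

With mapped memory the device accesses the host buffer directly, so whether
this is faster than an explicit copy depends on the access pattern and
would need to be measured.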


If I use
np.intp(x.base.get_device_pointer())
and
drv.memcpy_dtoh(a_gpu, x_ptr)

I get the following error message:

"BufferError: Object is not writable."

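For reference, drv.memcpy_dtoh(dest, src) expects the writable host buffer
first and the device pointer second. The error above would be consistent
with the first argument not being a writable host buffer, but that is an
assumption, since it is not shown how a_gpu was created. The sketch below
uses two hypothetical helpers, is_writable_buffer and copy_to_host, to
illustrate the expected argument order.

```python
def is_writable_buffer(obj):
    """Return True if obj exposes a writable buffer (e.g. a NumPy array)."""
    try:
        return not memoryview(obj).readonly
    except TypeError:
        # Object does not support the buffer protocol at all.
        return False


def copy_to_host(dev_ptr, shape, dtype):
    """Copy device data into a new page-locked host array.

    Assumes a CUDA context is already active and that dev_ptr refers to
    device memory holding at least shape * dtype bytes.
    """
    # Imported lazily so is_writable_buffer works without a CUDA installation.
    import pycuda.driver as drv

    host = drv.pagelocked_empty(shape, dtype)
    assert is_writable_buffer(host)
    # Destination (host buffer) first, source (device pointer) second.
    drv.memcpy_dtoh(host, int(dev_ptr))
    return host
```
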
Kind regards,
Joe