Looking through the PyCUDA source, it seems the GIL is held while performing some of the memcpy operations: memcpy_htod, memcpy_dtoh, and their async counterparts.
We noticed that host copies were taking quite a bit of time, and we were hoping to run some background operations while they are in flight. Is holding the GIL intentional, or could we safely change these functions to release it?

Alternatively, I was thinking of doing:

    x = ndarray(...)
    GPUArray.to_gpu_async(x, stream)
    while not stream.is_done():
        time.sleep(0)

But this is a bit convoluted.

Thanks,
-- R
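For reference, the polling idea above could be wrapped in a small helper. This is only a sketch under a few assumptions: wait_while_yielding is a hypothetical name and not part of PyCUDA, and a threading.Event stands in for the stream so the example runs without a GPU. With PyCUDA, the intent would be to pass stream.is_done as the callable, which is untested here.

```python
import threading
import time


def wait_while_yielding(is_done, poll_interval=0.0, timeout=None):
    """Poll is_done() until it returns True, sleeping between polls.

    time.sleep() releases the GIL, so other Python threads get a chance
    to run between polls. Returns True once is_done() is True, or False
    if the optional timeout (in seconds) expires first.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not is_done():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


# Stand-in for an in-flight copy: an Event that is set after 200 ms.
# Assumption: with PyCUDA, stream.is_done would take the place of
# copy_finished.is_set below.
copy_finished = threading.Event()
timer = threading.Timer(0.2, copy_finished.set)

background_ticks = 0


def background_work():
    """Hypothetical background task that runs while the copy is pending."""
    global background_ticks
    while not copy_finished.is_set():
        background_ticks += 1
        time.sleep(0.001)


worker = threading.Thread(target=background_work)
timer.start()
worker.start()

# The main thread polls; the worker thread makes progress in the meantime.
completed = wait_while_yielding(copy_finished.is_set, timeout=5.0)

worker.join()
timer.join()
```

Note that this only helps if the call that enqueues the copy returns promptly. If the GIL is held for the duration of the copy itself, the polling loop will not start until the copy has already finished.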
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
