On 30/01/14 22:40, Andreas Kloeckner wrote:
> Freddie Witherden <[email protected]> writes:
>> This came up a while ago on the mpi4py mailing list [1] and, with CUDA 6
>> bringing unified virtual memory, it may become more important in the
>> future.
>>
>> It would be nice if PyCUDA device allocations provided a method for
>> creating a suitable Python buffer object, e.g., myalloc.asbuf(offset=0,
>> sz=...). This would provide a clean way for PyCUDA and mpi4py to
>> interoperate (as the wrapped MPI functions expect Python buffer objects
>> as opposed to raw pointers).
>>
>> Currently one must either resort to writing a C extension or hacks such as:
>>
>>     import ctypes as ct
>>
>>     make_buf = ct.pythonapi.PyBuffer_FromMemory
>>     make_buf.argtypes = [ct.c_void_p, ct.c_ssize_t]
>>     make_buf.restype = ct.py_object
>>
>>     cubuf = cuda.mem_alloc(SZ*8)
>>     pybuf = make_buf(long(cubuf), SZ*8)
>
> Done:
>
> http://documen.tician.de/pycuda/driver.html#pycuda.driver.DeviceAllocation.as_buffer
Thank you for this. However, when toying around with the following example:
    from mpi4py import MPI
    import numpy as np
    import ctypes as ct

    make_buf = ct.pythonapi.PyBuffer_FromMemory
    make_buf.argtypes = [ct.c_void_p, ct.c_ssize_t]
    make_buf.restype = ct.py_object

    comm = MPI.COMM_WORLD
    SZ = 100

    if comm.rank == 0:
        import pycuda.autoinit
        import pycuda.driver as cuda

        cubuf = cuda.mem_alloc(SZ*8)
        pybuf = cubuf.as_buffer(SZ*8)
        #pybuf = make_buf(long(cubuf), SZ*8)

        cuda.memcpy_htod(cubuf, np.arange(SZ, dtype=np.float64))
        comm.Send([pybuf, MPI.BYTE], dest=1, tag=1)
    else:
        npbuf = np.empty(SZ)
        comm.Recv(npbuf, source=0, tag=1)
        print npbuf
with Open MPI 1.7.3 (run as "mpirun -n 2 python file.py") I find that the
version using cubuf.as_buffer fails with a segmentation fault due to
invalid permissions, whereas the variant using my make_buf hack works as
expected.
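The invalid-permissions fault is consistent with the wrapped buffer starting past the end of the allocation rather than at its base. A minimal host-side sketch of that failure mode, using a ctypes array as a hypothetical stand-in for the device allocation:

```python
import ctypes as ct

size = 16
alloc = (ct.c_char * size)()    # stand-in for the CUDA allocation
base = ct.addressof(alloc)

# What as_buffer should wrap: the span [base, base + size)
good_start, good_end = base, base + size

# What wrapping (base + size, size) actually covers: [base + size, base + 2*size)
bad_start, bad_end = base + size, base + 2 * size

# The buggy buffer begins exactly where the allocation ends, so every
# byte it exposes lies outside the allocation -- hence the fault.
assert bad_start == good_end
```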
I believe the problem is on L1450 of cuda.hpp, where:

    PyBuffer_FromMemory((void *) (get_pointer() + size), size)));

should be:

    PyBuffer_FromMemory((void *) get_pointer(), size)));

(and analogously on L1498); this fixes the above issue. Also, it may be
better to use PyBuffer_FromReadWriteMemory, since PyBuffer_FromMemory
yields a read-only buffer: a writable buffer would permit CUDA
allocations to be on the receiving end of MPI communications as well.
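In the meantime, a writable buffer over a raw address can also be obtained from plain ctypes without going through ctypes.pythonapi. A sketch of the idea, using a host-side ctypes array as a hypothetical stand-in for the device allocation (with real PyCUDA one would map the char array onto int(cubuf) instead):

```python
import ctypes as ct
import struct

# Stand-in for the allocation; its address plays the role of the
# device pointer returned by mem_alloc.
backing = (ct.c_double * 8)()
addr = ct.addressof(backing)

# A ctypes char array mapped onto the raw address exposes a
# *read-write* buffer interface, much like PyBuffer_FromReadWriteMemory
# would, so an MPI Recv could write into it.
buf = (ct.c_char * ct.sizeof(backing)).from_address(addr)

# Writing through the wrapper is visible in the original allocation.
struct.pack_into('d', buf, 0, 42.0)
assert backing[0] == 42.0
```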
Regards, Freddie.
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
