Received from Ashwin Srinath on Wed, Dec 24, 2014 at 02:51:42PM EST:
> On Tue, Dec 23, 2014 at 6:45 PM, Lev Givon <[email protected]> wrote:
>
> > (Not sure if this is more of an mpi4py or a pycuda issue at this point.)
> >
> > I recently tried running a gist I wrote in the past [1] to test
> > communication of data stored in GPU memory with pycuda using mpi4py
> > compiled against OpenMPI 1.8.* (which contains CUDA support). Using the
> > latest revision (9a70e69) compiled against OpenMPI 1.8.4 (which was in
> > turn compiled against CUDA 6.5 on Ubuntu 14.04.1) and installed in a
> > Python 2.7.6 virtualenv along with pycuda 2014.1 (also manually compiled
> > against CUDA 6.5), I was able to run the gist without any problems.
> > However, when I changed line 55 from
> >
> >     x_gpu = gpuarray.arange(100, 200, 10, dtype=np.double)
> >
> > to
> >
> >     x_gpu = gpuarray.to_gpu(np.arange(100, 200, 10, dtype=np.double))
> >
> > the data transfer succeeded but was immediately followed by the
> > following error:
> >
> > [avicenna:32494] *** Process received signal ***
> > [avicenna:32494] Signal: Segmentation fault (11)
> > [avicenna:32494] Signal code: Address not mapped (1)
> > [avicenna:32494] Failing at address: (nil)
> > [avicenna:32494] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x2ba2e8fe2340]
> > [avicenna:32494] [ 1] /usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x1f60f5)[0x2ba2fd19b0f5]
> > [avicenna:32494] [ 2] /usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x20470b)[0x2ba2fd1a970b]
> > [avicenna:32494] [ 3] /usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x17ac02)[0x2ba2fd11fc02]
> > [avicenna:32494] [ 4] /usr/lib/x86_64-linux-gnu/libcuda.so.1(cuStreamDestroy_v2+0x52)[0x2ba2fd0eeb32]
> > [avicenna:32494] [ 5] /opt/openmpi-1.8.4/lib/libmpi.so.1(mca_common_cuda_fini+0x1c3)[0x2ba2f57718a3]
> > [avicenna:32494] [ 6] /opt/openmpi-1.8.4/lib/libmpi.so.1(+0xf5e3e)[0x2ba2f57aee3e]
> > [avicenna:32494] [ 7] /opt/openmpi-1.8.4/lib/libopen-pal.so.6(mca_base_component_close+0x19)[0x2ba2f6122099]
> > [avicenna:32494] [ 8] /opt/openmpi-1.8.4/lib/libopen-pal.so.6(mca_base_components_close+0x42)[0x2ba2f6122112]
> > [avicenna:32494] [ 9] /opt/openmpi-1.8.4/lib/libmpi.so.1(+0xd7515)[0x2ba2f5790515]
> > [avicenna:32494] [10] /opt/openmpi-1.8.4/lib/libopen-pal.so.6(mca_base_framework_close+0x63)[0x2ba2f612b3c3]
> > [avicenna:32494] [11] /opt/openmpi-1.8.4/lib/libopen-pal.so.6(mca_base_framework_close+0x63)[0x2ba2f612b3c3]
> > [avicenna:32494] [12] /opt/openmpi-1.8.4/lib/libmpi.so.1(ompi_mpi_finalize+0x56d)[0x2ba2f573693d]
> > [avicenna:32494] [13] /home/lev/Work/virtualenvs/PYTHON/lib/python2.7/site-packages/mpi4py/MPI.so(+0x2e694)[0x2ba2f53b2694]
> > [avicenna:32494] [14] python(Py_Finalize+0x1a6)[0x42fb0f]
> > [avicenna:32494] [15] python(Py_Main+0xbed)[0x46ac10]
> > [avicenna:32494] [16] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x2ba2e9211ec5]
> > [avicenna:32494] [17] python[0x57497e]
> > [avicenna:32494] *** End of error message ***
> >
> > I also tried replacing line 55 with
> >
> >     x_gpu = gpuarray.zeros(10, dtype=np.double)
> >     x_gpu.set(np.arange(100, 200, 10, dtype=np.double))
> >
> > which resulted in no error, and
> >
> >     x_gpu = gpuarray.empty(10, dtype=np.double)
> >     x_gpu.set(np.arange(100, 200, 10, dtype=np.double))
> >
> > which resulted in the same error as mentioned earlier.
> >
> > Any ideas as to what could be going on?
> >
> > [1] https://gist.github.com/8514d3456a94a6c73e6d
>
> Hi Lev,
>
> This code worked for me (even after changing line 55 to use
> 'gpuarray.to_gpu(np.arange...'). I'm on an environment very similar to
> yours.
Did the code run without error on your system after modifying line 55, even
without MPI.Finalize() added to the end of the code?

> Just a couple of suggestions:
>
> 1. Insert MPI.Finalize() at the end of your code.
> 2. If you're not already, pass the parameter '--mca pml ob1' to your
>    mpiexec command.

Adding the call to MPI.Finalize() made the error go away even when using
gpuarray.to_gpu(); passing the extra mca parameter didn't appear to have any
effect. My understanding is that the call to MPI.Finalize() should be
registered automatically to run when the processes exit; this makes me wonder
whether my explicitly registering the pycuda method that cleans up the current
context is causing problems. I'll see what the folks on the mpi4py list have
to say.

Thanks,
--
Lev Givon
Bionet Group | Neurokernel Project
http://www.columbia.edu/~lev/
http://lebedov.github.io/
http://neurokernel.github.io/

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
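For context, the following is a rough, hypothetical sketch of the kind of
mpi4py/pycuda setup discussed above, not the actual code from the linked gist;
the per-rank device selection and the elided transfer step are assumptions. It
illustrates the atexit-registered context cleanup mentioned in the last message
and the explicit MPI.Finalize() that made the error go away.

    # Hypothetical sketch only -- not the code from the gist referenced above.
    import atexit

    import numpy as np
    import pycuda.driver as drv
    import pycuda.gpuarray as gpuarray
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # Create a per-rank CUDA context and register its cleanup with atexit,
    # as described in the message above (assumed; details are in the gist).
    drv.init()
    ctx = drv.Device(rank % drv.Device.count()).make_context()
    atexit.register(ctx.pop)

    # The "line 55" variant that triggered the segfault at shutdown:
    x_gpu = gpuarray.to_gpu(np.arange(100, 200, 10, dtype=np.double))
    # The original variant, which did not:
    # x_gpu = gpuarray.arange(100, 200, 10, dtype=np.double)

    # ... GPU-to-GPU Send/Recv between ranks as in the gist ...

    # Finalizing MPI explicitly lets OpenMPI's CUDA teardown (the
    # cuStreamDestroy call in the backtrace) run while the CUDA context is
    # still current, rather than during interpreter shutdown after the
    # atexit-registered cleanup may already have popped it.
    MPI.Finalize()

With Ashwin's second suggestion applied, such a script would be launched with
something like 'mpiexec --mca pml ob1 -n 2 python script.py' (the process
count and script name here are placeholders).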
