Hi all,

I'm having a problem using multiprocessing with PyCUDA on Linux. I'm probably getting something fairly basic wrong, but I couldn't find any documentation on it. I expected the example below to work (and it does work on Windows), but on Linux it never produces a result: nvcc hangs at 100% CPU usage and never returns:

from numpy import ones, int32
import multiprocessing
import pycuda
import pycuda.autoinit as autoinit
import pycuda.driver as drv
import pycuda.gpuarray as gpuarray
from pycuda.compiler import SourceModule

def doit(x):

    # CUDA kernel: doubles each of the first n elements of x
    code = '''
    __global__ void test(double *x, int n)
    {
     int i = blockIdx.x * blockDim.x + threadIdx.x;
     if(i>=n) return;
     x[i] *= 2.0;
    }
    '''

    mod = SourceModule(code)
    f = mod.get_function('test')
    x = gpuarray.to_gpu(ones(100))
    f(x, int32(100), block=(100,1,1))
    return x.get()

if __name__=='__main__':

#    doit(1)

    pool = multiprocessing.Pool(1)
    result = pool.map(doit, [0])
    print(result)

I'm wondering if it has something to do with the way that multiprocessing uses forking on Linux but not on Windows?
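
To make the forking idea concrete, here is a minimal sketch that does not use PyCUDA at all. It assumes Python 3 (for multiprocessing.get_context) and a platform where the 'fork' start method is available; CREATED_IN_PID, report and run_child are names made up for this illustration. The module-level variable stands in for anything set up at import time in the parent, which in my script would be the CUDA context created by pycuda.autoinit. It only shows that a forked child inherits that state instead of creating its own; it does not prove that this is what makes nvcc hang.

```python
import multiprocessing
import os

# Stand-in for state created at import time in the parent process
# (in the script above, the CUDA context that pycuda.autoinit creates).
CREATED_IN_PID = os.getpid()


def report(conn):
    # Send back (pid that created the module-level state, pid of this process).
    conn.send((CREATED_IN_PID, os.getpid()))
    conn.close()


def run_child(start_method):
    # start_method is e.g. 'fork' (copy the parent process) or
    # 'spawn' (start a fresh interpreter, as on Windows).
    ctx = multiprocessing.get_context(start_method)
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=report, args=(child_conn,))
    proc.start()
    result = parent_conn.recv()
    proc.join()
    return result


if __name__ == '__main__':
    created, running = run_child('fork')
    # Under fork the child reuses state that the parent created,
    # so the creating pid and the running pid are different.
    print('fork: created in', created, 'running in', running)
```

Calling run_child('spawn') instead should, as far as I understand, re-import the module in the child, so the two pids would match; that is closer to what happens on Windows. If inherited state is the culprit, the thing to try would be dropping the pycuda.autoinit import and creating the CUDA context inside doit() itself.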

Thanks for any suggestions!

Btw, I'm on 64-bit Linux with a GTX 280, and I get the same problem on both openSUSE and Ubuntu.


