Dear Jackin,

On Sat, 31 Jul 2010 15:05:02 +0900, jackin <jac...@opt.utsunomiya-u.ac.jp> wrote:
> Sorry for the delay, I had some thermal problems with my GPU
> machine.
>
> I finally got PyCUDA working with multiprocessing. As you correctly
> pointed out, I had to call pycuda.driver.init() inside each spawned
> process, because multiprocessing does not share any state between
> processes. So I call init() only inside run() and nowhere else. I am
> not sure whether this explanation is entirely correct, but that is
> how it worked for me.
>
> I am sending you the modified examples/multiple_threads.py that
> works with multiprocessing.
>
> I am not a professional programmer, so I would encourage anyone to
> point out any bad programming practice in this code or anything
> wrong with my statements.
Three comments:

- First, cuda.init() should be idempotent, i.e. there is no harm in
  calling it twice. I'm guessing you can therefore still dynamically
  adapt to the number of CUDA devices in the machine.
- Please keep any PyCUDA-related mail on-list.
- You may post this (and the other--highly interesting, IMO) example
  you sent to the PyCUDA examples wiki yourself--no need to ask me
  first. If it does something useful for you, in all likelihood
  someone else will find it useful, too. To do so, go to
  http://wiki.tiker.net/PyCuda/Examples, enter a new example name,
  and paste your example code.

Thanks for your useful contributions!

Andreas

> # THIS IS MODIFIED TO USE MULTIPROCESSING INSTEAD OF MULTIPLE THREADS
>
> # Derived from a test case by Chris Heuser
> # Also see the FAQ about PyCUDA and threads.
>
> import pycuda
> import pycuda.driver as cuda
> from pycuda.compiler import SourceModule
> import multiprocessing
> import numpy
>
> class GPUProcess(multiprocessing.Process):
>     def __init__(self, number, some_array):
>         multiprocessing.Process.__init__(self)
>         self.number = number
>         self.some_array = some_array
>
>     def run(self):
>         cuda.init()  # INITIALIZE HERE, inside the spawned process
>         self.dev = cuda.Device(self.number)
>         self.ctx = self.dev.make_context()
>
>         self.array_gpu = cuda.mem_alloc(self.some_array.nbytes)
>         cuda.memcpy_htod(self.array_gpu, self.some_array)
>
>         test_kernel(self.array_gpu)
>         print "successful exit from process %d" % self.number
>         self.ctx.pop()
>
>         del self.array_gpu
>         del self.ctx
>
> def test_kernel(input_array_gpu):
>     mod = SourceModule("""
>         __global__ void f(float *out, float *in)
>         {
>             int idx = threadIdx.x;
>             out[idx] = in[idx] + 6;
>         }
>         """)
>     func = mod.get_function("f")
>     # dtype must match the kernel's float, otherwise the copy back
>     # misinterprets the data
>     output_array = numpy.zeros((1, 512), dtype=numpy.float32)
>     output_array_gpu = cuda.mem_alloc(output_array.nbytes)
>     func(output_array_gpu, input_array_gpu, block=(512, 1, 1))
>     cuda.memcpy_dtoh(output_array, output_array_gpu)
>     return output_array
>
> #cuda.init()  # COMMENTED OUT: init() now happens in each process
> some_array = numpy.ones((1, 512), dtype=numpy.float32)
> #num = cuda.Device.count()  # COMMENTED OUT: requires init() first
> num = 2
>
> gpu_process_list = []
> for i in range(num):
>     gpu_process = GPUProcess(i, some_array)
>     gpu_process.start()
>     gpu_process_list.append(gpu_process)
>
> # wait for all GPU processes to finish
> for gpu_process in gpu_process_list:
>     gpu_process.join()
_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda