Re: [PyCuda] Multi-GPU and Threading

Chris Heuser Mon, 09 Mar 2009 15:07:07 -0700

On Mon, Mar 9, 2009 at 3:25 PM, Andreas Klöckner <[email protected]>wrote:


> On Montag 09 März 2009, Chris Heuser wrote:
> > Hey everybody! I am new to CUDA and pycuda, but I am working hard to
> > understand.
> > My question is this:
> >
> > Is there a way for me to use multiple python threads in order to run cuda
> > code on multiple GPUs?
> >
> > I have created several threads, and in each I attempt to create a context
> > for a different cuda device, but I am getting an "invalid context" error
> > when I try to copy an array over.
> > Any suggestions?
>
> Can you post a self-contained (minimal) test case for the failure?
>
> Thanks,
> Andreas
>
>
Sure! I am sorry in advance, since I do not know what I am doing wrong, I
made the testCase as minimalist as possible while still accomplishing the
same error under the same conditions.

Also, once again I am still quite naive with cuda...

Here it is:

### Minimalist test case to accomplish an error:

import pycuda
import pycuda.driver as cuda
import threading
import numpy

#------------------------------------------------------------------------------#
class gpuThread(threading.Thread):
    def __init__(self, ID, someArray):

        self.ID = ID
        self.dev = cuda.Device(self.ID)
        self.cntxt = self.dev.make_context()

        self.valArray_gpu = cuda.mem_alloc(someArray.nbytes)
        cuda.memcpy_htod(self.valArray_gpu, someArray)

        print 'succesful from ID#',self.ID

        threading.Thread.__init__(self)


    def run(self):
        outputArray = aKernel(self.valArray_gpu)




#------------------------------------------------------------------------------#

#------------------------------------------------------------------------------#
def aKernel(inputArray_gpu):
    sourceModule = '''
__global__ void aFunc(float * out, float * in)
{
    int idx = threadIdx.x;
    out[idx] = in[idx] + 6;
}
'''
    mod = cuda.SourceModule(sourceModule)

    aFunc = mod.get_function("aFunc")

    outputArray = numpy.zeros((1,512))
    outputArray_gpu = cuda.mem_alloc(outputArray.nbytes)

    aFunc(outputArray_gpu,
          inputArray_gpu,
          block=(512,1,1))
    cuda.memcpy_dtoh(outputArray, outputArray_gpu)

    return outputArray
#------------------------------------------------------------------------------#



#initialize:
cuda.init()


#arbitrary array:
someArray = numpy.ones((1,512)).astype(numpy.float32())


#get number of cuda devices
NUM = cuda.Device.count()


#create threads:
gpuThreadList = []
for i in range(NUM):
    aGpuThread = gpuThread(i,someArray)
    aGpuThread.start()
    gpuThreadList.append(aGpuThread)



###############################################################################

Thanks again for your help!

    >>>Chris

_______________________________________________
PyCuda mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net

Re: [PyCuda] Multi-GPU and Threading

Reply via email to