On Mon, Mar 9, 2009 at 3:25 PM, Andreas Klöckner <[email protected]> wrote:
> On Monday, 09 March 2009, Chris Heuser wrote:
> > Hey everybody! I am new to CUDA and pycuda, but I am working hard to
> > understand.
> > My question is this:
> >
> > Is there a way for me to use multiple python threads in order to run cuda
> > code on multiple GPUs?
> >
> > I have created several threads, and in each I attempt to create a context
> > for a different cuda device, but I am getting an "invalid context" error
> > when I try to copy an array over.
> > Any suggestions?
>
> Can you post a self-contained (minimal) test case for the failure?
>
> Thanks,
> Andreas
>
>
Sure! My apologies in advance: since I do not know what I am doing wrong, I
made the test case as minimal as possible while still reproducing the same
error under the same conditions.
Also, once again, I am still quite new to CUDA...
Here it is:
### Minimal test case that reproduces the error:
import pycuda
import pycuda.driver as cuda
import threading
import numpy
#------------------------------------------------------------------------------#
class gpuThread(threading.Thread):
def __init__(self, ID, someArray):
self.ID = ID
self.dev = cuda.Device(self.ID)
self.cntxt = self.dev.make_context()
self.valArray_gpu = cuda.mem_alloc(someArray.nbytes)
cuda.memcpy_htod(self.valArray_gpu, someArray)
        print 'successful from ID#', self.ID
threading.Thread.__init__(self)
def run(self):
outputArray = aKernel(self.valArray_gpu)
#------------------------------------------------------------------------------#
#------------------------------------------------------------------------------#
def aKernel(inputArray_gpu):
sourceModule = '''
__global__ void aFunc(float * out, float * in)
{
int idx = threadIdx.x;
out[idx] = in[idx] + 6;
}
'''
mod = cuda.SourceModule(sourceModule)
aFunc = mod.get_function("aFunc")
    outputArray = numpy.zeros((1,512), dtype=numpy.float32)  # match the kernel's float type
outputArray_gpu = cuda.mem_alloc(outputArray.nbytes)
aFunc(outputArray_gpu,
inputArray_gpu,
block=(512,1,1))
cuda.memcpy_dtoh(outputArray, outputArray_gpu)
return outputArray
#------------------------------------------------------------------------------#
#initialize:
cuda.init()
#arbitrary array:
someArray = numpy.ones((1,512)).astype(numpy.float32)
#get number of cuda devices
NUM = cuda.Device.count()
#create threads:
gpuThreadList = []
for i in range(NUM):
aGpuThread = gpuThread(i,someArray)
aGpuThread.start()
gpuThreadList.append(aGpuThread)
###############################################################################
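One thing I noticed while paring this down: the context is created in __init__,
which executes in the main thread, while the kernel call happens in run(), which
executes in the worker thread. I am not sure whether a CUDA context carries over
between threads. As a rough analogy (a hypothetical sketch using plain
threading.local, not pycuda; the names Worker and seen_in_run are made up for
illustration), state established in __init__ is not automatically visible in
run():

```python
import threading

# threading.local gives each thread its own view of the same object,
# so a value written from the main thread is invisible in a worker.
local = threading.local()

class Worker(threading.Thread):
    def __init__(self, ID):
        threading.Thread.__init__(self)
        self.ID = ID
        local.ctx = "main-%d" % ID   # written in the MAIN thread
        self.seen_in_run = None

    def run(self):
        # The value written in __init__ does not exist in this thread:
        self.seen_in_run = getattr(local, "ctx", None)
        # Per-thread resources would instead be created here, in run():
        local.ctx = "worker-%d" % self.ID

threads = [Worker(i) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print([t.seen_in_run for t in threads])  # prints [None, None]
```

If contexts behave the same way, the fix might be to move make_context (and the
memcpy) into run(), but I would appreciate confirmation on that.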
Thanks again for your help!
>>>Chris
_______________________________________________
PyCuda mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net