[PyCUDA] LaunchError: cuModuleLoadDataEx failed: launch failed -

2012-05-14 Thread Eli Stevens (Gmail)
I've seen this error a few times, but it's not reproducible. Can anyone give any insight into what might be going wrong? Traceback (most recent call last): File "/home/elis/edit/work/dev/mms/common/util/threads.py", line 219, in run mod = cudahelper.compileSourceModule(kernel.code_str,

Re: [PyCUDA] install issue on OS X 10.7.3

2012-05-01 Thread Eli Stevens (Gmail)
I recall the PCI bus ID behavior changing between... 4.0 and 4.1 maybe? It swapped from being a property-like thing to being a callable, with only the underlying CUDA version changing (no PyCUDA update). What version of CUDA are you using? Eli On Tue, May 1, 2012 at 11:50 AM, Serge Rey

Re: [PyCUDA] Any interest in a PyCUDA stack exchange?

2012-03-15 Thread Eli Stevens (Gmail)
Slightly off-topic: what's the benefit of having a dedicated stack exchange, vs. using Stack Overflow? I can only think of downsides related to fracturing the existing set of answers and responders, esp. for ones that aren't directly related to PyCUDA. On topic: I don't see a reason to not just

Re: [PyCUDA] Event lifetime, cross-thread use

2012-03-13 Thread Eli Stevens (Gmail)
Cool, thanks. As a followup, the threads are working fine, but we have an off-by-one error on some array accesses, which is probably related to why sometimes things terminate and sometimes don't (uninitialized/arbitrary memory). Thanks for the help! Eli On Fri, Mar 9, 2012 at 5:50 PM, Andreas

Re: [PyCUDA] Event lifetime, cross-thread use

2012-03-09 Thread Eli Stevens (Gmail)
Thanks for all of the pointers. I'm going to hold off on MPI for now; I'm hesitant to add additional dependencies unless they're really needed. The threading solution as outlined here: http://stackoverflow.com/questions/5904872/python-multiprocessing-with-pycuda Seems to be working well,

Re: [PyCUDA] Event lifetime, cross-thread use

2012-03-09 Thread Eli Stevens (Gmail)
, Eli Stevens (Gmail) wickedg...@gmail.com wrote: Thanks for all of the pointers.  I'm going to hold off on MPI for now; I'm hesitant to add additional dependencies unless they're really needed. The threading solution as outlined here: http://stackoverflow.com/questions/5904872/python

[PyCUDA] Event lifetime, cross-thread use

2012-03-08 Thread Eli Stevens (Gmail)
Hello, I was wondering if the following will work: - Main thread spins up thread B. - Thread B creates a context, invokes a kernel, and creates an event. - Event is saved. - Thread B pops the context (kernel is still running at this point) and finishes. - Main thread join()s B and grabs the

Re: [PyCUDA] Event lifetime, cross-thread use

2012-03-08 Thread Eli Stevens (Gmail)
...@informa.tiker.net wrote: Hi Eli, On Thu, 8 Mar 2012 09:33:21 -0800, Eli Stevens (Gmail) wickedg...@gmail.com wrote: I was wondering if the following will work: - Main thread spins up thread B. - Thread B creates a context, invokes a kernel, and creates an event

Re: [PyCUDA] Fwd: Re: Get sublist with largest length

2011-08-26 Thread Eli Stevens (Gmail)
On Thu, Aug 25, 2011 at 10:33 PM, Francis fccaba...@gmail.com wrote: I could make use of the tens of thousands of threads in CUDA to get the length of each substring/subarray. The python list structure stores the length of the list already (it increments / decrements with appends / pops, etc.),

Re: [PyCUDA] Fwd: Re: Get sublist with largest length

2011-08-25 Thread Eli Stevens (Gmail)
On Thu, Aug 25, 2011 at 2:21 AM, Francis fccaba...@gmail.com wrote: Thanks for the replies @Eli and @David. I suppose given a 'small' enough list of sub-lists doing what I need to do in the host is good enough instead of moving and doing the task in the device. I am just looking out for

Re: [PyCUDA] Fwd: Re: Get sublist with largest length

2011-08-24 Thread Eli Stevens (Gmail)
On Wed, Aug 24, 2011 at 8:20 AM, David Mertens dcmertens.p...@gmail.com wrote: Ah, I see. I think you might be trying to fit a round peg into a square hole, so to speak. I agree. At the very least, you're going to need a loop over each list, calling len() on it (to then stick it into an array
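
A minimal host-side sketch of the loop described in this reply, assuming the data is a plain Python list of sub-lists. The names `sublists` and `longest_sublist` are illustrative and do not come from the thread; the standard-library `array` module stands in for whatever array type the original discussion had in mind.

```python
from array import array


def longest_sublist(sublists):
    """Return (index, length) of the longest sub-list, or (None, 0) if there are none."""
    # len() on a Python list is O(1), so this is a single pass over the outer list.
    lengths = array("i", (len(s) for s in sublists))
    if not lengths:
        return None, 0
    # max() keeps the first maximal element, so ties resolve to the lowest index.
    best = max(range(len(lengths)), key=lengths.__getitem__)
    return best, lengths[best]


if __name__ == "__main__":
    sublists = [[1, 2], [3, 4, 5, 6], [7]]
    print(longest_sublist(sublists))
```

Because the lengths are already stored on the host, this pass costs one `len()` call per sub-list, which is the point made in the replies about keeping the work on the host for small inputs.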

[PyCUDA] How do I diagnose a CUDA launch failure due to being out of resources?

2011-07-31 Thread Eli Stevens (Gmail)
This is a sorta-cross-post from stack overflow: http://stackoverflow.com/questions/6892280/how-do-i-diagnose-a-cuda-launch-failure-due-to-being-out-of-resources I'm getting an out-of-resources error when trying to launch a CUDA kernel (through PyCUDA), and I'm wondering if it's possible to get

[PyCUDA] trying to create a 3d texture

2011-07-12 Thread Eli Stevens (Gmail)
I'm trying to turn a 3d numpy array with float32 data into a texture that I can read via tex3D inside of kernel code. I have this in the kernel: texture<float, cudaTextureType3D, cudaReadModeElementType> my_tex; And I have tried both: my_texref = cuda_module.get_texref("my_tex") my_gpu =