[PyCUDA] Slicing

2011-02-23 Thread Thomas Wiecki
Hi, I want to run element-wise computations on different parts of an array. Loading each part of the array into device memory when needed turned out to take a lot of time and did not really speed things up compared to the CPU. Instead, I want to load the data array into device memory once and provide pointers

Re: [PyCUDA] Slicing

2011-06-28 Thread Thomas Wiecki
24, 2011 at 10:29 AM, Andreas Kloeckner li...@informa.tiker.net wrote: On Thu, 24 Feb 2011 06:12:22 -0500, Thomas Wiecki thomas_wie...@brown.edu wrote: Hi Lev, thanks -- I wasn't aware GPUArray supported any slicing at all. However, my slice index is boolean which still does not seem

Re: [PyCUDA] Double is not supported. Demoting to float

2011-06-29 Thread Thomas Wiecki
at 11:22 AM, Thomas Wiecki thomas_wie...@brown.edu wrote: Hi, I always got these warnings while compiling, but now it seems they lead to a compile error: pycuda.driver.CompileError: nvcc said it demoted types in source code it compiled--this is likely not what you want. My function

[PyCUDA] ElementWise kernel only operates on half of the array: Bug?

2011-06-29 Thread Thomas Wiecki
This is with the version from the trunk (7804dc6d1b40b506b02a5f7a0b7bde8771f1446c). import pycuda.driver as cuda import pycuda.compiler import pycuda.autoinit import pycuda.gpuarray as gpuarray from pycuda.elementwise import ElementwiseKernel zero_kernel = ElementwiseKernel( float *out,

Re: [PyCUDA] ElementWise kernel only operates on half of the array: Bug?

2011-06-29 Thread Thomas Wiecki
...@informa.tiker.net wrote: On Wed, 29 Jun 2011 12:56:09 -0400, Thomas Wiecki thomas_wie...@brown.edu wrote: This is with the version from the trunk (7804dc6d1b40b506b02a5f7a0b7bde8771f1446c). import pycuda.driver as cuda import pycuda.compiler import pycuda.autoinit import pycuda.gpuarray

Re: [PyCUDA] ElementWise kernel only operates on half of the array: Bug?

2011-06-30 Thread Thomas Wiecki
On Thu, Jun 30, 2011 at 12:02 AM, Andreas Kloeckner li...@informa.tiker.net wrote: On Wed, 29 Jun 2011 15:21:13 -0400, Thomas Wiecki thomas_wie...@brown.edu wrote: As regards parameter input checking, would it be possible to have a switch for type-checking as an argument to ElementWise

Re: [PyCUDA] curandom, access to generator

2011-12-08 Thread Thomas Wiecki
On Thu, Dec 8, 2011 at 3:37 PM, Thomas Wiecki thomas_wie...@brown.edu wrote: Hi, I want to simulate many noisy Brownian motion particles. So for each particle I have to sum up random numbers repeatedly. I figured I'd create a function that simulates one particle movement in CUDA C

Re: [PyCUDA] curandom, access to generator

2011-12-08 Thread Thomas Wiecki
curand_normal_double().  If you compile for an architecture that does not support doubles, you get this harmless warning. On Dec 8, 2011, at 8:41 PM, Thomas Wiecki wrote: Thanks, that worked like a charm! I am trying to save back the state of the generator, like this (copied it from

Re: [PyCUDA] curandom, access to generator

2011-12-08 Thread Thomas Wiecki
:37 PM, Thomas Wiecki thomas_wie...@brown.edu wrote: Hi, I want to simulate many noisy Brownian motion particles. So for each particle I have to sum up random numbers repeatedly. I figured I'd create a function that simulates one particle movement in CUDA C and import it to pycuda via

Re: [PyCUDA] curandom, access to generator

2011-12-09 Thread Thomas Wiecki
The problem is that I don't want an array of random numbers in the end (as your code does), but want to pass generators to a cuda function that then simulates a stochastic process. So I need access to the curandState. On Fri, Dec 9, 2011 at 5:38 AM, Tomasz Rybak bogom...@post.pl wrote: Excuse me

Re: [PyCUDA] curandom, access to generator

2011-12-09 Thread Thomas Wiecki
() creates the maximum number of curandStates? What happens when I call my function with idx being greater than this number? On Fri, Dec 9, 2011 at 8:12 AM, Tomasz Rybak bogom...@post.pl wrote: On Fri, 2011-12-09 at 08:01 -0500, Thomas Wiecki wrote: The problem is that I don't want

Re: [PyCUDA] curandom, access to generator

2011-12-09 Thread Thomas Wiecki
Is there a variable that will tell me the number of available threads? On Fri, Dec 9, 2011 at 11:36 AM, Tomasz Rybak bogom...@post.pl wrote: On Fri, 2011-12-09 at 11:06 -0500, Thomas Wiecki wrote: That does seem to work. It's actually what I initially thought of doing (but didn't know

Re: [PyCUDA] curandom, access to generator

2011-12-12 Thread Thomas Wiecki
will try my hands at a doc patch but am not sure when I can get to it. Thomas On Sun, Dec 11, 2011 at 11:18 PM, Andreas Kloeckner li...@informa.tiker.net wrote: On Thu, 8 Dec 2011 20:57:22 -0500, Thomas Wiecki thomas_wie...@brown.edu wrote: OK, but it's bailing on this warning, any way I can

[PyCUDA] Error with uint32

2011-12-13 Thread Thomas Wiecki
When running the GPUArray unittest, almost all of them (37/45) fail. The most common error message is: ValueError: unable to map dtype 'uint32' Not sure what the problem is. I am happy to provide the full log on request. Most recent pycuda under kubuntu 11.10.

Re: [PyCUDA] Error with uint32

2011-12-14 Thread Thomas Wiecki
DTYPE_TO_NAME[dtype] *** KeyError: dtype('uint32') !?, Thomas On Tue, Dec 13, 2011 at 8:49 PM, Thomas Wiecki thomas_wie...@brown.edu wrote: I updated the submodules and reinstalled, but am still getting the same errors. On Tue, Dec 13, 2011 at 7:09 PM, Andreas Kloeckner li...@informa.tiker.net wrote

Re: [PyCUDA] Error with uint32

2011-12-14 Thread Thomas Wiecki
1.5.1, the ubuntu oneiric default it seems: http://packages.ubuntu.com/oneiric/python-numpy On Wed, Dec 14, 2011 at 9:33 AM, Andreas Kloeckner li...@informa.tiker.net wrote: On Wed, 14 Dec 2011 09:15:19 -0500, Thomas Wiecki thomas_wie...@brown.edu wrote: This is getting very weird. I went

[PyCUDA] FYI Nvidia Opens CUDA Platform, Releases Compiler Source Code

2011-12-14 Thread Thomas Wiecki
http://developer.nvidia.com/content/cuda-platform-source-release

Re: [PyCUDA] curandom, access to generator

2011-12-18 Thread Thomas Wiecki
at it longer I could figure out what is going on. Also, is there any way using this module to specify the number of states you want? On Mon, Dec 12, 2011 at 5:45 AM, Thomas Wiecki thomas_wie...@brown.edu wrote: Hi Andreas, yes, the warning is still an issue. I just removed the raised exception

Re: [PyCUDA] [Robert Kern] Re: [Numpy-discussion] dtype comparison, hash

2012-01-12 Thread Thomas Wiecki
...@informa.tiker.net wrote: On Fri, 30 Dec 2011 20:03:44 +0100, Thomas Wiecki thomas_wie...@brown.edu wrote: Hi Andreas, glad to see that you followed up on this issue. I will try to boil it down but noticed that when I was investigating the issue back then I could not easily reproduce

Re: [PyCUDA] [Robert Kern] Re: [Numpy-discussion] dtype comparison, hash

2012-01-12 Thread Thomas Wiecki
Seems like it is a 32 bit bug, I replicated it on another 32 bit machine and filed a bug report: http://projects.scipy.org/numpy/ticket/2017 As for a temporary fix, I also register uintp32 (and intp32 for good luck) to DTYPES_TO_CTYPES which seems to do the trick (on 64 bit it will just overwrite
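A minimal sketch of the kind of workaround described in this message, assuming a plain dictionary keyed by numpy dtype. The names `dtype_to_ctype` and `register_dtype` and the chosen C type names are hypothetical stand-ins for illustration, not PyCUDA's actual registry. The idea is to register the pointer-sized dtypes explicitly in addition to the fixed-width ones, so that a lookup still succeeds if the two hash differently on an affected 32-bit build.

```python
import numpy as np

# Hypothetical stand-in for a dtype-to-C-type registry; the real table in
# PyCUDA and its helper functions may look different.
dtype_to_ctype = {}


def register_dtype(dtype, c_name):
    """Store a C type name under the normalized numpy dtype."""
    dtype_to_ctype[np.dtype(dtype)] = c_name


# Fixed-width integer types, registered as usual (C names are illustrative).
register_dtype(np.int32, "int")
register_dtype(np.uint32, "unsigned int")
register_dtype(np.int64, "long long")
register_dtype(np.uint64, "unsigned long long")

# Workaround: also register the pointer-sized types, choosing the C name by
# item size. Where intp/uintp already compare and hash equal to a fixed-width
# dtype, this simply overwrites that entry with the same name.
by_size = {4: ("int", "unsigned int"), 8: ("long long", "unsigned long long")}
signed_name, unsigned_name = by_size[np.dtype(np.intp).itemsize]
register_dtype(np.intp, signed_name)
register_dtype(np.uintp, unsigned_name)
```

On an unaffected build the extra registrations are redundant, which is why they are harmless to leave in place.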

Re: [PyCUDA] curandom, access to generator

2012-01-28 Thread Thomas Wiecki
on why that might be or how to investigate this further? Thanks, Thomas On Thu, Jan 12, 2012 at 3:44 PM, Tomasz Rybak tomasz.ry...@post.pl wrote: On Sun, 2011-12-18 at 20:25 +0100, Thomas Wiecki wrote: I think it just allocates the maximum number. Previously I wondered how I could find

Re: [PyCUDA] Error with uint32

2012-02-01 Thread Thomas Wiecki
Hi Peter, this has been fixed (or rather, worked around) a little more cleanly in the version in the git repo, to which you might want to upgrade. Thomas On Wed, Feb 1, 2012 at 12:28 PM, Peter Rösch peter.roe...@hs-augsburg.de wrote: I'm as confused as you. Can you go up the call

Re: [PyCUDA] [Robert Kern] Re: [Numpy-discussion] dtype comparison, hash

2012-03-12 Thread Thomas Wiecki
: = fixed * milestone: Unscheduled = 1.7.0 Comment: Fixed in 39029f5..bb7e5e2 On Tue, Jan 17, 2012 at 2:29 AM, Andreas Kloeckner li...@informa.tiker.net wrote: Great, thanks! Andreas On Tue, 17 Jan 2012 08:07:19 +0100, Thomas Wiecki thomas_wie...@brown.edu wrote: Not following that list

Re: [PyCUDA] Histograms with PyCUDA

2012-04-06 Thread Thomas Wiecki
Do you mind posting the final code here for future reference (as a gist perhaps)? Also, another optimization might be to remove the (slow) sqrt() in each distance calculation and then do sqrt() of the bin labels in the reduction step. On Fri, Apr 6, 2012 at 3:56 AM, Francisco Villaescusa Navarro
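The sqrt() suggestion in this message can be illustrated on the CPU with NumPy. This is only a sketch under assumptions: the function name `distance_histogram_squared` and the sample data are made up for illustration, and the code from the thread is not reproduced here. Squared distances are binned against squared bin edges, so no square root is taken per pair; the radial bin labels stay as they were.

```python
import numpy as np


def distance_histogram_squared(points, r_edges):
    """Histogram pairwise distances without taking a sqrt per pair.

    Squared distances are compared against squared bin edges, which selects
    the same bins as comparing distances against the edges themselves.
    """
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)   # squared pairwise distances
    iu = np.triu_indices(len(points), k=1)      # count each pair once
    counts, _ = np.histogram(d2[iu], bins=np.asarray(r_edges) ** 2)
    return counts


# Illustrative data: four 2-D points and radial bin edges.
points = np.array([[0, 0], [1, 0], [0, 2], [3, 4]])
r_edges = np.array([0.0, 1.5, 2.5, 3.5, 5.5])
counts = distance_histogram_squared(points, r_edges)
```

Because squaring is monotonic for non-negative values, the counts match a histogram of the true distances over `r_edges`, up to floating-point ties exactly on a bin edge.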

Re: [PyCUDA] undefined symbol: cuMemAllocPitch_v2

2012-05-22 Thread Thomas Wiecki
On Tue, May 22, 2012 at 6:31 PM, Andreas Kloeckner li...@informa.tiker.net wrote: On Tue, 22 May 2012 18:21:56 -0400, Thomas Wiecki thomas_wie...@brown.edu wrote: On Tue, May 22, 2012 at 4:40 PM, Andreas Kloeckner li...@informa.tiker.net wrote: On Tue, 22 May 2012 15:56:33 -0400, Thomas

Re: [PyCUDA] curandom, access to generator

2012-05-22 Thread Thomas Wiecki
On Mon, Jan 30, 2012 at 8:21 PM, Andreas Kloeckner li...@informa.tiker.net wrote: On Sat, 28 Jan 2012 18:21:29 -0500, Thomas Wiecki thomas_wie...@brown.edu wrote: I am currently revisiting this but having some problems with the random number generator. generator.generators_per_block is 512

[PyCUDA] 'function_param_set_pre_v4' is not defined

2012-05-23 Thread Thomas Wiecki
Hi, I get: Traceback (most recent call last): File "sim_drift_gpu.py", line 4, in <module> import pycuda.gpuarray as gpuarray File "/usr/local/lib/python2.7/dist-packages/pycuda-2011.2.2-py2.7-linux-i686.egg/pycuda/gpuarray.py", line 3, in <module> import pycuda.elementwise as elementwise

[PyCUDA] CodePy: No registered converter

2012-05-25 Thread Thomas Wiecki
Sorry for posting to PyCuda, not sure where else this would fit. The CodePy unittests give me: == ERROR: Failure: TypeError (No registered converter was able to produce a C++ rvalue of type unsigned long long from this Python

Re: [PyCUDA] Compiling thrust code in pyCUDA

2012-05-26 Thread Thomas Wiecki
I can't get the CodePy Thrust example to work: Traceback (most recent call last): File "thrust_demo.py", line 95, in <module> c = module.host_entry(b) TypeError: No registered converter was able to produce a C++ rvalue of type unsigned int from this Python object of type DeviceAllocation I

Re: [PyCUDA] Compiling thrust code in pyCUDA

2012-05-26 Thread Thomas Wiecki
involve editing your aksetup defaults file. - bryan On May 26, 2012, at 9:33 AM, Thomas Wiecki thomas_wie...@brown.edu wrote: I can't get the CodePy Thrust example to work: Traceback (most recent call last): File "thrust_demo.py", line 95, in <module> c = module.host_entry(b) TypeError

Re: [PyCUDA] Compiling thrust code in pyCUDA

2012-05-27 Thread Thomas Wiecki
, Apostolis 2012/5/26 Andreas Kloeckner li...@informa.tiker.net On Sat, 26 May 2012 14:59:28 -0400, Thomas Wiecki thomas_wie...@brown.edu wrote: I tried using the shipped version (bpl_subset) but couldn't get it to work somehow (how is one supposed to get the lib files?). I now set

Re: [PyCUDA] Compiling thrust code in pyCUDA

2012-05-27 Thread Thomas Wiecki
and fix it as soon as I have more time on my hands. Thanks for the help anyway. Apostolis 2012/5/27 Thomas Wiecki thomas_wie...@brown.edu On Sun, May 27, 2012 at 1:25 PM, Apostolis Glenis apostgle...@gmail.com wrote: After searching Google I found no -lboost_python-gcc43-mt so I suspect

[PyCUDA] Array larger than number of threads

2012-05-29 Thread Thomas Wiecki
Hi, I have seen the following idiom used a couple of times: const int tidx = blockIdx.x*blockDim.x + threadIdx.x; const int delta = blockDim.x*gridDim.x; curandState local_state = global_state[tidx]; for (int idx = tidx; idx < n; idx += delta) {
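The idiom quoted in this message is commonly called a grid-stride loop: each thread starts at its global index and advances by the total number of launched threads, so an array longer than the grid is still fully covered. The pure-Python sketch below only simulates the index arithmetic on the host. The function name `grid_stride_indices` and the launch sizes are illustrative assumptions, and no GPU or random-number state is involved.

```python
def grid_stride_indices(n, block_dim, grid_dim):
    """Return, for each global thread index, the array indices it visits."""
    delta = block_dim * grid_dim  # total number of launched threads
    visits = {}
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            tidx = block_idx * block_dim + thread_idx
            # Mirrors: for (int idx = tidx; idx < n; idx += delta)
            visits[tidx] = list(range(tidx, n, delta))
    return visits


# Illustrative launch: 2 blocks of 2 threads covering a 10-element array.
visits = grid_stride_indices(n=10, block_dim=2, grid_dim=2)
```

With these sizes, thread 0 handles indices 0, 4 and 8, and every index below `n` is visited by exactly one thread. In the quoted kernel this also means each thread can keep using its own `curandState` for all the elements it processes.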

[PyCUDA] curandom not initializing all generators?

2012-06-07 Thread Thomas Wiecki
Hi, the curandom Generator class initializes generators_per_block number of generators. This is the relevant code: @property @memoize_method def generators_per_block(self): return min(kernel.max_threads_per_block for kernel in self._kernels()) On my machine

Re: [PyCUDA] Compiling thrust code in pyCUDA

2012-06-08 Thread Thomas Wiecki
As this seems to be the codepy/cgen thread I thought I'd tack this on here. I want to port thrust code that is a little bit more involved than the sort example. Namely the example code for summary statistics ( http://code.google.com/p/thrust/source/browse/examples/summary_statistics.cu ) I think