Re: [PyCUDA] Sum along axis of GPUarray?

2015-05-21 Thread Andreas Kloeckner
Luke Pfister lpfis...@illinois.edu writes: Is there a suggested way to do the equivalent of np.sum along a particular axis for a high-dimensional GPUarray? I saw that this was discussed in 2009, before GPUarrays carried stride information. Hand-writing a kernel is probably still your best

Re: [PyCUDA] Sum along axis of GPUarray?

2015-05-21 Thread Andreas Kloeckner
Jerome Kieffer jerome.kief...@esrf.fr writes: On Thu, 21 May 2015 07:59:35 -0400 Andreas Kloeckner li...@weasel.tiker.net wrote: Luke Pfister lpfis...@illinois.edu writes: Is there a suggested way to do the equivalent of np.sum along a particular axis for a high-dimensional GPUarray

Re: [PyCUDA] Question about managing contexts

2015-05-11 Thread Andreas Kloeckner
Alex Park a...@nervanasys.com writes: Thank you for the response. As a followup question, I was looking at the underlying code for ipc_mem_handle, and it seems like when a handle is deleted, it tries to do a mem_free on the underlying device pointer. So could there not be a situation as

Re: [PyCUDA] Question about managing contexts

2015-05-11 Thread Andreas Kloeckner
Alex Park a...@nervanasys.com writes: Not sure if it's sufficiently tested for other peoples' usage, but deleting /src/cpp/cuda.hpp: line 1624 seemed to solve my problems. Logic here being that the memory will be freed inside the process that allocated the memory when the object from

Re: [PyCUDA] How to debug cuMemcpyDtoH failed

2015-05-08 Thread Andreas Kloeckner
Lu, Xinghua xing...@pitt.edu writes: I am new to pyCuda, and I would appreciate your help in advance. I was able to write a few short pyCuda programs but ran into a roadblock with the one at hand. The code snippet is as follows: tumorLnFScore = np.zeros((nTumorMutGenes,

Re: [PyCUDA] Question about managing contexts

2015-05-01 Thread Andreas Kloeckner
Alex Park a...@nervanasys.com writes: Hi, I'm trying to use multiple gpus with mpi and ipc handles instead of the built-in mpi primitives for p2p communication. I think I'm not quite understanding how contexts should be managed. For example, I have two versions of a toy example to try out

Re: [PyCUDA] prepare function arguments

2015-04-19 Thread Andreas Kloeckner
Ananth Sridharan ana...@umd.edu writes: Can someone shed some light on the arguments for the prepare used by pycuda? I have been unable to find a set of examples to help understand what the arguments are supposed to look like. On the official website,

Re: [PyCUDA] PyCuda with Cuda 7.0 on Windows 8.1 installation issues, no module named compyte.dtypes

2015-04-07 Thread Andreas Kloeckner
Jannes Nagel jannes.j.na...@gmail.com writes: Hi! I just installed pycuda on my system. I have Windows 8.1 and a GTX970 in my Notebook. Therefore I am using Cuda 7 as the only compatible Cuda Version with my system. I ran into 2 problems and I hope someone can help me. 1: If I enable

Re: [PyCUDA] multiple source modules

2015-03-30 Thread Andreas Kloeckner
Jerome Kieffer jerome.kief...@esrf.fr writes: On Mon, 30 Mar 2015 12:46:42 -0400 Ananth Sridharan ana...@umd.edu wrote: I have a simulation code which requires the use of multiple kernels. Each of these kernels (global functions) needs to call a common set of device functions. To organize

Re: [PyCUDA] PyCUDA test_cumath.py fails on cosh with complex number

2015-03-03 Thread Andreas Kloeckner
that Andreas Kloeckner provides on his website. (http://mathema.tician.de/software/pycuda/) Running the tests provided by that ZIP file goes all well except for the test_cumath.py file. I receive the following error: E AssertionError: (2.3841858e-06, 'cosh', type 'numpy.complex64') E assert built

Re: [PyCUDA] gpuarray functions in sourcemodules

2015-02-20 Thread Andreas Kloeckner
Fil Peters fil.pet...@yandex.com writes: Thanks for the answer, it is a pity that it is not possible to use these functions, especially since it also seems not possible to use the cublas functions in the source modules. In order to be able to use the gpu array functions in a large loop

Re: [PyCUDA] Precompiling kernels?

2015-02-19 Thread Andreas Kloeckner
samie abdul fas...@yahoo.de writes: Hi, is it possible to precompile the invoked kernels beforehand? My code makes use of several CUDA kernels, which are basically called within a fit function. Profiling the code with cProfile yields: 42272 function calls (42228 primitive calls) in 1.662

Re: [PyCUDA] gpuarray functions in sourcemodules

2015-02-17 Thread Andreas Kloeckner
Fil Peters fil.pet...@yandex.com writes: Hello, I am just new to pycuda and started testing it. I was wondering if it is possible to use the gpuarray functions in a sourcemodule. For example, I was trying to convert the following code into a pycuda sourcemodule: numpy code:

Re: [PyCUDA] Newbie - a book?

2015-02-14 Thread Andreas Kloeckner
Alessandro, Alessandro Barracco bomastu...@gmail.com writes: Hi all, I'm a newbie to CUDA and, looking for Python, found two alternatives: Anaconda and pyCUDA. I decided to use pyCUDA but I need to read a good book to understand CUDA. I found several books on the topic but it seems that each use

[PyCUDA] ARRAY'15: Workshop on Libraries, Languages and Compilers for Array Programming

2015-02-09 Thread Andreas Kloeckner
Hi all, I would like to draw your attention to a workshop that might be of interest to at least some of you: http://www.sable.mcgill.ca/array/ The point of the workshop is to provide a forum to discuss tools, abstractions, and languages for high-performance computation on arrays. Much of what

Re: [PyCUDA] (no subject)

2015-01-19 Thread Andreas Kloeckner
David A. Markowitz david.a.markow...@gmail.com writes: Many thanks Andreas, I've solved the problem now. While digging through the compiler.py code, I noticed a check for the PYCUDA_DEFAULT_NVCC_FLAGS environment variable, which is then passed to nvcc. Ultimately I was able to solve my

Re: [PyCUDA] (no subject)

2015-01-19 Thread Andreas Kloeckner
David A. Markowitz david.a.markow...@gmail.com writes: Thanks again, Andreas. I'm really looking forward to getting started with PyCUDA. Unfortunately, I've already tried your suggested approach (updating nvcc.profile with NVVMIR_LIBRARY_DIR = /usr/local/cuda-6.5/nvvm/libdevice, which

Re: [PyCUDA] (no subject)

2015-01-18 Thread Andreas Kloeckner
David A. Markowitz david.a.markow...@gmail.com writes: Hi, thanks for the quick reply (and good advice!). I wiped my cuda 6.5 installation and reinstalled from scratch. nvcc now works when called from the command line on simple CUDA samples. It compiles for my GPU's architecture (3.5) by

Re: [PyCUDA] Error running PyCUDA Tests

2015-01-17 Thread Andreas Kloeckner
David A. Markowitz david.a.markow...@gmail.com writes: Hi, I just installed PyCUDA, but test_driver.py crashes with the following error: CompileError: nvcc compilation of /tmp/tmpNht4bp/kernel.cu failed [command: nvcc --cubin -arch sm_35

Re: [PyCUDA] pycuda and intersection of list (or sets) of strings

2014-12-21 Thread Andreas Kloeckner
Luigi, here are a few problems with your approach: - The contents of your SourceModule is not valid C (as in, C the programming language) - 'set' is a Python data structure. PyCUDA will not magically swap out the code of 'set' and execute its operations on the GPU. - Working with arrays of

Re: [PyCUDA] pycuda and intersection of list (or sets) of strings

2014-12-21 Thread Andreas Kloeckner
Luigi Assom luigi.as...@gmail.com writes: Hello Andreas, thank you for your feedback: Which prerequisite must have a data structure to be good for GPU? Should I allocate exact size of memory for each array ? I hate to say it, but let me just state two facts: (1) There's no canned

Re: [PyCUDA] Running test_driver.py, Import Error: cannot import name intern

2014-12-20 Thread Andreas Kloeckner
Donald Osmeyer donald.osme...@outlook.com writes: I just installed Ubuntu 14.04, the Nvidia driver 340.29, cuda version 6.5.12. I tried to install pycuda-2014.1 using the instructions found at http://wiki.tiker.net/PyCuda/Installation/Linux/Ubuntu Everything seems to install fine. In

Re: [PyCUDA] question about memcpy_atod

2014-12-11 Thread Andreas Kloeckner
Paul Mullowney paulmullow...@gmail.com writes: I've been using PyCuda quite a bit recently. Very nice! I'm trying to use memcpy_atod to message a chunk of CPU data to the GPU where the size/shapes of the input and output arrays don't match (though the size of the data transfer certainly does).

Re: [PyCUDA] Frequent crashes after using pycuda

2014-12-10 Thread Andreas Kloeckner
Hi Craig, Craig Stringham string...@mers.byu.edu writes: I keep crashing a server (kernel panic) when using pycuda within ipython. It doesn't seem to matter what kernel I run and it only crashes several minutes after I have run a kernel but have kept the ipython shell open. I am using the

Re: [PyCUDA] Install issue

2014-11-27 Thread Andreas Kloeckner
Mike McFarlane mike.mcfarl...@iproov.com writes: Hi I've installed pycuda following http://wiki.tiker.net/PyCuda/Installation/Linux When I try to run test/test_driver.py it fails many tests, mainly with 'TypeError: 'numpy.ndarray' does not have the buffer interface'. The output is below

Re: [PyCUDA] Problem with PyCUDA on Ubuntu 14.10

2014-11-01 Thread Andreas Kloeckner
Eric Larson larson.eri...@gmail.com writes: PyCUDA worked perfectly on Ubuntu 14.04, but after upgrade to 14.10 I get the following in both Python 2 and Python 3: import pycuda.autoinit Traceback (most recent call last): File "<stdin>", line 1, in <module> File

Re: [PyCUDA] Question How to Safely Use pycuda with mpi4py

2014-11-01 Thread Andreas Kloeckner
kjs b...@riseup.net writes: Hello, I have written an MPI routine in Python that sends jobs to N worker processes. The root process handles file IO and the workers do computation. In the worker processes calls are made to the cuda enabled GPU to do FFTs. Is it safe to have N processes

Re: [PyCUDA] Advice on Switching between C, IDL and PyCUDA

2014-10-29 Thread Andreas Kloeckner
Lewis, Mcgibbney, Lewis J (398M) lewis.j.mcgibb...@jpl.nasa.gov writes: I DO NOT need to use PyCUDA in stages 1 or 4 e.g. Pre and post processing. What I am looking for is advice on what is ‘common’ practice for NOT reimplementing an entire project (13,000 C and IDL code) in PyCUDA but

Re: [PyCUDA] ReductionKernel.__call__ needs an allocator kwarg

2014-10-16 Thread Andreas Kloeckner
Simon Perkins simon.perk...@gmail.com writes: I've modified the patch to take the existing behaviour into account. Applied to git. Thanks for your contribution! Andreas

Re: [PyCUDA] petsc4py+PyCUDA through Cython?

2014-10-16 Thread Andreas Kloeckner
Ashwin Srinath ashwinsr...@gmail.com writes: I'm not sure - but this may have something to do with the implementation of `fill`. Because on the flip side, changes to the PETSc Vec *are* reflected in the GPUArray. So I can see that they are actually sharing device memory. As far as I know, PETSc

Re: [PyCUDA] ReductionKernel.__call__ needs an allocator kwarg

2014-10-15 Thread Andreas Kloeckner
Hi Simon, Simon Perkins simon.perk...@gmail.com writes: Here's the patch! The patch looks good. One minor complaint is that in absence of an allocator kwarg, your patch changes existing behavior. Specifically, the allocator that was previously used was the one of the array passed to the

Re: [PyCUDA] petsc4py+PyCUDA through Cython?

2014-10-14 Thread Andreas Kloeckner
Ashwin Srinath ashwinsr...@gmail.com writes: Hello, PyCUDA users! I'm trying to construct a GPUArray from device memory allocated using petsc4py. I've written some C code that extracts a raw pointer from a PETSc cusp vector. Now, I am hoping to 'place' this memory into a gpuarray, using

Re: [PyCUDA] What do I have to do in order to correctly install PyCUDA?

2014-10-08 Thread Andreas Kloeckner
Dear Marco, the easiest thing to do is to have nvcc in your $PATH--that should then enable PyCUDA to automatically find the rest of CUDA. Andreas Marco Ippolito ippolito.ma...@gmail.com writes: Hi all, in my Ubuntu 14.04 I'm trying to install PyCUDA, but I have this error's message:
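A minimal sketch of how one might check the advice above before installing, assuming a Python 3 interpreter; the helper name find_nvcc is made up for illustration and is not part of PyCUDA:

```python
import shutil


def find_nvcc():
    """Return the full path of the nvcc executable if it is on PATH, else None.

    find_nvcc is a hypothetical helper for this sketch, not a PyCUDA function.
    """
    return shutil.which("nvcc")


nvcc_path = find_nvcc()
if nvcc_path is None:
    # nvcc is not reachable; the CUDA bin directory likely needs adding to PATH.
    print("nvcc not found on PATH")
else:
    print("nvcc found at", nvcc_path)
```

If this reports that nvcc is missing, adding the CUDA toolkit's bin directory to $PATH (wherever it was installed on the machine in question) is the step the reply refers to.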

Re: [PyCUDA] What do I have to do in order to correctly install PyCUDA?

2014-10-08 Thread Andreas Kloeckner
Marco Ippolito ippolito.ma...@gmail.com writes: Hi Andreas, thanks for helping. Following the indications here: https://help.ubuntu.com/community/EnvironmentVariables#Persistent_environment_variables in a brand new file nvcc.sh in /etc/profile.d/ I put: export

Re: [PyCUDA] Shutting down PyCUDA

2014-10-08 Thread Andreas Kloeckner
Thomas Unterthiner thomas_unterthi...@web.de writes: Hi again! How do you completely shut down PyCUDA? After running the following lines: from pycuda import driver as pycuda_drv pycuda_drv.init() device = pycuda_drv.Device(0) ctx = device.make_context() I can see a new

Re: [PyCUDA] Reductions

2014-10-04 Thread Andreas Kloeckner
Freddie Witherden fred...@witherden.org writes: Hi Andreas, On 29/09/14 16:17, Andreas Kloeckner wrote: GPUArrays don't actually care who owns the data, so if you're OK with building a GPUArray as a 'descriptor' structure (which is quick and lightweight) without moving any data around

Re: [PyCUDA] Shutting down/Re-initializing PyCUDA

2014-09-24 Thread Andreas Kloeckner
Thomas Unterthiner thomas_unterthi...@web.de writes: Hi! I have a program that makes extensive use of pycuda, but also calls out to a C library which also uses CUDA internally (it does not share any state or memory with the pycuda code, and uses the CUDA runtime API). However, after the

Re: [PyCUDA] 2nd try: 2014.1 git build, without siteconf.py

2014-08-30 Thread Andreas Kloeckner
Bruce Labitt bdlab...@gmail.com writes: I am trying to install PyCuda from git. I have Ubuntu 14.04, and CUDA6.5 installed. (Driver is 340.19 from Nvidia, CUDA from Nvidia) CUDA examples seem to work, at least the ones supported by my hardware. (3.0) Partially stuck on setting up

Re: [PyCUDA] Plans for 2014.1?

2014-08-17 Thread Andreas Kloeckner
Hi Tomasz, Tomasz Rybak tomasz.ry...@post.pl writes: On 2014-08-12 (Tue) at 20:56 +0200, Tomasz Rybak wrote: On 2014-08-11 (Mon) at 15:47 -0500, Andreas Kloeckner wrote: [ cut ] I've just fixed (I think) the last known Py3 bug in PyCUDA, so I think I'll go ahead

Re: [PyCUDA] PyCUDA warning: clean-up operation errors on Kepler

2014-07-30 Thread Andreas Kloeckner
Hi James, James Keaveney james.keave...@durham.ac.uk writes: I'm having an issue with PyCUDA that at first glance seems like it might be similar to those of Thomas Unterthiner (messages from Jun 20 2014, Weird bug when slicing arrays on Kepler cards). I'm also using a Kepler card (GTX

Re: [PyCUDA] How to run the sample code GlInterop.py? Thanks!

2014-07-15 Thread Andreas Kloeckner
LFC liufubu...@gmail.com writes: Dear All, Sorry to interrupt you by this way. When I tried to test the sample code GlInterop.py, I met a problem as below: ---

Re: [PyCUDA] How to run the sample code GlInterop.py? Thanks!

2014-07-15 Thread Andreas Kloeckner
Dear LFC, LFC liufubu...@gmail.com writes: I did the command rm -Rf build and setup.py build and setup.py install. But I still have the same problem. I don't know why. First, please make sure to keep the list cc'd, for archival. Next, please post a complete build log to some pastebin and

Re: [PyCUDA] Problem with pow

2014-07-14 Thread Andreas Kloeckner
elodw a...@pdauf.de writes: On 11.07.2014 22:14, Andreas Kloeckner wrote: elodw a...@pdauf.de writes: The issue is that you're passing integers. Cast to floating point before you call the function. And another question: Is there a sqrt-Function? http://documen.tician.de/pycuda/array.html

Re: [PyCUDA] Dynamic parallelism (sm_35) with PyCUDA

2014-07-14 Thread Andreas Kloeckner
Hi all, Ahmed Fasih wuzzyv...@gmail.com writes: Hi folks, I write in the hope that someone has gotten a K20 Kepler 3.5 compute capability device and has gotten it to do dynamic parallelism, wherein a kernel can kick off grids on its own without returning to the CPU. A hello world example is

Re: [PyCUDA] Trouble Getting Set Up

2014-07-14 Thread Andreas Kloeckner
On 14.07.2014 at 16:47, Forrest Pruitt wrote: The frustrating thing is that in a stand-alone python shell, pycuda behaves appropriately. It is only in a Celery process that things break down. Any help here would be appreciated! If I need to provide any more information, just let me know!

Re: [PyCUDA] Problem with pow

2014-07-11 Thread Andreas Kloeckner
elodw a...@pdauf.de writes: with import pycuda.gpuarray as gpuarray import pycuda.driver as drv import pycuda.autoinit import numpy import sys from pycuda.tools import mark_cuda_test from pycuda.characterize import has_double_support from pycuda.compiler import SourceModule .

Re: [PyCUDA] a beginner wants help

2014-07-08 Thread Andreas Kloeckner
Dear Ernst, elodw a...@pdauf.de writes: i want to start with pycuda and had a very beginner question: ok, the header lines are clear, same to_gpu statement and the get statement, thats it. What is the problem defined in Python: a=array[1000,100] for i in range(1000-1): for j in

Re: [PyCUDA] Development environment

2014-06-29 Thread Andreas Kloeckner
Daniel Pagan daniel.pa...@andphysicsforall.com writes: Thanks for this, Andreas. I will try to profile some simple PyCuda code with the VS profiler as in the above posts, then will experiment to see how/if I can get the debugger to work. Hopefully I will have some results to report. What

Re: [PyCUDA] Development environment

2014-06-20 Thread Andreas Kloeckner
Hi Daniel, Daniel daniel.pa...@andphysicsforall.com writes: I just installed PyCuda and got it working on my Windows 8.1 laptop. I'm able to run the examples in VS 2010. My question concerns the 'preferred' development environment for PyCuda: While it runs on Windows, I'm not able to

Re: [PyCUDA] build/install error - internal compiler error?

2014-06-19 Thread Andreas Kloeckner
JeHoon Song song.je-h...@kaist.ac.kr writes: Hello, I just started to develop a PyCUDA application. Its build process is not successful, as follows: ... bpl-subset/bpl_subset/boost/type_traits/detail/cv_traits_impl.hpp:37: internal compiler error: in make_rtl_for_nonlocal_decl, at

Re: [PyCUDA] Passing a custom struct to a kernel by value

2014-05-26 Thread Andreas Kloeckner
Bogdan Opanchuk manti...@gmail.com writes: Hello, Does PyCUDA support struct arguments to kernels? From the Python side it means an element of an array with a struct dtype (a numpy.void object), e.g. dtype = numpy.dtype([('first', numpy.int32), ('second', numpy.int32)]) pair =

Re: [PyCUDA] Passing a custom struct to a kernel by value

2014-05-26 Thread Andreas Kloeckner
Hi Bogdan, Bogdan Opanchuk manti...@gmail.com writes: Thank you for the correction. Just curious, how come in PyOpenCL it works with rank-0 numpy arrays (which, in my opinion, is more intuitive than implicitly casting a rank-1 array to a scalar)? Is it just a difference between PyCUDA and

Re: [PyCUDA] pycuda bug

2014-05-26 Thread Andreas Kloeckner
Dear Danny, Daniel Jeck dje...@jhmi.edu writes: My name is Danny Jeck. I don't really want to subscribe to the mailing list for pycuda, but I thought I should point out to you that the following code creates an error import pycuda.gpuarray as gpuarray import pycuda.driver as cuda

Re: [PyCUDA] PyCUDA pagelocked memory, async and zero-copy

2014-05-25 Thread Andreas Kloeckner
Alexander Bock alexander.asp.b...@gmail.com writes: I am creating some timing tests with PyCUDA for batch-loading an image sequence. I first tried timing a normal, synchronous transfer over global memory. Now I am looking to test pagelocked memory, specifically, I would like to test:

Re: [PyCUDA] Device memory pool

2014-04-19 Thread Andreas Kloeckner
Hi Graham, Graham Mills 13g...@queensu.ca writes: I am attempting to use a memory pool for some gpu array calculations, using PyCUDA 2013.1 with python 3.x and CUDA 5.5. The trouble is I can't find an appropriate integer type with which to call .allocate on a DeviceMemoryPool object. All

Re: [PyCUDA] Can I use cuPrintf in PyCuda?

2014-03-16 Thread Andreas Kloeckner
金陆 ret...@eyou.com writes: I am using an old PC with an old GPU card(GeForce 9800 GT). As you may know, 9800 does not support printf function in device code. However, Nvidia supplies cuPrintf. In CU file, it can be used like following. The situation is how can I use cudaPrintfInit,

Re: [PyCUDA] Assertion Error

2014-02-10 Thread Andreas Kloeckner
Jayanth Channagiri cv.jaya...@hotmail.com writes: Dear all I am having problems with slicing a 3D array into 2D array and then sending it to GPU. For example, array1 = ones((128,128,128)) array1_gpu = gpuarray.to_gpu(array1) #no problem in sending it to GPU But if I convert it to a 2D

Re: [PyCUDA] [PyOpenCL] pyopencl vector op scalar side effect return self instead of new array

2014-02-09 Thread Andreas Kloeckner
Hi István, István Lorentz isti_...@yahoo.com writes: [snip] Note, when working with pure numpy arrays, the results is always in a new copy.  I'm using the regular __div__ operator, not the __idiv__ which I understand should be in-place modifier. I noticed similar optimization for the

Re: [PyCUDA] Buffer Interface for Device Allocations

2014-01-31 Thread Andreas Kloeckner
Freddie Witherden fred...@witherden.org writes: Thank you for this. However, toying around with the following example: [snip] with OpenMPI 1.7.3 (running as mpirun -n 2 python file.py) I find that the version using cubuf.as_buffer fails with a segmentation fault due to invalid

Re: [PyCUDA] Buffer Interface for Device Allocations

2014-01-30 Thread Andreas Kloeckner
Freddie Witherden fred...@witherden.org writes: This came up a while ago on the mpi4py mailing list [1] and with CUDA 6 bringing unified virtual memory it may become more important in the future. It would be nice if PyCUDA device allocations provided a method for creating a suitable Python

Re: [PyCUDA] string error handling in compiler.py

2013-12-22 Thread Andreas Kloeckner
Hi Alex, Alex Nitz alex.n...@ligo.org writes: I've noticed that a change made several months ago to the string error handling isn't compatible with versions of python earlier than 2.7. The following fails on python versions < 2.7. s = "test string" s.decode("UTF8", error='replace') as
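A minimal sketch of the kind of decode call discussed in this thread, assuming Python 3 (where decode lives on bytes rather than str) and a made-up byte string. It passes the error handler positionally, which avoids relying on keyword-argument support; that is presumably the incompatibility the report refers to, and this is an illustration rather than the change made in compiler.py:

```python
# A byte string containing 0xE9, which is not valid UTF-8 in this position.
raw = b"caf\xe9 test string"

# Pass the error handler as the second positional argument instead of a
# keyword; undecodable bytes become U+FFFD (the replacement character).
text = raw.decode("utf-8", "replace")

print(repr(text))
```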

Re: [PyCUDA] texture address modes WRAP and MIRROR not working

2013-12-14 Thread Andreas Kloeckner
Dear Ben, Rowland Ben rowland@claudiusregaud.fr writes: I am using PyCUDA to render slices out of a 3D texture which are then passed to an OpenGL PBO for display on the screen. Everything is going well except that I cannot get my texture to use the address modes WRAP or MIRROR correctly,

Re: [PyCUDA] Passing float3 (or other vector types) to prepared_call CUDA kernels

2013-11-29 Thread Andreas Kloeckner
Rowland Ben rowland@claudiusregaud.fr writes: Thanks for such a quick fix. Can you tell me what the best way is to upgrade my install and test this out? Just follow the regular from-source install instructions here: http://wiki.tiker.net/PyCuda/Installation HTH, Andreas

Re: [PyCUDA] Passing float3 (or other vector types) to prepared_call CUDA kernels

2013-11-29 Thread Andreas Kloeckner
Tomasz Rybak tomasz.ry...@post.pl writes: On 2013-11-29 (Fri) at 10:40 -0600, Andreas Kloeckner wrote: Rowland Ben rowland@claudiusregaud.fr writes: Thanks for such a quick fix. Can you tell me what the best way is to upgrade my install and test this out? Just follow

Re: [PyCUDA] Passing float3 (or other vector types) to prepared_call CUDA kernels

2013-11-28 Thread Andreas Kloeckner
Rowland Ben rowland@claudiusregaud.fr writes: That is a seriously rapid response! I think this is the minimal code example that reproduces the problem, probably I am making a mistake somewhere in the syntax: Fixed in git (I think). Thanks for the report, and let me know if you spot

Re: [PyCUDA] Customize compiler cache directory

2013-11-27 Thread Andreas Kloeckner
Hi Andreas, Healther.astro healther.as...@gmail.com writes: I have some kernels which will be used over and over again, but as the compile time of them is pretty high (something like half an hour, for some hundred kernels). I would like to ensure that they are stored permanently on my

Re: [PyCUDA] Passing float3 (or other vector types) to prepared_call CUDA kernels

2013-11-27 Thread Andreas Kloeckner
Hi Ben, Rowland Ben rowland@claudiusregaud.fr writes: Just started working with PyCUDA, and already very taken with it, it makes a whole load of things very simple. Already after a couple of days I have a working program with OpenGL interop using PySide to provide the GUI and CUDA

Re: [PyCUDA] ImportError: libcurand.so.5.0: cannot open shared object file

2013-11-25 Thread Andreas Kloeckner
ggeo gg...@windowslive.com writes: Hello, I did all these and when I try to run test_driver.py it gives me: ExecError: error invoking 'nvcc --version': [Errno 2] No such file or directory /usr/lib64/python2.7/site-packages/pytools/prefork.py:53: ExecError What should I do ? Are you sure

Re: [PyCUDA] Thrust interoperability- cgen/ codepy errors on python3.2

2013-11-25 Thread Andreas Kloeckner
Hi Graham, Graham Mills 13g...@queensu.ca writes: I looked back about a year in the archives and couldn't find anything on this. I just downloaded and built cgen 2013.1.2 and codepy 2013.1.2 today. When using thrust as in the example at http://wiki.tiker.net/PyCuda/Examples/ThrustInterop ,

Re: [PyCUDA] sending a device pointer through cython to cuda C

2013-11-05 Thread Andreas Kloeckner
Rok Roskar ros...@physik.uzh.ch writes: I've got a host-side CUDA library wrapped in Cython and I'd like to use it on a device-side array that I've allocated in python with PyCuda. However, I'm completely at a loss as to how I should pass the device pointers from the python side of things to

Re: [PyCUDA] sending a device pointer through cython to cuda C

2013-11-05 Thread Andreas Kloeckner
Rok Roškar ros...@physik.uzh.ch writes: wow that makes it pretty straightforward, thanks! I'm afraid I'm probably missing something obvious, but is there a similar trick for streams? Nope, sorry. Patches welcome, although having this functionality is somewhat risky: A plain integer

Re: [PyCUDA] peer2peer pycuda

2013-10-28 Thread Andreas Kloeckner
Dear Nicolas, Nicolas LEMERCIER lemer...@igbmc.fr writes: I am currently trying to use peer2peer GPU memory access with pycuda and I am facing problems regarding the syntax. My code follows this template: dev1=cuda.Device(1) ctx1=dev1.make_context() dev0=cuda.Device(0)

Re: [PyCUDA] pycuda on optimus enabled cards

2013-10-21 Thread Andreas Kloeckner
Hi Dorin, Dorin Niculescu niculescu_dori...@yahoo.com writes: I have a new ASUS laptop with optimus enabled NVIDIA 750M card and i want to install pycuda on it. I've installed Ubuntu 12.04, Bumblebee+nvidia-319, cuda 5.5 and everything was working great until i've installed pycuda using

Re: [PyCUDA] Bicubic interpolation

2013-10-02 Thread Andreas Kloeckner
Hi Eric, You'll need to declare the function 'extern "C"', then it should work. Andreas Eric Scheffel eric.schef...@nottingham.edu.cn writes: Thanks again for making available the cuda wrapper library for python. It's great to use and helps me a lot in my own research. One problem I am facing at

Re: [PyCUDA] assert texref.get_flags() == 0

2013-09-30 Thread Andreas Kloeckner
Hi Eric, Eric Scheffel eric.schef...@nottingham.edu.cn writes: I noticed some strange behaviour with the most recent version of PyCuda (I think I pulled this from the git repository but am not sure anymore). I am running a loop in which textures continuously have to be rebound using

Re: [PyCUDA] wrapping existing device pointer for use with pycuda

2013-09-27 Thread Andreas Kloeckner
Sam Preston j...@sci.utah.edu writes: Hi all, I would like to use pycuda to write a few kernels to interoperate with a larger cuda/python library I'm already using. I can get the raw device memory address as a python int, and from searching past threads on the mailing list it sounds like I

Re: [PyCUDA] OS X: global name 'GL_PIXEL_PACK_BUFFER_ARB' is not defined

2013-09-13 Thread Andreas Kloeckner
Nathaniel Virgo nathanielvi...@gmail.com writes: $ python GlInterop.py Hit ESC key to quit, 'a' to toggle animation, and 'e' to toggle cuda Traceback (most recent call last): File GlInterop.py, line 136, in display process_image() File GlInterop.py, line 188, in process_image

Re: [PyCUDA] Unable to install PyCUDA on Python 2.7/3.3 on Ubuntu 13.04 with NVIDIA GTX 780

2013-08-16 Thread Andreas Kloeckner
Vivek Saxena spino...@gmail.com writes: This problem was solved by the following commands: sudo ln -s /usr/lib/nvidia-325/libcuda.so /usr/lib/libcuda.so sudo ln -s /usr/lib/nvidia-325/libcuda.so.1 /usr/lib/libcuda.so.1 But I now get a message saying TypeError: No registered converter was

Re: [PyCUDA] pycuda.autoinit hangs?

2013-08-07 Thread Andreas Kloeckner
Hi Trevor, Trevor Cickovski movingpicture...@gmail.com writes: I am running pycuda using Python 2.6.5, Cuda 5.0 on Ubuntu 10. My graphics card is an NVIDIA GTX680. Whenever I do 'import pycuda.autoinit', everything hangs (requires manual kill). I have traced it to this statement in

Re: [PyCUDA] [PyOpenCL] Non-contiguous arrays on GPU

2013-07-26 Thread Andreas Kloeckner
Hi Bogdan, all, Bogdan Opanchuk manti...@gmail.com writes: Both in PyCUDA and PyOpenCL constructors of GPU arrays have ``strides`` keyword parameter, and you can create a non-contiguous array, e.g.: import pyopencl as cl from pyopencl.array import Array import numpy ctx =

Re: [PyCUDA] pycuda.gpuarray.take with complex dtypes

2013-07-24 Thread Andreas Kloeckner
Hi Alex, Alex Nitz alex.n...@ligo.org writes: I've noticed that the 'take' function doesn't seem to work for arrays with complex dtypes (complex64, complex128). I've added two patches that allow this to work. It requires adding texture support for these types. As was done for double

Re: [PyCUDA] Recommended way to prepare ElementwiseKernel?

2013-07-19 Thread Andreas Kloeckner
Hi Michael, Michael McNeil Forbes michael.forbes+pyt...@gmail.com writes: On Jul 18, 2013, at 10:46 PM, Michael McNeil Forbes michael.forbes+pyt...@gmail.com wrote: What is the recommended way of preparing ElementwiseKernel instances for repeated calling on the same GPU arrays for

Re: [PyCUDA] Recommended way to prepare ElementwiseKernel?

2013-07-19 Thread Andreas Kloeckner
Michael McNeil Forbes michael.forbes+pyt...@gmail.com writes: Here is the profile of the slow __call__. All the time is spent in generate_stride_kernel_and_types: Line # Hits Time Per Hit % Time Line Contents ==

Re: [PyCUDA] accessing GPUArray contents with Cython memoryviews

2013-07-18 Thread Andreas Kloeckner
Lev Givon l...@columbia.edu writes: I'm trying to access the memory associated with a GPUArray from within a compiled extension built using Cython's memoryview feature. According to the Cython documentation, it is possible to access C arrays using this feature; however, when I attempt to do so

Re: [PyCUDA] Interface with NumbaPro? (DeviceAllocation wrapper for DeviceNDArray)

2013-07-17 Thread Andreas Kloeckner
Michael McNeil Forbes michael.forbes+pyt...@gmail.com writes: Thats the idea, but the problem I am having is getting the pointer into the correct DeviceAllocation type. What is the type of x.gpudata in the theano example you show? The c++ function claims to expect a ctypes.c_ulonglong, but

[PyCUDA] 2013.1

2013-07-04 Thread Andreas Kloeckner
Hi all, PyOpenCL and PyCUDA 2013.1 just rolled off the assembly line. Release notes here: https://pypi.python.org/pypi/pyopencl http://documen.tician.de/pyopencl/misc.html#version-2013-1 https://pypi.python.org/pypi/pycuda http://documen.tician.de/pycuda/misc.html#version-2013-1 If you notice

[PyCUDA] Slicing vs performance

2013-07-03 Thread Andreas Kloeckner
Hi all, I'm writing to let you know that the initial slicing support in PyCUDA and PyOpenCL has had a slightly unintended performance consequence due to this numpy bug: https://github.com/numpy/numpy/issues/3375 I've written about this in the release notes here:

Re: [PyCUDA] CUDA 5.5RC: Segfault on exit

2013-06-24 Thread Andreas Kloeckner
Søren Rasmussen rissed...@gmail.com writes: Sorry for the late reply - stuff came up. Faulthandler gave me nothing. The driver version is: NVIDIA UNIX x86_64 Kernel Module 319.21 Sat May 11 23:51:00 PDT 2013 Backtrace: #0 0x77def181 in ?? () from /lib64/ld-linux-x86-64.so.2 #1

Re: [PyCUDA] Bug in register_host_memory

2013-06-21 Thread Andreas Kloeckner
Hi Tomasz, Tomasz Rybak tomasz.ry...@post.pl writes: I've been packaging PyCUDA for Debian. I run all the tests to ensure that package works on Python 2 and Python 3. All tests pass except for one from test_driver.py: $ python test_driver.py _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Re: [PyCUDA] CURAND wrapper

2013-06-21 Thread Andreas Kloeckner
Tomasz Rybak tomasz.ry...@post.pl writes: Hello. For some time I've been working on adding new features from CURAND into PyCUDA ( g...@github.com:rybaktomasz/pycuda.git branch curand-41) I have added MRG32k3a, and poisson generation to all existing classes. Branch contains documentation and

Re: [PyCUDA] CUDA 5.5RC: Segfault on exit

2013-06-21 Thread Andreas Kloeckner
Søren Rasmussen rissed...@gmail.com writes: Hi, When using PyCUDA with the cuda 5.5 release candidate, I get a segmentation fault when Python exits. I guess it's a problem somewhere in the cleanup process.. The segfault can be reproduced by running: $ python -c import

Re: [PyCUDA] Timing using GTX TITAN

2013-06-12 Thread Andreas Kloeckner
Pierre Castellani pcastel...@gmail.com writes: I have bought a Kepler GPU in order to do some numerical calculation on it. I would like to use pyCuda (looks to me the best solution). Unfortunately when I am running a test like MeasureGpuarraySpeedRandom

Re: [PyCUDA] Compiler switches for Cuda 5.0 and PyCuda

2013-05-30 Thread Andreas Kloeckner
Bob Zigon bob.zi...@gmail.com writes: Hello I am using Ubuntu 11.10 with PyCuda 2012.1 and Cuda 5.0 on a K20c. I have been developing with Cuda for 6 years and Python for 4 weeks. I use the pycuda.compiler class to compile my Cuda code in my Python code. Is there a way I can see all of the

Re: [PyCUDA] How frequently is nvcc called in a Python / PyCuda app?

2013-05-30 Thread Andreas Kloeckner
Dear Bob, bob zigon bob.zi...@gmail.com writes: If a kernel is called from within a python loop, how frequently is nvcc called? If the kernel is essentially static, I would hope that nvcc is called once regardless of the number of times the loop iterates. On the other hand, if the

Re: [PyCUDA] PyCuda Ubuntu 12.04 issues

2013-05-13 Thread Andreas Kloeckner
albeam alb...@ncsu.edu writes: I'm having issues getting pycuda to run properly. I had issues installing, which I suspect is the root of my issue now. Following these instructions: http://wiki.tiker.net/PyCuda/Installation/Linux/Ubuntu I had to remove the last configuration flag

Re: [PyCUDA] gpuarray, return index of max

2013-04-28 Thread Andreas Kloeckner
Dear Mr. Maze, Mr. Maze v...@madmaze.net writes: I am currently working on an application where I need to retrieve the index of the max value. Is there a way to get the index along with the max of a gpuarray? At the moment I am returning the array back to the host just to locate where the

Re: [PyCUDA] Fwd: Re: CHPC GPU

2013-04-24 Thread Andreas Kloeckner
Dear Malcolm, Malcolm Tobias mtob...@wustl.edu writes: Sorry to bug you, but I run a cluster at Washington Univ. in St. Louis and we recently added some GPU nodes. We have several python users, and one has requested that I install PyCUDA on our system. For the first attempt, I tried

Re: [PyCUDA] Python 3 situation

2013-04-23 Thread Andreas Kloeckner
Tomasz Rybak tomasz.ry...@post.pl writes: Hello. I've pulled latest git version and built PyCUDA on Debian unstable. I've tested on two machines - one with Fermi (GTX 460) and one with ION (9400M). In both cases all tests pass for Python 2.7.3 and only 2 tests fail for Python 3.2. Failing

Re: [PyCUDA] shared memory as next step in performance with ElementwiseKernel

2013-04-21 Thread Andreas Kloeckner
Geoffrey Anderson mrco...@yahoo.com writes: So I've got this program using Elementwise and I want to up the performance one more level. Nobody to my knowledge has written about using shared memory, but that does not mean it can't be done in an Elementwise program. How can shared memory be

Re: [PyCUDA] Pycuda lines of code

2013-04-18 Thread Andreas Kloeckner
Ahmed Fasih wuzzyv...@gmail.com writes: Sorry for troubling everyone with this petty question, but I recall reading maybe a couple of years ago how PyCUDA itself consisted of about 1200 lines of C++ code and about half as much Python code. I went looking for this but couldn't find this
