Re: [PyCUDA] nvcc is not searched in cuda root dir.
On Sunday 09 May 2010, Ilya Gluhovsky wrote: On Sunday 04 October 2009, Michal wrote: Hi, during PyCUDA configuration it is possible to specify the CUDA root dir. I think that PyCUDA should add cuda_root_dir/bin to its PATH, so I wouldn't get errors like: OSError: nvcc was not found (is it on the PATH?) [[Errno 2] No such file or directory] So what is the work-around? Thanks so much! Ilya. Having nvcc on the PATH should fix things. Andreas PS: Please keep the list cc'd. Thanks. signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://lists.tiker.net/listinfo/pycuda
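As a stop-gap until cuda_root_dir/bin is added automatically, one can extend PATH from Python before PyCUDA first invokes nvcc. This is a minimal sketch; the /usr/local/cuda location is an assumption and should be replaced with your actual CUDA root:

```python
import os

# Hypothetical CUDA install location -- adjust to your cuda_root_dir.
cuda_root = "/usr/local/cuda"

# Prepend cuda_root/bin so that the nvcc invocation can be resolved.
os.environ["PATH"] = (
    os.path.join(cuda_root, "bin") + os.pathsep + os.environ.get("PATH", "")
)

# Subsequent imports that shell out to nvcc (e.g. pycuda.compiler)
# would now find it on the PATH.
```

This has to run before the first kernel compilation, since that is when nvcc is looked up.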
[PyCUDA] Windows binaries
Hi team, I just discovered that Christoph Gohlke at UC Irvine distributes Windows binaries for PyCUDA, here: http://www.lfd.uci.edu/~gohlke/pythonlibs/ This looks like a good page to keep bookmarked if you're on Windows, though obviously I don't know how well the packages on that page actually work. :) I've also added this link to the Wiki. Andreas
Re: [PyCUDA] Mailing list move
On Thursday 06 May 2010, Andreas Klöckner wrote: Hi all, just a quick heads-up that I will be moving the PyCUDA list to a different server today. There might be a short period where the list is unavailable, but I'll try to keep this minimal. All should be back to normal by tonight at the latest. If you notice breakage after that, please let me know. The move is done, and everything should be back in working order. DNS changes might still not have propagated everywhere, but should do so soon. Let me know if you notice any issues. Thanks for your patience, Andreas
Re: [PyCUDA] Why is cumath slower than an ElementwiseKernel? Is data copied back after each operation?
On Thursday 06 May 2010, Ian Ozsvald wrote: I've been speed-testing some code to understand the complexity/speed trade-off of various approaches. I want to offer my colleagues the easiest way to use a GPU to get a decent speed-up without forcing anyone to write C-like code if possible. A few comments: - If you use 100 floats, you only measure launch overhead. The GPU processing time will be entirely negligible. You need a couple million entries to generate even a small bit of load. - You might want to warm up each execution path before you take your timings, to account for code being compiled or fetched from disk. HTH, Andreas
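Both points can be folded into a small benchmarking harness. This is a generic sketch, not PyCUDA API: the workload function is a stand-in for a kernel launch followed by a synchronization.

```python
import timeit

# Stand-in for the operation under test (in a PyCUDA benchmark this would
# be a kernel launch plus a synchronization so timings are meaningful).
def workload(data):
    return [x * x for x in data]

# Millions of entries, not 100 floats -- otherwise only launch/dispatch
# overhead is measured.
data = list(range(2_000_000))

# Warm-up run: triggers any one-time compilation, caching, or disk fetches
# so they don't pollute the timed runs.
workload(data)

n_runs = 3
elapsed = timeit.timeit(lambda: workload(data), number=n_runs) / n_runs
print(f"{elapsed:.4f} s per run")
```

The warm-up call is discarded; only the subsequent runs are averaged.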
Re: [PyCUDA] Are sin/log supported for complex number (0.94rc)? Odd results...
On Monday 19 April 2010, Ian Ozsvald wrote: I find myself out of my depth again. I'm playing with complex numbers using 0.94rc (on Windows XP with CUDA 2.3). I've successfully used simple operations (addition, multiplication) on complex numbers, which resulted in the Mandelbrot example in the wiki. Now I'm trying sin and log and I'm getting errors and one positive result (see the end). I'm not sure if these functions are supported yet? If anyone can point me in the right direction (and perhaps suggest what needs implementing) then I'll have a go at fixing the problem. For reference, you can do the following using numpy on the CPU for verification: In [117]: numpy.sin(numpy.array(numpy.complex64(1-1j))) Out[117]: (1.2984576-0.63496387j) Here are two pieces of test code. At first I used multiplication (rather than sin()) to confirm that the code ran as expected. Next I tried creating a simple complex64 array, passing it to the GPU and then asking for pycuda.cumath.sin() - this results in "Error: External calls are not supported". Here's the code and error: Fixed/added in git, both for cumath and elwise. I replaced: pycuda.cumath.sin(a_gpu) # should produce (1.29-0.63j) with: pycuda.cumath.log(a_gpu) # should produce (0.34-0.78j) and the code ran without an error...but produced the wrong result. It generates (1-1j), which looks like a no-op. cumath ops aren't in-place--you need to print the returned value. HTH, Andreas
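The apparent no-op follows from cumath ops returning a new array rather than modifying their input, so printing the input always shows (1-1j). The expected reference values can be checked with the standard library alone (a sketch mirroring the numpy verification above):

```python
import cmath

z = 1 - 1j
# CPU reference values for the GPU results being tested:
print(cmath.sin(z))  # ~ (1.2984576-0.6349639j), matching the In [117] check
print(cmath.log(z))  # ~ (0.3465736-0.7853982j), i.e. the expected (0.34-0.78j)
```

In PyCUDA terms this means capturing the return value, e.g. `b_gpu = pycuda.cumath.sin(a_gpu)`, and printing b_gpu rather than a_gpu.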
Re: [PyCUDA] pycuda and cuda 64 bits on mac SL
On Monday 12 April 2010, Alan wrote: Hi there, finally nVidia released CUDA 3.0 (partially) in 64 bits for Mac (not a beta version!) Hi Alan, all, has there been progress on this? Has anyone gotten 64-bit PyCUDA to work on Snow Leopard? If not, any idea what might be wrong? Andreas
Re: [PyCUDA] Debian packaging of PyCUDA
On Tuesday 04 May 2010, Tomasz Rybak wrote: Hello, I have begun creating a Debian PyCUDA (0.94 from git) package. It is my first Debian package and I was cheating by looking at other Python modules, but it seems to work, at least on my machine. In a few days I should be able to check on another machine with Debian and CUDA, and will report if there are problems. Now I am wondering whether there is a need to integrate the Debian packaging metadata into PyCUDA (it is a directory with a few text files in it) and if so what is the best way to do it. This is a great idea, ideally with the outcome that you could just apt-get install python-pycuda and get a working setup. Any DDs around? Anyone have spare cycles to make the packages? There's also an ITP for PyCUDA: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=546919 nvcc is apparently not packaged yet, either. Also the nvidia-glx packages tend to lag by a version or so. (Currently they have 190.53.) If we decide to go the home-rolled route: I remember that a while back there seemed to be a debate among the Debian folks on whether a debian/ directory in upstream source is good or bad. [1] Does anyone know what the resolution of that is by now? This would determine whether I'd stick debian/ files into git. [1] http://kitenet.net/~joey/blog/entry/on_debian_directories_in_upstream_tarballs/ Andreas
Re: [PyCUDA] creating a new view of a GPUArray
On Thursday 29 April 2010, Amir wrote: I would like to create different views of a GPUArray. I have no idea how to do it. 1D slicing logic already exists. See GPUArray.__getitem__ on how it's done. Let A be a 2D float32 C-contiguous array. How do I create a 1D view of one of its rows? I am not sure how to set gpudata in the view. An example would be great. Is there a pointer arithmetic trick? This would have to happen along the lines of how the view is built above, but is hampered by the fact that our kernels don't yet have the necessary stride support. (I.e. you could already implement contiguous slices without touching the kernels, but non-contiguous ones need more work.) Is it possible to create a float32 view of a complex64 array? Yup, that should be easy. Just follow GPUArray.__getitem__. Is there a conventional numpy spelling for this? I am trying to use gpuarray.multi_take_put to copy slices between float and complex arrays but it requires dtypes to match. multi_take_put, in addition to being an undocumented feature, needs index arrays and is therefore likely overkill for what you want. Is using gpuarray.multi_take_put the best way to copy a slice of a linear array? No--'array[5:17].copy()' should be way faster, if you need a copy at all. Andreas
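For the contiguous-row case, the pointer arithmetic really is simple: row i of a C-contiguous (nrows, ncols) float32 array starts i * ncols * itemsize bytes past the base pointer. A sketch of that offset computation (plain Python with illustrative names, not PyCUDA API):

```python
ITEMSIZE = 4  # bytes per float32 element

def row_view_offset(row, ncols, itemsize=ITEMSIZE):
    """Byte offset of row `row` from the base pointer of a
    C-contiguous 2D array with `ncols` columns."""
    return row * ncols * itemsize

# For a (1024, 256) float32 array, row 3 starts 3 * 256 * 4 bytes in:
offset = row_view_offset(3, 256)
print(offset)  # 3072
```

A view would then be constructed from base pointer + offset with shape (ncols,), analogous to how GPUArray.__getitem__ builds 1D slices.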
Re: [PyCUDA] How to manually free GPUarray to avoid leak?
On Sunday 25 April 2010, gerald wrong wrote: Can I manually free GPUArray instances? In addition to Bogdan's comments (which are more likely to help you with what you're seeing): If you must free the memory by hand, you can use ary.gpudata.free() to do so. HTH, Andreas
Re: [PyCUDA] 32-bit PyCUDA on Snow Leopard
Hi Per, all, On Tuesday 20 April 2010, Per B. Sederberg wrote: Although it is not ideal and it took me many hours to figure out (as opposed to the 5 minutes it takes on Debian), I've been able to get PyCUDA and CUDA 3.0 working with 32-bit Enthought Python on Snow Leopard. Thanks very much for these instructions. I've turned them into a Wiki page for easier community maintenance: http://wiki.tiker.net/PyCuda/Installation/Mac/EnthoughtPython Hope that's ok with you. Andreas
Re: [PyCUDA] cuBLAS on gpuarray
On Monday 12 April 2010, Bryan Catanzaro wrote: The only difference here is that at exit, we're detaching from the CUDA context instead of popping it as pycuda.autoinit does. That gets rid of the error, although it's probably not the correct solution to the problem. detach() is not right if you're still holding references to stuff in the context you're detaching from. It's a bit frustrating that, now that (I think) we've finally got the whole context stack working, Nvidia prohibits using it if you want CUBLAS. I'll think of a proper workaround--right now I'm considering tainting the current context once any runtime stuff gets called in it, and then preventing any push/pops. While we're at it: Team, what's the plan for bringing CUBLAS/CUFFT wrappers into PyCUDA? Bryan? Another thing: If I seem slow to respond at the moment, it's because I'm finishing my thesis and defending my PhD; hopefully I'll be all done by Apr 28. Wish me luck! Andreas
Re: [PyCUDA] installed_path incorrect in pycuda/compiler.py
On Thursday 01 April 2010, MinRK wrote: The `installed_path' variable in `_find_pycuda_include_path' in pycuda/compiler.py appears to be incorrect, or at least not sufficiently general, because it does not find the install location on my machines (OSX 10.6/Python 2.6.1 and Ubuntu 9.10/Python 2.6.4). $ python setup.py install --prefix=$HOME/usr/local installs pycuda to ~/usr/local/lib/python2.6/site-packages/pycuda and the pycuda headers to ~/usr/local/include but the installed_path variable is defined with: installed_path = join(pathname, "..", "include", "pycuda") which points to: ~/usr/local/lib/python2.6/site-packages/include/pycuda, which is incorrect. Adding three more ".." fixes the location: installed_path = join(pathname, "..", "..", "..", "..", "include", "pycuda") and everything works once I add that patch. First of all, thanks for reporting this. I've added your patch to git. But really--this must be some sick joke, right? Currently PyCUDA looks in no fewer than *seven* places (on Linux) to figure out where its headers got installed. I hope Tarek (Ziadé) gets done cleaning up distutils rather sooner than later... or is this all the distributors' doing? Andreas
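The off-by-three is easy to see by normalizing both joins. A sketch using the --prefix layout from the report above (the /home/user prefix is illustrative):

```python
from os.path import join, normpath

# Where pycuda/compiler.py ends up after `install --prefix=$HOME/usr/local`
# (illustrative path, matching the layout described in the report).
pathname = "/home/user/usr/local/lib/python2.6/site-packages/pycuda"

# One ".." only climbs out of the package directory -- still inside
# site-packages, which is the wrong place for the headers:
wrong = normpath(join(pathname, "..", "include", "pycuda"))
print(wrong)   # .../lib/python2.6/site-packages/include/pycuda

# Four ".." climb past site-packages, python2.6 and lib, back to the
# install prefix, where include/ actually lives:
right = normpath(join(pathname, "..", "..", "..", "..", "include", "pycuda"))
print(right)   # /home/user/usr/local/include/pycuda
```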
Re: [PyCUDA] Trim down boost library?
On Tuesday 30 March 2010, reckoner wrote: Hi, I built the boost 1.38 libraries from source following the instructions on the wiki, but this generated about 5 GB of material. Do I need all of it, or can I trim this down? These are the Boost headers that PyCUDA includes: boost/cstdint.hpp, boost/thread/thread.hpp, boost/thread/tss.hpp, boost/version.hpp, boost/ptr_container/ptr_map.hpp, boost/format.hpp, boost/shared_ptr.hpp, boost/python.hpp, boost/python/stl_iterator.hpp, boost/foreach.hpp. (This implicitly tells you which Boost libraries are used, and you can probably infer what you need to keep and what to throw away.) Debian's package management does a pretty good job of modularizing Boost. You might also want to try http://www.boost.org/doc/libs/1_42_0/tools/bcp/doc/html/index.html Disclaimer: I haven't played with this. I realize Boost is a big dependency, but you will find that its individual parts are actually quite small--it just includes everything and the kitchen sink [1]. So this is mainly a distribution problem IMO. Also, I stand behind the choice of Boost as a supporting library--it's quality software. Andreas [1] http://www.boost.org/doc/libs/1_42_0/libs/libraries.htm
[PyCUDA] [ANN] 0.94rc -- please test
Hi all, PyCUDA's present release version (0.93) is starting to show its age, and so I've just rolled a release candidate for 0.94, after tying up a few loose ends--such as complete CUDA 3.0 support. Please help make sure 0.94 is solid. Go to http://pypi.python.org/pypi/pycuda/0.94rc to download the package, see if it works for you, and report back. The change log for 0.94 is here: http://documen.tician.de/pycuda/misc.html#version-0-94 but the big-ticket items in this release are: - Support for CUDA 3.0 - Sparse matrices - Complex numbers Let's make this another rockin' release! Thanks very much for your help, Andreas
Re: [PyCUDA] E LogicError: cuModuleLoadDataEx failed: invalid image - with test_driver.py
On Sunday 28 March 2010, Catalin Patulea wrote: Sorry to butt in... Reckoner, can you try again after applying the attached patch? It should address the invalid image errors. Catalin Good point! Thanks for the patch, applied to git master. Andreas
Re: [PyCUDA] Problems with Context stack autoinit
On Friday 26 March 2010, Bryan Catanzaro wrote: I've attached the trace. Lines beginning with --- are added instrumentation that I put in autoinit.py and cuda.hpp. Also, my workaround has now failed - with some versions of the code the attempt to push a bad context happened in device_allocation::free() - and deleting objects manually helped with that. But other times it fails in ~module(), and I'm not sure how to bypass that one. Do you have some short sample code that I could try on Linux? Andreas
Re: [PyCUDA] Problems with Context stack autoinit
On Thursday 25 March 2010, Bryan Catanzaro wrote: Hi All - I've been getting problems with the following error: terminate called after throwing an instance of 'cuda::error' what(): cuCtxPushCurrent failed: invalid value After poking around, I discovered that context.pop(), registered using atexit in pycuda.autoinit, is being called *before* all the destructors for various things created during my program. This is by design. Since destructors may be called on out-of-context objects, they need to make sure that 'their' context is active anyway. In your case the context looks to have been *destroyed*, not merely switched. Can you run your code with CUDA tracing and send the log? (CUDA_TRACE=1 in siteconf.py) Andreas
Re: [PyCUDA] RuntimeError: cuInit failed: no device
On Thursday 18 March 2010, jade mackay wrote: I get the following error. Can anyone point me in the right direction to resolve this? import pycuda.autoinit Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/local/lib/python2.6/dist-packages/pycuda-0.93-py2.6-linux-x86_64.egg/pycuda/autoinit.py", line 4, in <module> cuda.init() pycuda._driver.RuntimeError: cuInit failed: no device Permissions on the CUDA devices? /dev/nvidia* Andreas
Re: [PyCUDA] pyCuda with python 2.4
On Wednesday 17 March 2010, Daniel Chia wrote: Hi Andreas, I did, however I needed to define PY_SSIZE_T_MAX to get it to build. However I can't test the code as I don't have root access, so it seems I can't install pytools, because it can't patch setuptools. I might try installing a copy of Python on my user account though. With virtualenv [1] (best with option '--distribute'), you do not need root rights to install Python packages. (And virtualenv can be used even if it's not installed systemwide.) HTH, Andreas [1] http://pypi.python.org/pypi/virtualenv
Re: [PyCUDA] complementary error function erfc
On Sunday 14 March 2010, Faisal Moledina wrote: Hello PyCUDA list, I'm just starting out with PyCUDA and have not used much more than gpuarray and cumath. In fact, I have yet to program my own CUDA kernel. I'm wondering if there is a built-in erfc method for a gpuarray. If PyCUDA doesn't have one built-in, how would I implement a CUDA kernel for erfc? Prior to PyCUDA, I was using scipy.special.erfc on NumPy arrays. You could use pycuda.elementwise to shield you from any actual CUDA programming--and use [1] as a (high-quality) implementation guideline. Maybe you could even tweak that header directly (add a few __device__ specs). [1] https://svn.boost.org/trac/boost/browser/trunk/boost/math/special_functions/erf.hpp Andreas
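For orientation, the kind of rational approximation such an implementation boils down to can be sketched on the CPU first. This is the Abramowitz & Stegun 7.1.26 formula (absolute error below 1.5e-7), written in plain Python; the arithmetic in the body is what one could port into an elementwise operation, but this is not PyCUDA code itself:

```python
import math

def erfc_approx(x):
    """erfc via the Abramowitz & Stegun 7.1.26 rational approximation
    to erf, extended to negative x via erf(-x) = -erf(x)."""
    sign = 1.0 if x >= 0 else -1.0
    ax = abs(x)
    t = 1.0 / (1.0 + 0.3275911 * ax)
    poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
           + t * (-1.453152027 + t * 1.061405429))))
    erf = 1.0 - poly * math.exp(-ax * ax)
    return 1.0 - sign * erf

print(erfc_approx(1.0))  # ~0.1572992, matching scipy.special.erfc(1.0)
```

For float32 work this accuracy is roughly at the limit of the type anyway; the Boost header linked above is the better guide when double precision matters.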
Re: [PyCUDA] Issue running test cases on Windows Vista 64 bit
On Thursday 11 March 2010, Conway, Nicholas J wrote: Installed distribute instead of setuptools and that fixed the quirks during installation, but did not fix the problem when running the test. This made sure pycuda was properly installed in the site-packages directory. I updated my paths, and tried restricting the path for cl.exe to VC\bin\amd64. But I got this: Error: error invoking 'nvcc --cubin -arch sm_11 -IC:\Python26\lib\site-packages\pycuda-0.94beta-py2.6-win-amd64.egg\pycuda\..\include\pycuda kernel.cu': status -1 invoking 'nvcc --cubin -arch sm_11 -IC:\Python26\lib\site-packages\pycuda-0.94beta-py2.6-win-amd64.egg\pycuda\..\include\pycuda kernel.cu': nvcc fatal : nvcc cannot find a supported cl version. Only MSVC 8.0 and MSVC 9.0 are supported Does this here help? http://forums.nvidia.com/index.php?showtopic=73711&st=0&p=418880#entry418880 Andreas
Re: [PyCUDA] Issue running test cases on Windows Vista 64 bit
On Thursday 11 March 2010, Conway, Nicholas J wrote: Also tried the CUDA 3.0 beta; it ran, but crashes Python during test_driver.py. Did you recompile PyCUDA? Unfortunately, you need to delete the 'build' directory to be able to rebuild from scratch--distutils is unaware of dependency changes. HTH, Andreas
Re: [PyCUDA] pycuda 0.93 - Snow Leopard - error: invalid command 'bdist_egg'
On Monday 08 March 2010, Daniel Kubas wrote: Hi Andreas, yes, I built the boost library (1.39) with the recommended flag 'architecture=x86', and even tried omitting '--with-libraries=signals,thread,python'. If you haven't tried this already: Try poking at PyCUDA's _driver.so with 'otool -L' and then checking that all shared libraries it depends on are actually 32-bit. Andreas
Re: [PyCUDA] pycuda 0.93 - Snow Leopard - error: invalid command 'bdist_egg'
On Monday 08 March 2010, Daniel Kubas wrote: Hi, it works now! Glad to hear that. (Got this trick from http://mail.python.org/pipermail/python-list/2009-October/1222481.html) Bryan also put that trick on http://wiki.tiker.net/PyCuda/Installation/Mac#Notes_about_Snow_Leopard a while ago, I think. I just hope the 'UserWarning' is not critical and I can finally start using PyCUDA for my science and find exoplanets with it ;-) It isn't. If you'd like to not see it, downgrade to Pytools 9 or upgrade PyCUDA to git master. Andreas
Re: [PyCUDA] Patch for error C2143: syntax error : missing '; ' before 'type' on latest master for MSVC
On Friday 05 March 2010, Ian Ozsvald wrote: Ok, I stepped back to my last working master copy from a few days back. I downloaded the raw blobs of your new changes via: http://git.tiker.net/pycuda.git/commitdiff/c3d5f8178f71271b8689915bc2d1122e0f7b1f52 and then recompiled PyCUDA, deleted the temp kernels directory and ran my tests. The new code and includes are in site-packages. Manual git? I'm impressed. :) And yes, there was an issue with the public git tree. Fixed now. Sorry about that. My SimpleSpeedTest.py and Mandelbrot.py code compiles cleanly and runs without errors; I can see it using nvcc to compile new copies of the kernels. On first blush it looks as though your edits have fixed the problem, much obliged :-) Just played with your Mandelbrot thing--very nice! Thanks for sharing. Regarding default size: On my 260, 1000x1000 is still below a tenth of a second. :) It looks as though the main changes are removing _STLP_DECLSPEC from the .hpp and moving the __device__ declaration closer to the function definition (moving it from before the return-type declaration to after) in the hpp/cpp? Yup, that's what I did--as I said in an earlier message, apparently the position of '__device__' matters. It needs to be *before* the return type. Andreas
Re: [PyCUDA] Int detection in function kernel invocation
On Thursday 04 March 2010, Fabrizio Milo aka misto wrote: Hi, I had to add the attached patch to make a kernel like this work: void kernel(float* out, int size) { } Unless you're using prepared invocation, you have to use Numpy's sized integers/floats: http://documen.tician.de/pycuda/driver.html#pycuda.driver.Function.param_set I don't think it's advisable to change that. Andreas
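The reason sized types are required: each argument must be marshaled into the exact byte layout the kernel signature expects, and a plain Python int carries no fixed width. A sketch (the commented-out launch line is illustrative, not a runnable call):

```python
import numpy as np

# A kernel like `void kernel(float *out, int size)` expects a 4-byte int
# and a 4-byte float; numpy scalar types pin those widths down.
size = np.int32(1024)    # matches C `int`
scale = np.float32(0.5)  # matches C `float`

print(size.nbytes, scale.nbytes)  # 4 4
# func(out_gpu, size, block=(256, 1, 1))  # hypothetical launch with sized args
```

A bare Python `1024` could plausibly mean a 32- or 64-bit value, which is why PyCUDA declines to guess.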
Re: [PyCUDA] Patch for error C2143: syntax error : missing '; ' before 'type' on latest master for MSVC
On Thursday 04 March 2010, Ian Ozsvald wrote: Scratch the last - the same errors occur with the latest master as listed below. In my haste I didn't remove the already-compiled kernels (I cleared the wrong cache directory, sigh). The fix is to comment out lines 312 and 457 of pycuda-complex.hpp and recompile. I still get a ton of warning messages (the same declspec ones as below) but I can run my examples. Try now. (Believe it or not, the warning messages actually helped. It appears that the __device__ must appear before the return type to be meaningful... maybe...) Andreas
Re: [PyCUDA] More Patches, and cuda.init() elimination
On Thursday 04 March 2010, Imran Haque wrote: Fabrizio Milo aka misto wrote: does anyone have an example of a program that doesn't use pycuda.autoinit before using any of the pycuda.* modules? Yes, my library (shameless plug: https://simtk.org/home/siml) Cool--I've added that to http://wiki.tiker.net/PyCuda/ShowCase Hope you don't mind. Also: Everyone, if you have an application of PyCUDA that the world should know about, please click that link up there and hit edit! :) It manually handles CUDA initialization, because just getting some context from autoinit is not sufficient - I want to be able to select which device I get a context on, and potentially have multiple contexts on multiple devices (the library supports context-switching to simultaneously use multiple GPUs from a single thread). Same here--my PDE solver also needs to handle init itself. For example, forking can become strangely dangerous after cuInit(). Therefore my code needs to control carefully when CUDA is initialized. Andreas
Re: [PyCUDA] More Patches, and cuda.init() elimination
On Thursday 04 March 2010, Fabrizio Milo aka misto wrote: The work-around should simply be to import pycuda after the fork. Importing before would be useless, because you certainly can't call cuInit and thus can't use any cu* function... or am I missing something? Imports might happen at module scope, not just where PyCUDA is actively used. Yes, it can all be worked around. But no, the behavior is not going to change. Andreas
Re: [PyCUDA] Attempts of patches
On Wednesday 03 March 2010, Fabrizio Milo aka misto wrote: Wouldn't it benefit performance-wise? No. What about creating a Device that is just a proxy for the real _driver.Device: class Device(object): def __init__(self, flags): _driver.init() self._device = _driver.Device(flags) That would be slower *and* cumbersome. Andreas
Re: [PyCUDA] OpenGl interop example
On Wednesday 03 March 2010, Fabrizio Milo aka misto wrote: Erratum: it seems it can simply be None, but not 0: glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, w, h, 0, GL_RGBA, GL_UNSIGNED_BYTE, None) Fabrizio The wiki is the 'official' version of the examples, so you are able to fix this yourself. Please go ahead! Thanks, Andreas
Re: [PyCUDA] Attempts of patches
On Wednesday 03 March 2010, Fabrizio Milo aka misto wrote: I found the real problem on Mac for OpenGL. Patch attached; you can remove the previous setup.py logic. Done, thanks. I think the design will benefit a lot from having a Device or Context class that manages all the resources on the device, plus alignments and other device-related tricks. Why? Andreas
Re: [PyCUDA] Patch for error C2143: syntax error : missing '; ' before 'type' on latest master for MSVC
On Wednesday 03 March 2010, Ian Ozsvald wrote: This error is described here: http://andre.stechert.org/urwhatu/2006/01/error_c2143_syn.html MSVC doesn't like C99-style variable declarations in the middle of the function and wants C89 declarations at the start of the function (or so the author states). Thx, applied. Andreas
Re: [PyCUDA] Patch for error C2143: syntax error : missing '; ' before 'type' on latest master for MSVC
On Wednesday 03 March 2010, Ian Ozsvald wrote: lude/pycuda\pycuda-complex.hpp(299): error: calling a __device__ function from a __host__ function is not allowed I've added a few more fixes to git master. Can you please try it and report back? If it doesn't work, please post the entire error message. Andreas
Re: [PyCUDA] weird bug with exp
On Wednesday 03 March 2010, Dan Goodman wrote: Could it be a 32/64 bit issue? I have a 64-bit Win7 machine, but my Python, numpy, etc. are 32-bit and so I had to compile PyCUDA as 32-bit (but the NVIDIA driver is 64-bit). Probably this shouldn't work at all, but it seems to work fine for everything except exp. Not sure what could be wrong. For the record, your code works on my machine (260, 64-bit Debian Linux). Andreas
Re: [PyCUDA] Other Small patches
On Wednesday 03 March 2010, Fabrizio Milo aka misto wrote: I think it would be nice to alert the user if they try to pass a numpy array directly to the kernel. Regarding the 'yet' in the error message: This works when using In/Out/InOut, but direct passing will otherwise never be supported. Further, I'd prefer if the test ended up in the existing error path to make sure it's performance-neutral. Andreas
Re: [PyCUDA] test_gpuarray.py is failing
On Wednesday 03 March 2010, Fabrizio Milo aka misto wrote: Hi folks, test_gpuarray is failing on my Mac OS machine. Attached are a small patch that fixes a bug in one of the tests and my gzipped output from running it. Patch: applied, thanks. I can't reproduce your issue, though--this works for me. What GPU, driver, compiler? Andreas
Re: [PyCUDA] More Patches, and cuda.init() elimination
On Wednesday 03 March 2010, Fabrizio Milo aka misto wrote: In the attachment are more patches. The big one is 006, which eliminates the need to call cuda.init() explicitly. The cuInit function gets called upon _driver import, in the init_driver Python module function. a) This would break a public interface and would need to happen with deprecation, warnings, etc. b) Imports with side effects are bad. (pycuda.autoinit is the only module with side effects in PyCUDA--and the side effect is its only purpose.) I am still crunching the cuda_context stack and its implementation. I am not 100% sure of the 002 patch indeed, but that code looks smelly. I think there is a way to eliminate completely the need for autoinit; still investigating. 1: applied 2: applied, that's indeed a bug--surprisingly, that line was a function declaration. 3: already in 4: applied 5: me no like 6: see above Thanks for your work, Andreas
Re: [PyCUDA] Installing on Windows XP 64 bit/Microsoft Visual Studio 2008
On Dienstag 02 März 2010, reckoner wrote: I ran test_driver.py and it looked like it was working okay, until it caused my screen to pixelate so much that I couldn't read it. Thanks in advance. This shouldn't happen--or rather, the driver should prevent this from happening. AFAIK, GPUs have some memory protection, so CUDA code can't really overwrite display state unless something's seriously wedged. I agree with Ian that you might be having driver or thermal issues. What version of the driver are you using? Can you try and upgrade to the latest-and-greatest? Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list pyc...@host304.hostmonster.com http://host304.hostmonster.com/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] FFT for PyCuda
On Dienstag 02 März 2010, Bogdan Opanchuk wrote: If you'd like pycudafft to be part of PyCUDA itself, we can discuss how that could happen. I am not sure it is necessary. There is the *nix ideology, which favors separated functionality. And you would have to add the mako templating engine as a dependency for pycuda. But the final decision about the architecture of your package is on you, of course; it is not a problem for me to compose a corresponding patch. I have already changed the plan interface a little in order to make it use shape/dtype parameters in the same way as numpy arrays and pycuda.gpuarrays. It's a double-edged sword, IMO. The simple-small-modular approach has obvious maintainability advantages. On the other hand, an integrated package is more convenient to install and depend on. I'll leave this up to you to decide. For now, I've added a link to pycudafft to the docs: http://documen.tician.de/pycuda/array.html#fast-fourier-transforms Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list pyc...@host304.hostmonster.com http://host304.hostmonster.com/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] possible bug ?
On Montag 01 März 2010, Fabrizio Milo aka misto wrote: I get strange errors on my MacBook Pro with the 3.0 CUDA. I think there is an error: invoking get_version should be get_version().

diff --git a/pycuda/compiler.py b/pycuda/compiler.py
index 140a098..0c13cf2 100644
--- a/pycuda/compiler.py
+++ b/pycuda/compiler.py
@@ -195,7 +195,8 @@ class SourceModule(object):
             arch, code, cache_dir, include_dirs)
         from pycuda.driver import get_version
-        if get_version < (2,2,0):
+
+        if get_version() < (2,2,0):
             # FIXME This is wrong--these are per-function attributes.
             # Remove this in 0.94.
             def failsafe_extract(key, cubin):

Good catch. What kind of error are you getting? My favorite way of fixing this would be releasing 0.94 soon because it simply eliminates this code. If you need this sooner and/or on the 0.93 branch, please let me know. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list pyc...@host304.hostmonster.com http://host304.hostmonster.com/mailman/listinfo/pycuda_tiker.net
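For context, the corrected check relies on Python comparing tuples lexicographically, element by element; without the parentheses, Python 2 silently compared the function object itself to the tuple. A quick self-contained illustration (the version value is made up, standing in for what get_version() returns):

```python
# pycuda.driver.get_version() returns a version tuple such as (2, 3, 0);
# tuples compare element by element, so this is a proper version check
version = (2, 3, 0)  # stand-in for get_version()

# the corrected test from the patch: is the driver older than 2.2.0?
needs_workaround = version < (2, 2, 0)
print(needs_workaround)  # → False
```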
Re: [PyCUDA] questions on example
On Samstag 27 Februar 2010, Xueyu Zhu wrote: 11 const int i = threadIdx.x; I'd suggest you check this line here. :) Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list pyc...@host304.hostmonster.com http://host304.hostmonster.com/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Installing on Windows XP 64 bit/Microsoft Visual Studio 2008
On Montag 01 März 2010, reckoner wrote: The problem I'm having with the above-mentioned "Using Visual Studio 2008 (alternative on January 2010)" instructions is that I cannot get the examples in pycuda to work. It seems to fail at the stage of linking the nvcc-compiled code, and I'm not sure why this happens. Can you post your error message? Thanks, Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list pyc...@host304.hostmonster.com http://host304.hostmonster.com/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] possible bug ?
On Montag 01 März 2010, Fabrizio Milo aka misto wrote: pycuda._driver.LogicError: cuMemcpyHtoDAsync failed: invalid value Weird. I can't reproduce this on Linux. Anyone on Mac? Btw what is the best way to send you patches? Use a git checkout, commit your changes, then use 'git format-patch'. Andreas PS: Please make sure the list stays cc'd. Thanks. signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list pyc...@host304.hostmonster.com http://host304.hostmonster.com/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Garbage after copying to and from shared memory
On Dienstag 09 Februar 2010, Bogdan Opanchuk wrote: Hello, Yet another stupid question. Most probably, I missed something obvious, but anyway - can someone explain why I get some NaN's in the output for the program (listed below)? Surprisingly, the bug disappears if I send '1' instead of '-1' as the third parameter to the function (or remove the 'int' parameters completely and leave only the two pointers). The same kernel in pure CUDA works fine. Looks like memory corruption, but I can't figure out where it happens... This looks like a compiler bug to me. I've attached the PTX that the 3.0 compiler generates--apparently all your loops get unrolled, and then something gets confused, though I wasn't able to track down what exactly. Couple more data points: - Even in the first case (that you report as being ok), I get floating point garbage in the first 32 entries of b_gpu. - Adding an index bounds check to the second for loop also appears to fix things. Have you reported this to Nvidia? (If not, you should.) Andreas PS: Sorry for the long absence everybody. I was at a workshop, and then had lots to do on my return. Plus I have a thesis coming up, so please bear with me. 
:)

.version 1.4
.target sm_13
// compiled with /home/andreas/pool/cuda-3.0/open64/lib//be
// nvopencc 3.0 built on 2009-10-26
//---
// Compiling kernel.cpp3.i (/tmp/ccBI#.vZIo77)
//---
// Options:
//---
// Target:ptx, ISA:sm_13, Endian:little, Pointer Size:64
// -O3 (Optimization level)
// -g0 (Debug level)
// -m2 (Report advisories)
//---
[29 .file directives omitted: compiler and CUDA include paths, ending in kernel.cu]

.entry test (
    .param .u64 __cudaparm_test_in,
    .param .u64 __cudaparm_test_out,
    .param .s32 __cudaparm_test_dir,
    .param .s32 __cudaparm_test_S)
{
    .reg .u32 %r<17>;
    .reg .u64 %rd<8>;
    .reg .f32 %f<81>;
    .shared .align 4 .b8 __cuda_sMem24[8192];
    .loc 29 3 0
$LBB1_test:
    .loc 29 8 0
    mov.f32 %f1, 0f;   // 0
    mov.f32 %f2, %f1;
    mov.f32 %f3, 0f;   // 0
    mov.f32 %f4, %f3;
    mov.f32 %f5, 0f;   // 0
    mov.f32 %f6, %f5;
    mov.f32 %f7, 0f;   // 0
    mov.f32 %f8, %f7;
    mov.f32 %f9, 0f;   // 0
    mov.f32
[remainder of the attached PTX listing truncated in the archive]
Re: [PyCUDA] Incorrect shared memory size for kernel
On Sonntag 07 Februar 2010, Bogdan Opanchuk wrote:

.entry test (
    .param .u32 __cudaparm_test_out)
{
    .reg .u32 %r<3>;
    .reg .f32 %f<4>;
    .loc 15 192 0
$LBB1_test:
    .loc 15 198 0
    ld.param.u32 %r1, [__cudaparm_test_out];
    mov.f32 %f1, 0f3f80;   // 1
    st.global.f32 [%r1+0], %f1;
    .loc 15 199 0
    mov.f32 %f2, 0f4000;   // 2
    st.global.f32 [%r1+4], %f2;
    .loc 15 200 0
    exit;
$LDWend_test:
} // test

But when I'm trying to compile this kernel with PyCuda, for some reason this function has attribute shared_size_bytes==20. Can anyone please explain why the size of shared memory is non-zero? I am completely at a loss here. I think kernel parameters (and apparently a bunch of other stuff) map to shared memory. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list pyc...@host304.hostmonster.com http://host304.hostmonster.com/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Installing PyCUDA windows vista x32
On Sonntag 07 Februar 2010, Marco André Argenta wrote: C:\Python26\lib\distutils\dist.py:266: UserWarning: Unknown distribution option: 'install_requires' warnings.warn(msg) I can't say much about the error message, but the warning above makes me suspect that something relating to the setuptools-vs-distribute disaster may be at work here. See http://wiki.tiker.net/DistributeVsSetuptools, and let me (and the list) know if it helped. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list pyc...@host304.hostmonster.com http://host304.hostmonster.com/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Sharing is caring
On Sonntag 31 Januar 2010, Per B. Sederberg wrote: Perhaps, similar to the showcase on the wiki, we could add an examples page: http://wiki.tiker.net/PyCuda/ShowCase Andreas, what do you think? Good idea. See http://wiki.tiker.net/PyCuda/Examples The examples/ subdirectory now has a link to that web page and only contains the most basic examples to get people started. In addition, the examples directory now also contains a script that automatically downloads all the Wiki examples to keep them as easily runnable as before. Thanks for the suggestion! Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list pyc...@host304.hostmonster.com http://host304.hostmonster.com/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] PyCUDA academic citation
Hi Imran, On Samstag 30 Januar 2010, Imran Haque wrote: Is there a particular paper or conference presentation that you'd like cited for PyCUDA in academic papers? It's the least we can do for your efforts! http://arxiv.org/abs/0911.3456 We've also submitted this to Parallel Computing (Elsevier), but haven't heard back yet. Thanks for asking--much appreciated! Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list pyc...@host304.hostmonster.com http://host304.hostmonster.com/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Complex number support?
On Mittwoch 27 Januar 2010, Ian Ozsvald wrote: Hi Andreas/Ying Wai, I see a discussion you've had about complex number support: http://www.mail-archive.com/pycuda@tiker.net/msg00788.html I also see the 'complex' tag: http://git.tiker.net/pycuda.git/commit/296810d8c57f7620cfcc959f73f6aefbb0215133 and I've merged the code with mine. (It's a branch, actually.) The right way to get it is like so:

If you already have a complex branch (see 'git branch'):

$ git checkout complex   # switches you to your complex branch
$ git pull http://git.tiker.net/pycuda.git/ complex   # fetch+merge changes

If you don't already have a complex branch:

$ git fetch http://git.tiker.net/pycuda.git/ complex:complex

I'd advise against messing with raw commit SHAs. When I try to run demo_complex.py I get an error (below) - should the demo work without an error? Works for me, prints some small number. I just updated the complex branch to current master--pull and try again. Given the discussion you were both having I'm not clear whether the complex support is finished or not? It's not quite finished, the current hangup is to figure out how complex scalars are to be passed to kernels. My current preference is to rely on numpy's buffer interface, like so:

>>> import numpy
>>> x = numpy.complex64(123+456j)
>>> str(buffer(x))
'\x00\x00\xf6B\x00\x00\xe4C'

Let me know how things go. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
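The byte string in that snippet decodes with nothing but the stdlib struct module: a numpy.complex64 is laid out as two little-endian float32s, real part first. A small sketch confirming the layout (assuming a little-endian machine, as in the example above):

```python
import struct

# the 8 bytes shown in the thread are just two little-endian float32s:
# real part 123.0, then imaginary part 456.0
raw = struct.pack("<2f", 123.0, 456.0)
print(raw)  # → b'\x00\x00\xf6B\x00\x00\xe4C'
```

This is why option-style proposals like "fool struct into serializing 2f" work: the buffer of a complex64 and struct's "<2f" produce identical bytes.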
Re: [PyCUDA] Windows runtime error ImportError: DLL load failed: The specified module could not be found.
On Dienstag 26 Januar 2010, Ian Ozsvald wrote: All done: http://wiki.tiker.net/PyCuda/Installation/Windows#Using_Visual_Studio_2008_ .28alternative_on_January_2010.29 Cool. Thanks very much. i. ps. Andreas I still get the REPLY field configured as Andreas Klöckner li...@informa.tiker.net when I hit reply to any of your messages rather than pycuda@tiker.net That's fine--the list is configured as reply-to-poster rather than reply-to-list. As long as my address doesn't show as something involving monster, everything is working as designed. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Get PyCuda 0.93 working on Snow Leopard
Hi Krunal, first of all, welcome, and thanks for writing up your experience. I'm sorry that your install was as troublesome as it seems to have been. To make things better for everyone, I would like to ask two favors of you: 1) If there is anything that we can do by default in PyCUDA to make a default Mac install less painful, please let me know. For example, should -m32 be the default for Mac installs? 2) Next, while many people might eventually find this information on the mailing list, the Wiki (that you refer to) is designed to be the main place for up-to-date installation information. It's a wiki for a reason--to be edited! Don't be shy, and change that information where it's wrong, or augment it where additional info is needed. If I can be of any assistance, please let me know. Thanks again, Andreas On Samstag 23 Januar 2010, Krunal Patel wrote: Hi there, my name is Krunal and I'm a new user to PyCUDA. I'm helping Nicolas out on his research. My first order of business was to get PyCUDA installed on my iMac Snow Leopard. Here were my experiences: 1) stick with the version of python installed by Mac, don't upgrade to the latest. The default version is 2.6.1. The main reason for this is because I couldn't figure out how to run a newly upgraded version of python in 32-bit mode. 2) I tried 1.38-1.41 versions of Boost. At the end, the one that worked for me was 1.39. 
Here is how I got things working:

1) first follow: http://wiki.tiker.net/BoostInstallationHowto but do this:

./bootstrap.sh --prefix=$HOME/pool --libdir=$HOME/pool/lib --with-libraries=signals,thread,python
./bjam address-model=32_64 architecture=x86 variant=release link=shared install

(the architecture stuff comes from http://old.nabble.com/Boost-on-Snow-Leopard-failing-to-build-32-bit-target-td25218466.html and is the key to not getting "the architecture is invalid" when running test_driver.py)

2) then follow: http://wiki.tiker.net/PyCuda/Installation/Mac but do this:

python configure.py --boost-inc-dir=/Users/k5patel/pool/include/boost-1_39/ --boost-lib-dir=/Users/k5patel/pool/lib/ --boost-python-libname=boost_python --cuda-root=/usr/local/cuda/

siteconf.py should look like:

BOOST_INC_DIR = ['/Users/k5patel/pool/include/boost-1_39/']
BOOST_LIB_DIR = ['/Users/k5patel/pool/lib/']
BOOST_COMPILER = 'gcc42'
BOOST_PYTHON_LIBNAME = ['boost_python-xgcc42-mt']
BOOST_THREAD_LIBNAME = ['boost_thread-xgcc42-mt']
CUDA_TRACE = False
CUDA_ROOT = '/usr/local/cuda/'
CUDA_ENABLE_GL = False
CUDADRV_LIB_DIR = []
CUDADRV_LIBNAME = ['cuda']
CXXFLAGS = []
LDFLAGS = []

change setup.py so it has this:

if 'darwin' in sys.platform:
    # prevent from building ppc since cuda on OS X is not compiled for ppc
    if '-arch' not in conf['CXXFLAGS']:
        conf['CXXFLAGS'].extend(['-arch', 'i386', '-m32'])
    if '-arch' not in conf['LDFLAGS']:
        conf['LDFLAGS'].extend(['-arch', 'i386', '-m32'])

the key is -m32 as indicated by http://article.gmane.org/gmane.comp.python.cuda/1089

3) do sudo make install. ensure that there are no warnings around the line of "different architecture"

4) once built, go to the test directory and type in: python test_driver.py -- hopefully it passes. don't type: sudo python test_driver.py as indicated elsewhere

I hope this could be of help to those who couldn't get PyCUDA 0.93 working on their system. 
The main problem has to do with SL's 64-bitness and CUDA's lack of it (and hence PyCUDA) krunal signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Get PyCuda 0.93 working on Snow Leopard
On Samstag 23 Januar 2010, Krunal Patel wrote: Yes I think the default should be -m32. Done in git. Thanks for your advice. I have done the needful on the wiki pages. Thank you very much for your work! Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Get PyCuda 0.93 working on Snow Leopard
On Samstag 23 Januar 2010, Andreas Klöckner wrote: I have done the needful on the wiki pages. Thank you very much for your work! I've hacked the Wiki a little bit--can you please take a quick look? Thanks! Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] How do I make a multichannel 1D texture?
On Dienstag 12 Januar 2010, Dan Piponi wrote: I'm having trouble figuring out how to make a 4 channel 1D texture for use with tex1D. I can easily make a 2D 4 channel texture, from an MxNx4 numpy 3D array, using make_multichannel_2d_array and bind_array_to_texref. The third axis of the array has size 4 and becomes the 4 channels in a float4 texture. Works fine with tex2D. But I can't find a sequence of calls that takes an Nx4 numpy 2D array, with size 4 in the second axis, and turns it into a 4 channel 1D texture suitable for use with tex1D. I can't find something corresponding to make_multichannel_1d_array. So starting with an Nx4 2D numpy array, what sequence of calls do I need to make to get 1D texture with float4 elements? Do you really want tex1D and not tex1Dfetch? If the latter, http://is.gd/6kv0P Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Compile problems for pycuda on Karmic 64
On Freitag 15 Januar 2010, John Zbesko wrote: KeyError: '_driver' http://is.gd/6kvv9 HTH, Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] PyCUDA on Snow Leopard
Hi Bryan, all, On Sonntag 10 Januar 2010, Bryan Catanzaro wrote: I also had this problem. Python on Snow Leopard defaults to a 64-bit executable. You can check this by typing: import sys print sys.maxint If it's ~2 billion, you're running Python in 32-bit mode. If it's a huge number, you've got a 64-bit Python, which conflicts with your 32-bit CUDA/PyCUDA drivers and can cause your error message. You can force Python on Snow Leopard to run in 32-bit by setting a special environment variable: VERSIONER_PYTHON_PREFER_32_BIT=yes Or you can make this change permanent and global by writing this to the defaults: defaults write com.apple.versioner.python Prefer-32-Bit 1 thanks for the information. I've added this to the Wiki. In the interest of knowledge retention, I'd like to request that if you run into unforeseen trouble, please write about it on the wiki. In particular, please include the error message so the issue becomes googlable. You do not need a user name to edit the Wiki. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
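Spelled out as a self-contained check (sys.maxint became sys.maxsize on Python 3, but the logic is the same):

```python
import sys

# sys.maxsize is ~2**31 on a 32-bit interpreter and ~2**63 on a 64-bit one,
# so comparing against 2**32 cleanly separates the two builds
if sys.maxsize > 2**32:
    print("64-bit Python")
else:
    print("32-bit Python")
```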
Re: [PyCUDA] Windows runtime error ImportError: DLL load failed: The specified module could not be found.
On Freitag 08 Januar 2010, Ian Ozsvald wrote: Can anyone suggest any reasons why boost is looking for python25.dll rather than the 2.6 equivalent? Check boost's project-config.jam. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] FFT code gets error only when Python exit
On Donnerstag 07 Januar 2010, Ying Wai (Daniel) Fan wrote: I changed cuComplex_mod.h a bit to force the use of complex.h. Looks like the route of using GNU C library does not work. Complex arithmetic operations are regarded as host functions by CUDA and host functions cannot be called from device. I got the following errors: So I guess shipping the hacked STLport header is our only option then, right? (Which is not bad--it's decent code.) Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] RuntimeError: could not find path to PyCUDA's C header files on Mac OS X Leopard, CUDA 2.3, pyCuda 0.94beta, python 2.5
On Donnerstag 07 Januar 2010, Ian Ozsvald wrote: I'm having a devil of a time getting pyCUDA to work on my MacBook and I can't get past this error: host47:examples ian$ python demo.py Traceback (most recent call last): File demo.py, line 22, in module ) File /Library/Python/2.5/site-packages/pycuda/compiler.py, line 203, in __init__ arch, code, cache_dir, include_dirs) File /Library/Python/2.5/site-packages/pycuda/compiler.py, line 188, in compile include_dirs = include_dirs + [_find_pycuda_include_path()] File /Library/Python/2.5/site-packages/pycuda/compiler.py, line 149, in _find_pycuda_include_path raise RuntimeError(could not find path to PyCUDA's C header files) RuntimeError: could not find path to PyCUDA's C header files Below I have version info and the full build process leading up to the error...Any pointers would be *hugely* appreciated. If someone could explain what's happening here and what pyCUDA is looking for, that might point me in the right direction. Am I missing something silly? I spent yesterday banging my head against the 'make' process until I found a spurious '=' in the ./configuration.py arguments (entirely my fault), maybe I've missed something silly here too? See if you can find pycuda-helpers.hpp under /Library/Python/2.5/site- packages/ somewhere, we may need to adapt _find_pycuda_include_path(). It's quite interesting to see where all this stuff can end up... Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
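What _find_pycuda_include_path() has to do is locate the bundled headers relative to wherever the installed package landed on disk. A rough, hypothetical equivalent on a modern Python (the helper name and layout are illustrative, not PyCUDA's actual code):

```python
import importlib.util
import os

def find_package_dir(pkg_name):
    # resolve an importable package to its on-disk directory; bundled
    # headers (like pycuda-helpers.hpp) would live somewhere beneath it
    spec = importlib.util.find_spec(pkg_name)
    if spec is None or spec.origin is None:
        raise RuntimeError("could not find path to %s" % pkg_name)
    return os.path.dirname(spec.origin)

# demonstrated on a stdlib package, since pycuda may not be installed
print(find_package_dir("email"))
```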
Re: [PyCUDA] 0.94 woes
On Donnerstag 07 Januar 2010, Nicholas S-A wrote: Well, if somebody writes code that uses arch=sm_13 for some reason, then somebody who doesn't have a 200 series card tries to use it, then the error message which comes up is pretty cryptic. Just trying to make it more understandable. It is an error that they should not have made in the first place -- so perhaps it should be a LogicError? Downgraded to a warning, merged. Thanks for the patch, Andreas PS: Please do keep the list cc'd. signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] My modification to PyCUDA
On Mittwoch 30 Dezember 2009, Ying Wai (Daniel) Fan wrote: Andreas, I have made some changes to make arithmetic operations work with complex GPUArray objects. The patch is attached. I don't quite agree with your treatment of the complex scalars. Couple possibilities: 1) We ship a fixed version of the struct module that can serialize complex numbers. 2) struct can be fooled into serializing '2f' to get the real and imag components separately. 3) We serialize numpy complex byte-for-byte via buffer(numpy.complex(3j)), circumventing the need for struct to understand complex. What do you think? Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
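Option 2 needs no changes to struct at all; a minimal sketch (the format character and byte order are assumptions that would have to match what the kernel expects):

```python
import struct

def pack_complex64(z):
    # serialize a Python complex as two native float32s (real, imag),
    # matching the in-memory layout of a C/CUDA float2 / cuComplex
    return struct.pack("=2f", z.real, z.imag)

payload = pack_complex64(3 + 4j)
print(len(payload))  # → 8
```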
Re: [PyCUDA] FFT code gets error only when Python exit
On Mittwoch 06 Januar 2010, Ying Wai (Daniel) Fan wrote: Now in your situation there's a failure when reactivating the context to detach from it, probably because the runtime is meddling about. The only reason why cuCtxPushCurrent would throw an invalid value is, IMO, if that context is already somewhere in the context stack. So it's likely that the runtime reactivated the context. In current git, a failure of the PushCurrent call should not cause a failure any more--it will print a warning instead. I believe different contexts can't share variables that are on the GPU. True. I can use GPUArray objects as arguments to my fft functions, and these objects still exist after fft. So I think fft is using the same context as pycuda. Right. I take it that if runtime functions execute when a driver context exists, they'll reuse that context. I made the change indicated in the attached diff file, such that context.synchronize() and context.detach() would print out the context stack size, and detach() would also print out whether the current context is active. With this I verify that the stack size is 1 before and after running the fft code, and the context does not change. I should clarify here. CUDA operates one context stack, and PyCUDA has another one. CUDA's isn't sufficient because it will not let the same context be activated twice. PyCUDA on the other hand needs exactly this functionality, to ensure that cleanup can happen whenever it is needed. Hence PyCUDA maintains its own context stack, and keeps CUDA's stack at most one deep. You are looking at the PyCUDA stack. My guess is that CUFFT makes some change to the current context, such that once this context is popped, it is automatically destroyed. I disagree. I think CUDA somehow lets us pop the context, does not report an error, but also does not actually pop the context (since the runtime is still talking to it). Then, when PyCUDA tries to push the context back onto CUDA's stack to detach from it, that fails. 
I've filed a bug report with Nvidia, we'll see what they say. If my guess is correct, then calling context.detach() would destroy the context, since its usage count drops to 0, and it could circumvent the warning message when the context destructor is called. Pop only removes the context from the context stack (and hence deactivates it), but retains a reference to it. It should not cause anything to be destroyed. I don't want people using my package to see a warning message when Python exits, so I am not using autoinit in my package, but create a context explicitly. Sorry for this mess--I hope we can sort it out somehow. The following is kind of unrelated. I have done some experiments with contexts. I think context.pop() always pops the top context from the stack, disregarding whether that context is really at the top of the stack. E.g. I create two contexts c1 and c2 and then I can do c1.pop() twice without getting an error. This points to a doc and behavior bug in PyCUDA. Context.pop() should have been static. It effectively was, but not quite. Fixed in git. cuComplex.h exists since CUDA 2.1 and it hasn't changed in subsequent versions. cuComplex.h is used by cufft.h and cublas.h. I can't find any documentation for it. A quick search on google shows that JCuda seems to be using it. http://www.jcuda.org/jcuda/jcublas/doc/jcuda/jcublas/cuComplex.html Hmm. A quick poke comes up with an error message:

-----8<-----
kernel.cu(7): error: no operator "*" matches these operands
            operand types are: cuComplex * cuComplex
-----8<-----

Code attached. This might not be what we're looking for. Maybe we can simply use complex.h from the GNU C library. A quick search on my Ubuntu machine locates the following files: /usr/include/complex.h, which includes /usr/include/c++/4.4/complex.h, which then includes /usr/include/c++/4.4/ccomplex, which in turn includes /usr/include/c++/4.4/complex, which includes overloading of operators for complex numbers. Two words: Windows portability. 
:)

Aside from that, this is unlikely to work, as the system-wide complex header depends on I/O being available and all kinds of other system-dependent funniness. Good luck on your PhD. Likewise! Andreas

import pycuda.driver as drv
import pycuda.tools
import pycuda.autoinit
import numpy
import numpy.linalg as la
from pycuda.compiler import SourceModule

mod = SourceModule("""
#include <cuComplex.h>

__global__ void multiply_them(cuComplex *dest, cuComplex *a, cuComplex *b)
{
  const int i = threadIdx.x;
  dest[i] = a[i] * b[i];
}
""")

multiply_them = mod.get_function("multiply_them")

a = numpy.random.randn(400).astype(numpy.complex64)
b = numpy.random.randn(400).astype(numpy.complex64)
dest = numpy.zeros_like(a)

multiply_them(
        drv.Out(dest), drv.In(a), drv.In(b),
        block=(400,1,1))

print dest-a*b

signature.asc Description: This
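The two-stack arrangement described in this thread can be modeled in a few lines of plain Python. This is a toy sketch of the bookkeeping only, not PyCUDA's implementation: PyCUDA's own stack remembers every activation, while the CUDA-side stack is kept at most one deep, which is what lets the same context be "active" twice on the PyCUDA side.

```python
class TwoLevelContextStack:
    """Toy model: a full PyCUDA-side stack mirrors only its top onto CUDA's stack."""

    def __init__(self):
        self.pycuda_stack = []  # full history of activations
        self.cuda_stack = []    # kept at most one deep

    def push(self, ctx):
        if self.cuda_stack:
            self.cuda_stack.pop()        # deactivate whatever is current
        self.pycuda_stack.append(ctx)
        self.cuda_stack.append(ctx)      # activate the new context

    def pop(self):
        ctx = self.pycuda_stack.pop()
        self.cuda_stack.pop()
        if self.pycuda_stack:
            self.cuda_stack.append(self.pycuda_stack[-1])  # reactivate previous
        return ctx

s = TwoLevelContextStack()
s.push("c1")
s.push("c1")              # same context active twice: fine on the PyCUDA side
print(len(s.cuda_stack))  # → 1
```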
Re: [PyCUDA] pycuda.driver.Context.synchronize() delay time is a function of the count and kind of sram accesses?
Couple points:
* bytewise smem write may be slow?
* sync before and after timed operation, otherwise you time who knows what
* or, even better, use events.

HTH, Andreas

On Samstag 02 Januar 2010, Hampton G. Miller wrote: I have noticed something which seems odd and which I hope you will look at and then tell me if it is something unique to PyCUDA or else is something which should be brought to the attention of Nvidia. (Or, that I am just a simpleton!) Looking at my test results, below, and referring to my attached Python program with comments, it seems to me that the amount of time taken by pycuda.driver.Context.synchronize() is strongly a function of the count and type of sram accesses. This seems odd to me. Do you agree? For example, it takes over 13 seconds to sync after doing nothing more than writing zeros to (almost) all of the sram bytes for a 512x512 grid! Regards, Hampton

PyCUDA 0.93 running on Mint 7 Linux
Using device GeForce 9800 GT

     gridDim_x gridDim_y blockDim_x blockDim_y blockDim_z  A  B  C  D  E  F  G  H
 0:      1        1          1         1          1   0.001050 0.000120 0.000442 0.72 0.000257 0.69 0.68 0.69
 1:      1        1        512         1          1   0.000828 0.72 0.000441 0.73 0.000257 0.70 0.69 0.69
 2:      1      100        512         1          1   0.007309 0.000167 0.003026 0.000106 0.001546 0.72 0.72 0.72
 3:    100        1        512         1          1   0.005985 0.77 0.003016 0.71 0.001543 0.73 0.72 0.71
 4:    100      100        512         1          1   0.526857 0.000303 0.263423 0.000302 0.131828 0.000304 0.000311 0.000210
 5:      1      256        512         1          1   0.014104 0.000167 0.007073 0.75 0.003572 0.76 0.76 0.73
 6:    256        1        512         1          1   0.014087 0.81 0.007069 0.77 0.003570 0.93 0.77 0.73
 7:    256      256        512         1          1   3.447902 0.001038 1.724391 0.001039 0.862664 0.001041 0.001586 0.000957
 8:      1      512        512         1          1   0.027301 0.61 0.013667 0.46 0.006857 0.45 0.50 0.44
 9:    512        1        512         1          1   0.027314 0.000125 0.013669 0.47 0.006855 0.45 0.49 0.44
10:    512      512        512         1          1   13.789054 0.003796 6.896283 0.003800 3.449923 0.003794 0.006229 0.003898

31.298553 secs total

#!/usr/bin/env python
# nvidia_example.py -

import sys
import os
import time
import numpy
import pycuda.autoinit
import pycuda.driver as cuda
from pycuda.compiler import SourceModule

gridDim_x = 1
gridDim_y = 1
blockDim_x = 1
blockDim_y = 1
blockDim_z = 1

gridBlockList = [
    (1, 1, 1, 1, 1),
    (1, 1, 512, 1, 1),
    (1, 100, 512, 1, 1),
    (100, 1, 512, 1, 1),
    (100, 100, 512, 1, 1),
    (1, 256, 512, 1, 1),
    (256, 1, 512, 1, 1),
    (256, 256, 512, 1, 1),
    (1, 512, 512, 1, 1),
    (512, 1, 512, 1, 1),
    (512, 512, 512, 1, 1) ]

# === ===
cuda.init()
device = pycuda.tools.get_default_device()
print "Using device", device.name()

dev_dataRecords = cuda.mem_alloc( 1024 * 15 )

# --- ---
krnl = SourceModule("""
__global__ void worker_0 ( char * src )
{
    __shared__ char dst[ (1024 * 15) ];
    int i;
    if ( threadIdx.x == 0 ) {               // Case A:
        for( i=0; i<sizeof(dst); ++i )      // Count = sizeof(dst)
            dst[ i ] = 0;                   // Type = indexed by i
        dst[ 0 ] = dst[ 1 ];                // (Gag "set but never used" warning message from compiler)
    };
}

__global__ void worker_1 ( char * src )
{
    __shared__ char dst[ (1024 * 15) ];
    int i;
    if ( threadIdx.x == 0 ) {               // Case B:
        for( i=0; i<sizeof(dst); ++i )      // Count = sizeof(dst)
            dst[ 0 ] = 0;                   // Type = always the same element, 0
        dst[ 0 ] = dst[ 1 ];
    };
}

__global__ void worker_2 ( char * src )
{
    __shared__ char dst[ (1024 * 15) ];
    int i;
    if ( threadIdx.x == 0 ) {               // Case C:
        for( i=0; i<(sizeof(dst)/2); ++i )  // Count = sizeof(dst)/2
            dst[ i ] = 0;                   // Type = indexed by i
        dst[ 0 ] = dst[ 1 ];
    };
}

__global__ void worker_3 ( char *
Re: [PyCUDA] PyCuda installation instructions for Gentoo Linux
On Dienstag 22 Dezember 2009, Justin Riley wrote: Hi All, I've added a page for installing PyCuda on Gentoo Linux to the PyCuda wiki. Sweet, thanks! Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] CompileError
On Montag 21 Dezember 2009, Ewan Maxwell wrote: test_driver.py works (all 16 tests), and the example code I mentioned works too; however, other tests still fail for the same reason (nvcc fatal~) That's strange. If all of them have the same path, why would some fail and some succeed? Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] PyCUDA Digest, Vol 18, Issue 12
On Montag 21 Dezember 2009, oli...@olivernowak.com wrote: like i said. a week. Can you comment on what made this difficult? Can we do anything to make this easier, e.g. by catching common errors and providing more helpful messages? Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Passing struct into kernel
On Freitag 18 Dezember 2009, Dan Piponi wrote:

go.prepare("s", ...)  # tell PyCuda we're passing in a string buffer
go.prepared_call(grid, struct.pack("i", 12345))  # pack integer into string buffer

struct_typestr = "i"
sz = struct.calcsize(struct_typestr)
go.prepare("%ds" % sz, ...)  # tell PyCuda we're passing in a string buffer
go.prepared_call(grid, struct.pack(struct_typestr, 12345))

codepy.cgen.GenerableStruct can help you generate type strings and C source code from a single source and will also help you make packed instances. HTH, Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
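The packing step above can be checked without a GPU. A minimal sketch using only the stdlib `struct` module (the `go.prepare`/`go.prepared_call` lines in the comment are from the thread and assume an already-prepared kernel):

```python
import struct

# Pack values the way a C struct { int n; float x; } is laid out
# (native byte order and alignment).
fmt = "if"
buf = struct.pack(fmt, 12345, 2.5)
assert len(buf) == struct.calcsize(fmt)

# go.prepare("%ds" % len(buf), ...) then declares a string buffer of that
# size, and go.prepared_call(grid, buf) hands it to the kernel.
print(struct.unpack(fmt, buf))  # -> (12345, 2.5)
```

2.5 round-trips exactly because it is representable in a 32-bit float; values like 0.1 would come back slightly perturbed.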
Re: [PyCUDA] Passing struct into kernel
On Freitag 18 Dezember 2009, Dan Piponi wrote: I have no problem making a struct in global memory and passing in a pointer to it. But arguments to kernels get stored in faster memory than global memory don't they? Right. You can pack a struct into a string using Python's struct module, and since strings support the buffer protocol, they can be passed to PyCUDA routines. HTH, Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] cuModuleGetFunction failed: not found
On Montag 14 Dezember 2009, Robert Cloud wrote: Traceback (most recent call last): File "<stdin>", line 1, in <module> pycuda._driver.LogicError: cuModuleGetFunction failed: not found The problem is that nvcc compiles code as C++ by default, which means it uses name mangling [1]. If you don't want to use PyCUDA's just-in-time compilation facilities [2], then just add an extern "C" to your declarations. Andreas [1] http://en.wikipedia.org/wiki/Name_mangling [2] http://documen.tician.de/pycuda/driver.html#module-pycuda.compiler signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
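PyCUDA's pycuda.compiler.SourceModule wraps your source in extern "C" for you (unless you pass no_extern_c=True); if you compile a cubin with nvcc by hand and load it with module_from_file, the declaration has to carry it itself. A sketch (`doublify` is a made-up example kernel):

```cuda
// Without extern "C", nvcc (compiling as C++) mangles the symbol name,
// and cuModuleGetFunction("doublify") fails with "not found".
extern "C" __global__ void doublify(float *a)
{
    int idx = threadIdx.x + threadIdx.y * blockDim.x;
    a[idx] *= 2.0f;
}
```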
Re: [PyCUDA] Runtime problem: cuDeviceGetAttribute failed: not found
On Dienstag 08 Dezember 2009, you wrote: It's the first time I've installed the drivers, so I don't have multiple versions. Anyway how can I know the versions of the headers used in the compilation? pycuda.driver.get_version() pycuda.driver.get_driver_version() Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Question about gpuarray
On Freitag 04 Dezember 2009, Bryan Catanzaro wrote: Thanks for the explanation. In that case, do you have objections to removing the assertion that if a GPUArray is created and given a preexisting buffer, that the new array must be a view of another array? In my situation, I don't think this assertion is true: I would like to transfer ownership of a gpu buffer (created outside of PyCUDA by some host code) to a particular GPUArray. This means I instantiate a GPUArray with gpudata=the pointer created by the host code, but base should still be None, since this new GPUArray is not a view of any other array, and so this GPUArray should have sole ownership of the buffer being given at initialization. If I understand you correctly, then whatever you assign to .gpudata already establishes the lifetime dependency, right? In that case, yes, the assert should go away. Andreas PS: Please keep the PyCUDA list cc'ed, unless there's good reason not to. signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] bad argument to internal function
On Mittwoch 25 November 2009, Ken Seehart wrote: Something like this came up for someone else in May: http://www.mail-archive.com/pycuda@tiker.net/msg00361.html *SystemError: ../Objects/longobject.c:336: bad argument to internal function* buf = struct.pack(format, *arg_data) I fixed it by hacking *_add_functionality* in *driver.py*. As before when Bryan reported the bug, I can't seem to reproduce it. (Likewise, I can't reproduce the issue described in the corresponding numpy ticket [1].) [1] http://projects.scipy.org/numpy/ticket/1110 What architecture are you running on? What version of Python? What version of numpy? (Can't reproduce on x86_64+2.5.4+1.3.0 and x86_64+2.6.1+{1.2.1 and 1.3.0}.) In principle, I'm not opposed to merging this fix, but I'd like some more information first. Bryan: are you still encountering this? Any further information? Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] CUDA 3.0 64-bit host code
Hey Bryan, On Montag 23 November 2009, Bryan Catanzaro wrote: I built 64-bit versions of Boost and PyCUDA on Mac OS X Snow Leopard, as well as the 64-bit Python interpreter supplied by Apple, as well as the CUDA 3.0 beta. Everything built fine, but when I ran pycuda.autoinit, I got an interesting CUDA error, which PyCUDA reported as pointer is 64-bit. I'm wondering - is it impossible to use a 64-bit host program with a 32-bit GPU program under CUDA 3.0? First, I'm not sure I fully understand what's going on. You can indeed compile GPU code to match a 32-bit ABI on a 64-bit machine (nvcc --machine 32 ...). Is that what you're doing? If so, why? (Normally, nvcc will default to your host's ABI. By and large, this changes struct alignment rules and pointer widths.) If you're not doing anything special to get 32-bit GPU code, then your GPU code should end up matching your host ABI. Or maybe nvcc draws the wrong conclusions or is a fat binary or something and we need to actually specify the --machine flag. I also remember wondering what the error message referred to when I added it. I'm totally not sure. Which routine throws it? Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Install issue with Ubuntu 9.10
On Sonntag 22 November 2009, you wrote: Now this error: python test_driver.py Traceback (most recent call last): File "test_driver.py", line 481, in <module> from py.test.cmdline import main ImportError: No module named cmdline Again, py.test should have been installed automatically for you. easy_install -U py should do that for you. (Also see http://is.gd/4VBoW) Andreas PS: Please keep your replies on-list. signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Build Problems on SuSE 11.0, x86_64 - cannot find -llibboost_python-mt
On Samstag 21 November 2009, Wolfgang Rosner wrote: Hello, Andreas Klöckner, I can't get pycuda to build on my box. I tried at least 10 times, try to reproduce details now. In the archive, there was a similar thread with a SuSE 11.1 64 bit box, posted by Jonghwan Rhee http://article.gmane.org/gmane.comp.python.cuda/955 but that did not really provide a solution. cannot find -llibboost_python-mt although the file is in place First, all that configure.py does is edit siteconf.py--no need to rerun it once siteconf.py is in place. Second, -lsomething implicitly looks for libsomething.so. No need to specify the lib prefix. HTH, Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Build Problems on SuSE 11.0, x86_64 - cannot find -llibboost_python-mt
On Samstag 21 November 2009, Wolfgang Rosner wrote: OK, for me it works now, but people might be even happier (and earlier) if the pytools issue had been mentioned in the setup wiki. Pytools should be installed automatically along with 'python setup.py install'. If it didn't: do you have any idea why? If you like and give me a wiki account, I'd be glad to share my experience. No account required. (though you can create one yourself) Please do share. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Build Problems on SuSE 11.0, now Pytools not found
On Samstag 21 November 2009, Wolfgang Rosner wrote: Pytools should be installed automatically along with 'python setup.py install'. If it didn't: do you have any idea why? not sure. could it be that I ran make install instead of python setup.py install ? make install invokes python setup.py install. That shouldn't have been it. (sorry, I'm just getting used to Python, preferred perl in earlier live) first I thought it was due to the different python path structures. standard on SuSE is /usr/lib64/python2.5/site-packages/ but the egg-laying machine seems to put stuff to /usr/local/lib64/python2.5/site-packages instead (maybe I could have reconfigured this, anyway) Weird. Curious about the reasoning behind this. However, gl_interop.py did not run until I did export PYTHONPATH=/usr/local/lib64/python2.5/site-packages/ (was PYTHONPATH= before) maybe this is since there is still the old python-opengl-2.0.1.09-224.1 /usr/lib64/python2.5/site-packages/OpenGL/GL/ARB/ ...with-no-vertex-buffer-in-there in the way which is caught before. But to figure it out I'm definitely lacking sufficient python experience. There is an easy trick to find out what file path actually gets imported: import pytools pytools.__file__ '/home/andreas/research/software/pytools/pytools/__init__.py' hm, might give it a try. I think best I could offer was be to prepare an own SuSE page with my experience. Sure--just add a subpage under http://wiki.tiker.net/PyCuda/Installation/Linux (like the one for Ubuntu) It all comes down to different ways and places where stuff is stored. But I think my approach is not the best one, in the view back it were better to configure new stuff so that it meets SuSE structure. Maybe. Well, but this might break other dependencies? Smells like big 'Baustelle'... So if your expectation of quality on your wiki is not too high, I'll post my experience there. That's the whole point of a Wiki: information of questionable quality that people improve as they use it. 
It's a knowledge retention tool. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
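Andreas's `__file__` trick works for any module. A runnable variant with a stdlib module (substitute `pytools` once it imports at all):

```python
import json  # stand-in for pytools; any importable module works

# Shows exactly which file on disk satisfied the import -- the quickest
# way to spot a stale copy earlier on sys.path shadowing the one you want.
print(json.__file__)
```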
Re: [PyCUDA] autotuning
On Freitag 20 November 2009, James Bergstra wrote: Now that we're taking more advantage of PyCUDA's and CodePy's ability to generate really precise special-case code... I'm finding that we wind up with a lot of ambiguities about *which* generator should handle a given special case. The right choice for a particular input structure is platform-dependent--a function of cache sizes, access latencies, transfer bandwidth, register counts, number of processors, etc, etc. The wrong choice can carry a big performance penalty. FFTW and ATLAS get around this by self-tuning algorithms, which I don't understand in detail, but which generally work by trying a lot of generators on a lot of special cases, and then using the database of timings to make good choices quickly at runtime. What has worked well for me is to try a big bunch of kernels right before their intended use and cache which one was fast for this special case only. The main delay is the compilation of all these kernels; the trial runs are all very quick, thanks to the GPU. There's just enough caching at each level to make this efficient. It seems like this automatic-tuning is even more important for GPU implementations than for CPU ones. That certainly echoes one claim from the PyCUDA article. :) Are there libraries to help with this? First of all, since it's a thorny (and unsolved) problem, PyCUDA doesn't try to get involved in it. Support it--yes, involved--no. That said, I'm not aware of libraries that make autotuning significantly easier. Nicolas mentioned that he's eyeing some machine learning techniques like the ones in Milepost gcc. Nicolas, care to comment? Aside from that, Cray's grouped, attributed orthogonal search [1] sounds useful. [1] http://iwapt.org/2009/slides/Adrian_Tate.pdf Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
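Andreas's try-and-cache strategy can be sketched in plain Python. This is a hypothetical helper, not part of PyCUDA; on the GPU the candidates would be compiled kernels and the timing would use CUDA events rather than the host clock:

```python
import time

_best = {}  # winner cache: a key describing the problem shape -> fastest candidate


def autotune(key, candidates, *args):
    # Time every candidate once for this key and keep the fastest; later
    # calls with the same key return the cached winner without re-timing.
    if key not in _best:
        timings = []
        for f in candidates:
            t0 = time.perf_counter()
            f(*args)
            timings.append((time.perf_counter() - t0, f))
        _best[key] = min(timings, key=lambda p: p[0])[1]
    return _best[key]
```

As the message notes, the dominant cost is compiling all the candidate kernels up front; PyCUDA's on-disk compiler cache amortizes that across runs, so only the first invocation per machine pays it.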
Re: [PyCUDA] trouble with pycuda-0.93.1rc2 - test_driver.py, etc.
On Freitag 20 November 2009, Janet Jacobsen wrote: Hi, Andreas. I ran

ldd /usr/common/usg/python/2.6.4/lib/python2.6/site-packages/pycuda-0.93.1rc2-py2.6-linux-x86_64.egg/pycuda/_driver.so

and got

libboost_python.so.1.40.0 => /usr/common/usg/boost/1_40_0/pool/lib/libboost_python.so.1.40.0 (0x2b259cac6000)
libboost_thread.so.1.40.0 => /usr/common/usg/boost/1_40_0/pool/lib/libboost_thread.so.1.40.0 (0x2b259cd15000)
libcuda.so.1 => /usr/lib64/libcuda.so.1 (0x2b259cf43000)
libstdc++.so.6 => /usr/common/usg/gcc/4.4.2/lib64/libstdc++.so.6 (0x2b259d3df000)
libm.so.6 => /lib64/libm.so.6 (0x2b259d6eb000)
libgcc_s.so.1 => /usr/common/usg/gcc/4.4.2/lib64/libgcc_s.so.1 (0x2b259d96e000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x2b259db85000)
libc.so.6 => /lib64/libc.so.6 (0x2b259dda)
libutil.so.1 => /lib64/libutil.so.1 (0x2b259e0f6000)
libdl.so.2 => /lib64/libdl.so.2 (0x2b259e2fa000)
librt.so.1 => /lib64/librt.so.1 (0x2b259e4fe000)
libz.so.1 => /usr/lib64/libz.so.1 (0x2b259e707000)
/lib64/ld-linux-x86-64.so.2 (0x00316ec0)

Does this help? Hmm, yes and no. I'm starting to believe that Boost built itself thinking that you have an UCS4 Python, while your actual build is UCS2. To confirm that latter point, run the equivalent of

objdump -T /users/kloeckner/mach/x86_64/pool/lib/libpython2.6.so.1.0 | grep UCS

That should tell you what the UCS'iness of your custom Python is. Then run

objdump -T /usr/common/usg/boost/1_40_0/pool/lib/libboost_python.so.1.40.0 | grep UCS

to establish Boost's UCS'iness. As I said, I'm suspecting that the two might disagree. You might want to try that against your system Python 2.4, too. Maybe Boost cleverly found that one and picked it up. In any case, the switch to look for is Py_UNICODE_SIZE in pyconfig.h. P.S. Sorry if this should be off-list, but email sent to li...@monster.tiker.net is returned to me. On-list is the right place IMO--this creates a searchable record of problems and solutions. Thanks for asking though. (Btw: where'd you get that email address?
While tiker.net is my domain, I don't recall using or having created that address.) HTH, Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
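The interpreter's UCS'iness can also be read off from Python itself, without objdump. A sketch: `sys.maxunicode` is 0xFFFF on narrow (UCS2) builds and 0x10FFFF on wide (UCS4) builds; Python 3.3+ is always wide:

```python
import sys

# Narrow (UCS2) builds cap code points at the Basic Multilingual Plane.
build = "UCS4/wide" if sys.maxunicode > 0xFFFF else "UCS2/narrow"
print(build)
```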
Re: [PyCUDA] Install PyCUDA on opensuse 11.1(x86_64)
On Donnerstag 12 November 2009, Jonghwan Rhee wrote: Hi there, I have tried to install pycuda on opensuse 11.1. However, when I did build, the following error occurred. What version of Boost do you have, how and where was it installed? Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Install PyCUDA on opensuse 11.1(x86_64)
Check out virtualenv. Andreas PS: *Please* try to keep the replies on-list. Thanks. On Freitag 13 November 2009, you wrote: Hi Andreas, Thanks for your help. It worked well. But another problem occurred when I do make install as follows.

ctags -R src || true
/usr/bin/python setup.py install
Extracting in /tmp/tmpKkbJcX
Now working in /tmp/tmpKkbJcX/distribute-0.6.4
Building a Distribute egg in /home/jrhee/pycuda
/home/jrhee/pycuda/setuptools-0.6c9-py2.6.egg-info already exists
*** Cannot find Boost headers. Checked locations:
  /home/jrhee/pool/include/boost-1_36/boost/python.hpp
*** Cannot find Boost Python library. Checked locations:
  /home/jrhee/pool/lib/libboost_python.so
  /home/jrhee/pool/lib/libboost_python.dylib
  /home/jrhee/pool/lib/libboost_python.lib
  /home/jrhee/pool/lib/boost_python.so
  /home/jrhee/pool/lib/boost_python.dylib
  /home/jrhee/pool/lib/boost_python.lib
  /home/jrhee/pool/lib/libboost_thread.so
  /home/jrhee/pool/lib/libboost_thread.dylib
  /home/jrhee/pool/lib/libboost_thread.lib
  /usr/local/cuda/bin/nvcc
  /usr/local/cuda/include/cuda.h
*** Cannot find CUDA driver library. Checked locations:
  /usr/local/cuda/lib/libcuda.so
  /usr/local/cuda/lib/libcuda.dylib
  /usr/local/cuda/lib/libcuda.lib
  /usr/local/cuda/lib/cuda.so
  /usr/local/cuda/lib/cuda.dylib
  /usr/local/cuda/lib/cuda.lib
  /usr/local/cuda/lib/cuda.so
  /usr/local/cuda/lib/cuda.dylib
  /usr/local/cuda/lib/cuda.lib
  /usr/local/cuda/lib/cuda.so
  /usr/local/cuda/lib/cuda.dylib
  /usr/local/cuda/lib/cuda.lib
*** Note that this may not be a problem as this component is often installed system-wide.
running install
Checking .pth file support in /usr/local/lib64/python2.6/site-packages/
error: can't create or remove files in install directory
The following error occurred while trying to add or remove files in the installation directory:
  [Errno 2] No such file or directory: '/usr/local/lib64/python2.6/site-packages/test-easy-install-24814.pth'
The installation directory you specified (via --install-dir, --prefix, or the distutils default setting) was:
  /usr/local/lib64/python2.6/site-packages/
This directory does not currently exist. Please create it and try again, or choose a different installation directory (using the -d or --install-dir option).
make: *** [install] Error 1

Jong On Sat, Nov 14, 2009 at 12:51 PM, Andreas Klöckner li...@informa.tiker.net wrote: On Freitag 13 November 2009, Jonghwan Rhee wrote: Hi Andreas, Its version is boost 1.36 and it was installed at /usr/include/boost/ by YaST package repositories. According to http://packages.opensuse-community.org, it appears that opensuse uses non-standard names for the Boost libraries. Stick

BOOST_PYTHON_LIBNAME=boost_python
BOOST_THREAD_LIBNAME=boost_thread

into your siteconf.py. That should get you a step further. HTH, Andreas PS: Please take care to keep replies on the mailing list. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] pycuda patch for 'flat' eggs
Hi Maarten, On Dienstag 03 November 2009, you wrote: i've been using your pycuda package to play with, and I really like it! much more productive than compiling etc.. I have pycuda installed with --single-version-externally-managed and a different prefix. This causes pycuda not to find the header files. I've attached the diff and new compiler.py file to fix this. Merged in release-0.93 and master. Thanks for the patch, Andreas PS: Please direct stuff like this to the mailing list next time. signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Pitch Linear Memory Textures
On Montag 26 Oktober 2009, Robert Manning wrote: PyCUDA users, I've been trying to find how to use pitch 2D linear memory textures in pyCUDA and have been unsuccessful. I've seen C code that uses cudaBindTexture2D and similar functions but it is not accessible (to my knowledge) by pyCUDA. Any suggestions? Thanks, Bob Manning See test_2d_texture() in test/test_driver.py. HTH, Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Has anyone made a working parallel prefix sum/scan with pycuda?
On Montag 19 Oktober 2009, Michael Rule wrote: I'm convinced that I need a prefix scan that gives me access to the resultant prefix scanned array. so, for example, using addition, I would like a function that takes : 1 1 1 1 1 1 1 1 1 1 to 0 1 2 3 4 5 6 7 8 9 It seems like this data should be generated as an intermediate step in executing a ReductionKernel. I have not been able to figure out how this data is accessed by browsing the GPUArray documentation. Am I missing something obvious ? Parallel Prefix Scan is presently not implemented in PyCUDA. While reduction is related, the scan is actually a somewhat different animal. PScan would be a most welcome addition to PyCUDA, however. Mark Harris has written a good introduction on how to implement it: http://is.gd/4rXq0 If you decide to follow Mark's guide, almost half your work is already done for you--reduction occurs as part of the prefix scan, so you'll be able to recycle a fair bit of code. HTH, Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
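As a reference for anyone implementing Mark Harris's algorithm, the exclusive scan the poster describes is, serially, just this (a plain-Python sketch, useful for checking a GPU implementation against, not GPU code itself):

```python
def exclusive_scan(a, op=lambda x, y: x + y, identity=0):
    """out[i] = a[0] op a[1] op ... op a[i-1]; out[0] = identity."""
    out, running = [], identity
    for x in a:
        out.append(running)
        running = op(running, x)
    return out


print(exclusive_scan([1] * 10))  # -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```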
[PyCUDA] On Python 2.6.3 / distribute / setuptools
Hi all, -- This is relevant to you if you are using Python 2.6.3 and you are getting errors of the sort: /usr/local/lib/python2.6/dist-packages/setuptools-0.6c9- py2.6.egg/setuptools/command/build_ext.py, line 85, in get_ext_filename KeyError: '_cl' -- It seems Python 2.6.3 broke every C/C++ extension on the planet that was shipped using setuptools (which includes PyCUDA, PyOpenCL, and many more of my packages.) Thanks for your patience as I've worked through this mess, and to both Allan and Christine, I'm sorry you've had to deal with this, and thanks to Allan for pointing me in the right direction. To make a long story short, I've switched my packages (including PyCUDA) to use distribute instead of setuptools. All these changes are now in git. I'm not sure this will help if a 2.6.3 user already has setuptools installed, but I hope it will at least not make any other case worse. All in all, this seems like the least bad option given that I expect distribute to be the way of the future. Before I unleash this change full-scale, I would like it to get some testing. For this purpose, I've created a PyCUDA release candidate package, here: http://pypi.python.org/pypi/pycuda/0.93.1rc1 PLEASE TEST THIS, and speak up if you do--both positive and negative comments are much appreciated. Andreas PS: Once I have reasonable confirmation that this works for PyCUDA, I'll also release updated versions of PyOpenCL, meshpy, boostmpi, pyublas, The relevant changes are *already in git* if you'd like to try them now. PPS: Deciding in favor of distribute and against the promised setuptools update was based on two factors: - Primarily, distribute makes a fix for the 2.6.3 issue available *now*. - Secondarily, I personally disliked the behavior of PJE, the author of setuptools, in response to the current mess. signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] [Hedge] On Python 2.6.3 / distribute / setuptools
On Donnerstag 15 Oktober 2009, Andreas Klöckner wrote: Hi all, -- This is relevant to you if you are using Python 2.6.3 and you are getting errors of the sort: /usr/local/lib/python2.6/dist-packages/setuptools-0.6c9- py2.6.egg/setuptools/command/build_ext.py, line 85, in get_ext_filename KeyError: '_cl' -- A quick addition: If you are already encountering this error, you need to *remove* setuptools before the fix will work for you. That means that if you do import setuptools on the Python shell and it succeeds, type setuptools.__file__ to see where it is installed and get rid of it, then start over. (After the fix has worked, it will say something with distribute in the path for the setuptools.__file__. That's fine.) Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] [Hedge] On Python 2.6.3 / distribute / setuptools
On Donnerstag 15 Oktober 2009, Darcoux Christine wrote: Hi Andreas 0.93.1rc1 seems to work for me, except that I had to download a distribute_setup.py file. Could you include that file in the source tarball ? I think this is the usual way to work with Distribute, since the documentation says : Whoops. Good point. This should be fixed in 0.93.1rc2, here: http://pypi.python.org/pypi/pycuda/0.93.1rc2 I ran the examples/demo.py with success, but the transpose demo crashed. I assume this is not related to the use of Distribute, but here is the trace : File transpose.py, line 205, in module run_benchmark() File transpose.py, line 165, in run_benchmark target = gpuarray.empty((size, size), dtype=source.dtype) File /usr/lib/python2.6/site-packages/pycuda/gpuarray.py, line 81, in __init__ self.gpudata = self.allocator(self.size * self.dtype.itemsize) pycuda._driver.MemoryError: cuMemAlloc failed: out of memory Well, I'm not sure transpose.py adapts to the amount of memory you have. It works on my ~900M card. If you'd like to prepare a fix, I'd certainly merge it. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] 2d scattered data gridding
Hi Roberto, On Freitag 02 Oktober 2009, Roberto Vidmar wrote: I wonder if it is possible to use PyCUDA to grid on a two dimensional regular grid xyz scattered data. Our datasets are usually quite large (some millions of points) . Many thanks for any help in this topic. As usual, if something is possible with CUDA in general, it's also possible with PyCUDA. In this specific case, I'm not sure what you mean by gridding-- making a grid-based histogram, binning, or perhaps something entirely different? Nonetheless, it seems likely that what you want can be (and likely has been) done with CUDA. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
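If "gridding" here means binning and averaging scattered (x, y, z) samples onto a regular grid, the serial reference is easy to state. A plain-Python sketch of that one interpretation (a CUDA version of this would additionally need atomics or a sort-by-cell pass, since many points can land in the same cell):

```python
def grid_scatter(points, x0, y0, dx, dy, nx, ny):
    """Average scattered (x, y, z) samples onto an nx-by-ny regular grid."""
    sums = [[0.0] * nx for _ in range(ny)]
    counts = [[0] * nx for _ in range(ny)]
    for x, y, z in points:
        i = int((x - x0) / dx)  # cell column
        j = int((y - y0) / dy)  # cell row
        if 0 <= i < nx and 0 <= j < ny:
            sums[j][i] += z
            counts[j][i] += 1
    return [[s / c if c else 0.0 for s, c in zip(srow, crow)]
            for srow, crow in zip(sums, counts)]


# Two samples in cell (0, 0) averaging to 3.0, one in cell (1, 0):
print(grid_scatter([(0.5, 0.5, 2.0), (0.5, 0.5, 4.0), (1.5, 0.5, 1.0)],
                   0.0, 0.0, 1.0, 1.0, 2, 1))  # -> [[3.0, 1.0]]
```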
[PyCUDA] Nvidia GTC: PyCUDA talk, Meetup
Hi all, If you are attending Nvidia's GPU Technology Conference next week, there are two things I'd like to point out: - I'll be giving a talk about PyCUDA on Friday, October 2 at 2pm, where I'll both introduce PyCUDA and talk about some exciting new developments. The talk will be 50 minutes in length, and I'd be happy to see you there. - Also, I'd like to propose a PyCUDA meetup on Thursday, October 1 at noon. (ie. lunchtime) I'll be hanging out by the Terrace seminar room around that time. I'm looking forward to meeting some of you in person. See you next week, Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Nvidia GTC: PyCUDA talk, Meetup
On Mittwoch 23 September 2009, Andreas Klöckner wrote: Hi all, If you are attending Nvidia's GPU Technology Conference next week, there are two things I'd like to point out: - I'll be giving a talk about PyCUDA on Friday, October 2 at 2pm, where I'll both introduce PyCUDA and talk about some exciting new developments. The talk will be 50 minutes in length, and I'd be happy to see you there. Whoops--that's wrong. I just realized the talk is at 1pm on Friday. Sorry for the noise. - Also, I'd like to propose a PyCUDA meetup on Thursday, October 1 at noon. (ie. lunchtime) I'll be hanging out by the Terrace seminar room around that time. I'm looking forward to meeting some of you in person. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] problem with pycuda._driver.pyd
On Donnerstag 17 September 2009, mailboxalpha wrote: I looked for DLL files used in _driver.pyd and one of them is named nvcuda.dll. There is no such file on my machine. Perhaps that is the DLL file that could not be found. The required boost dlls have been copied to windows\system32 directory and the boost lib directory has been added to the system path. Can you please check which DLL the CUDA examples require? There will be one that has the runtime interface, which will likely be called something with cudart. You don't care about that one. Instead, that DLL in turn requires the driver interface, and that's the DLL name we need. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Installing PyCuda 0.93 on CentOs 5.3
On Montag 17 August 2009, Christian Quaia wrote: Hi. I've been trying to install pycuda on my centos 5.3 box, but I haven't had much success. I managed to install boost (1.39) as per instructions, but when I build pycuda I get the following error: building '_driver' extension gcc -pthread -fno-strict-aliasing -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -O3 -DNDEBUG -fPIC -Isrc/cpp -I/usr/include/boost-1_39 -I/usr/local/cuda/include -I/usr/lib64/python2.4/site-packages/numpy/core/include -I/usr/include/python2.4 -c src/wrapper/wrap_cudadrv.cpp -o build/temp.linux-x86_64-2.4/src/wrapper/wrap_cudadrv.o src/wrapper/wrap_cudadrv.cpp: In function ‘int <unnamed>::function_get_lmem(const cuda::function&)’: src/wrapper/wrap_cudadrv.cpp:165: error: ‘PyErr_WarnEx’ was not declared in this scope ... error: command 'gcc' failed with exit status 1 I looked around for a solution, but I couldn't find any. I'm using Python 2.4.3, and my default gcc version is 4.1.2 (although I had to compile boost using gcc43) Try the git version, PyErr_WarnEx is not referenced any more. Andreas signature.asc Description: This is a digitally signed message part. ___ PyCUDA mailing list PyCUDA@tiker.net http://tiker.net/mailman/listinfo/pycuda_tiker.net
Re: [PyCUDA] Installing PyCuda 0.93 on CentOs 5.3
That says that you're linking against the boost you built with gcc 4.3 -- rebuild boost with 4.1; that should get you a step further.

Andreas

On Tuesday 18 August 2009, Christian Quaia wrote:
> Thanks Andreas. Sorry, I should have tried that before... Now the build
> and install work. However, when I run the tests I get another error:
>
> Traceback (most recent call last):
>   File "./test/test_driver.py", line 475, in ?
>     import pycuda.autoinit
>   File "/usr/lib64/python2.4/site-packages/pycuda-0.94beta-py2.4-linux-x86_64.egg/pycuda/autoinit.py", line 1, in ?
>     import pycuda.driver as cuda
>   File "/usr/lib64/python2.4/site-packages/pycuda-0.94beta-py2.4-linux-x86_64.egg/pycuda/driver.py", line 1, in ?
>     from _driver import *
> ImportError: /usr/lib64/libboost_python-gcc43-mt-1_39.so.1.39.0: undefined symbol: _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l
>
> Thanks again for your help,
> Christian
>
> On Mon, Aug 17, 2009 at 5:41 PM, Andreas Klöckner <li...@informa.tiker.net> wrote:
>> On Monday 17 August 2009, Christian Quaia wrote:
>>> Thanks Andreas. I got the git version, and things went a bit better,
>>> but I still have a problem. When I build pycuda now I get this error:
>>>
>>> /usr/include/boost-1_39/boost/type_traits/detail/cv_traits_impl.hpp:37: internal compiler error: in make_rtl_for_nonlocal_decl, at cp/decl.c:5067
>>>
>>> This is due to a bug in gcc 4.1.2, which was fixed in later versions.
>>> For this very reason I had to compile boost using gcc43, which is also
>>> installed on my machine in parallel to gcc. Is there a simple way, like
>>> for boost, to force pycuda to be built using gcc43 as a compiler?
>>
>> Not easily, as Python should be built with a matching compiler. But see
>> this here for gcc 4.1 help: http://is.gd/2lCd7
>>
>> Andreas
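Undefined-symbol failures like the ImportError above only surface when the shared object is actually loaded. A hedged way to reproduce the load outside Python's import machinery (the path argument is whatever library you want to test):

```python
import ctypes

def check_loads(libpath):
    """Try to dlopen a shared library. ABI mismatches such as undefined
    symbols -- e.g. boost built with one gcc, everything else with
    another -- show up here as an OSError at load time, just as they do
    as an ImportError when Python imports the extension."""
    try:
        ctypes.CDLL(libpath)
        return True, None
    except OSError as e:
        return False, str(e)
```

Running `check_loads("/usr/lib64/libboost_python-gcc43-mt-1_39.so.1.39.0")` on the affected machine would report the same undefined symbol without going through PyCUDA at all.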
Re: [PyCUDA] a way to probe what globals, especially constant arrays and texture refs are defined in a kernel?
On Wednesday 12 August 2009, Andrew Wagner wrote:
> Hi. Is there a way to get a list of the global variables, especially
> constant arrays and texture refs, that are defined in a kernel? I'm
> generating a pycuda.driver.Module from a template, and the storage of
> various kernel inputs depends on the template parameters. It would be
> convenient for code using a kernel generated this way to have some way of
> figuring out what global variables are defined in the kernel, and whether
> they are globals, constants, or texrefs.
>
> Maybe in a future version of pycuda it would be nice to replace (or
> provide an alternative to) the accessor functions
>
>     pycuda.driver.Module.get_global
>     pycuda.driver.Module.get_function
>     pycuda.driver.Module.get_texref
>
> that consists of having member variables
>
>     pycuda.driver.Module.globals
>     pycuda.driver.Module.functions
>     pycuda.driver.Module.constants
>     pycuda.driver.Module.texrefs
>
> that are already initialized to dictionaries with the name of the variable
> as the key, and the handle (or maybe a (handle, size) tuple) as the value.
> Or maybe have a single member variable pycuda.driver.Module.globals that
> is a dictionary with variable names as keys, and a (type, handle, size)
> tuple or something similar as values.
>
> If I at least have the name of the variable, I think I can deduce if the
> variable is defined as a __constant__ array by wrapping
> pycuda.driver.Module.get_global in a try: statement, but that's rather
> un-pythonic. Or perhaps I'm misunderstanding something and the
> Module.get_* functions are forced on us by the CUDA API?

Sorry, no way to do that by the CUDA API. One could potentially parse the CUBIN file, but that's rather fragile and not something that PyCUDA engages in, so far.

Andreas
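The try-wrapping workaround Andrew describes can at least be packaged into a helper. This is a sketch against a stand-in module object, since the exact exception type and return value of PyCUDA's `get_global` (a (device_ptr, size_in_bytes) pair) are assumptions here:

```python
def probe_globals(module, candidate_names):
    """Return a dict of the candidate names that module.get_global()
    accepts. get_global raises for unknown symbols, so probing a list
    of candidates is the only option short of parsing the CUBIN."""
    found = {}
    for name in candidate_names:
        try:
            found[name] = module.get_global(name)
        except Exception:  # PyCUDA raises its own error class here
            pass
    return found


class FakeModule:
    """Stand-in for pycuda.driver.Module so the sketch runs without a GPU."""
    _globals = {"coeffs": (0x1000, 256)}

    def get_global(self, name):
        return self._globals[name]
```

With a real Module compiled from a known template, `candidate_names` would be the symbol names the template can emit.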
Re: [PyCUDA] [PyCuda] Installation wiki updated with Windows Vista 64 bit install with Visual Studio 2008
On Wednesday 12 August 2009, wtftc wrote:
> I've updated the wiki with my configuration of Windows Vista 64 bit with
> Visual Studio 2008. To use it, your entire python stack must also be 64
> bit. The build was x64, also known as amd64.

Thanks!

> I also have a question: is it possible to statically compile all embedded
> kernels in my code with pycuda? Deploying a program that uses pycuda
> widely is awkward because it requires the cuda and c++ build tools, which
> are heavy. It would be nice to have an option to generate a library at
> build time that could then be packaged and installed without having to do
> the heavy c lifting.

That's very possible -- this would amount to preseeding the PyCUDA compiler cache. I'd certainly merge a patch that implements this. The PyCUDA compiler cache logic is ~200 lines, so this should be easy to add.

Andreas
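"Preseeding the compiler cache" could look roughly like the following. The function names and on-disk layout are mine for illustration; PyCUDA's actual cache scheme differs in detail:

```python
import hashlib
import os

def cache_key(source, options=()):
    """Key a kernel by a hash of its source and compile options,
    the way a compiler cache would."""
    h = hashlib.md5(source.encode())
    for opt in options:
        h.update(opt.encode())
    return h.hexdigest()

def preseed(cache_dir, source, cubin_bytes, options=()):
    """Drop an already-compiled binary into the cache directory, so the
    target machine never has to invoke nvcc for this kernel."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, cache_key(source, options) + ".cubin")
    with open(path, "wb") as f:
        f.write(cubin_bytes)
    return path
```

A build step would run `preseed` for every kernel template the program can emit, then ship the cache directory with the package; at runtime the cache lookup hits and nvcc is never called.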
Re: [PyCUDA] non-egg install broken with 0.93
On Wednesday 12 August 2009, Ryan May wrote:
> Hi, I was trying to install pycuda today from source (in advance of the
> Scipy tutorial!), and have noticed a problem if I don't use eggs to
> install. If I use
>
>     python setup.py install --single-version-externally-managed --root=/
>
> I get the pycuda header file installed under usr/include/cuda. This
> breaks the logic in compiler._find_pycuda_include_path(). I personally
> avoid eggs, so this creates a problem. Also, many linux package managers
> (including Gentoo, my own distro) avoid eggs, and I know for a fact that
> Gentoo uses this same method to install packages. I've hacked my own
> compiler.py to work, but I'm not sure what a good solution really would
> be. Gentoo has 0.92 packaged, but I don't think the header was used in
> that version and thus didn't present any problems.

I've committed a (somewhat hacky) fix: automatically check /usr and /usr/local on Linux. Let me know if that works for you.

Andreas
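The "check /usr and /usr/local" fix amounts to a small search over candidate header directories. A hedged sketch (the function name mirrors the one in the thread, but the body is mine):

```python
import os

def find_pycuda_include_path(candidates):
    """Return the first existing directory from a list of candidate
    header locations -- e.g. the egg-relative path plus the /usr and
    /usr/local prefixes a non-egg install may have used."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None
```

Calling it with `["/usr/local/include/cuda", "/usr/include/cuda"]` covers both prefixes a distribution package manager is likely to install into.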
Re: [PyCUDA] NVIDIA CUDA Visual Profiler?
On Friday 31 July 2009, Ahmed Fasih wrote:
> Hi, I'm very surprised that google isn't turning up something about this
> topic, because I thought it's been previously discussed, so my apologies
> if it has. I'm trying the NVIDIA CUDA Visual Profiler (v 2.2.05) in
> Windows XP with a fairly recent PyCUDA git, on CUDA 2.2
> (pycuda.driver.get_driver_version() returns 2020). I provide the Visual
> Profiler with a Windows batch file that calls
> "python my_pycuda_script.py -some -flags", but the Visual Profiler (after
> running the script 4 times) only reports two methods, memcopy. All other
> counters are zero (so it doesn't display them in the table). Manipulating
> the counters enabled doesn't change this. Any assistance would be much
> appreciated. My application runs only ~10% faster on a Tesla C1060 than a
> G80 Quadro (despite having twice as many MPs), so I'm hoping the profiler
> will help me understand why.

On Linux, I've had good success with just using the profiler from the command line: http://webapp.dam.brown.edu/wiki/SciComp/CudaProfiling Every one of my attempts to achieve the same thing using the visual profiler has ended in tears so far. I'm not sure if the command line way of doing things works in Windows, but I'd imagine so. Once you figure out what's up, please add an FAQ entry!

Andreas
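The command-line profiler Andreas recommends is driven entirely by environment variables. A sketch of using it from a PyCUDA script, assuming the CUDA 2.x-era variable names (CUDA_PROFILE, CUDA_PROFILE_LOG, CUDA_PROFILE_CONFIG) as described on the linked wiki page:

```python
import os

# Profiling must be configured before the CUDA context is created,
# i.e. before importing pycuda.autoinit.
os.environ["CUDA_PROFILE"] = "1"                     # switch profiling on
os.environ["CUDA_PROFILE_LOG"] = "cuda_profile.log"  # where counters land
# An optional config file lists extra counters, one per line:
# os.environ["CUDA_PROFILE_CONFIG"] = "profile_config.txt"

# import pycuda.autoinit  # context creation now runs with profiling enabled
```

After the script exits, the per-kernel timings and counters are in the log file, with no visual tool involved.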
Re: [PyCUDA] casting arguments to memset to unsigned int ?
On Wednesday 01 July 2009, Andreas Klöckner wrote:
> Would you mind adding FAQ items for these two?
> http://wiki.tiker.net/PyCuda/FrequentlyAskedQuestions

Thanks for writing the FAQ item! FYI -- I've slightly reworked and expanded it. http://is.gd/1mMDg

Andreas
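The FAQ topic in the subject line -- casting memset arguments to unsigned int -- comes down to reinterpreting a value's bits, since a 32-bit memset like the driver's cuMemsetD32 takes a raw unsigned pattern. A stdlib-only sketch (the helper name is mine):

```python
import struct

def float_bits_as_uint(x):
    """Reinterpret a 32-bit float's bit pattern as an unsigned int --
    the cast needed when handing a float fill value to a raw 32-bit
    memset such as the driver-level cuMemsetD32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]
```

For example, filling a float buffer with 1.0 means memsetting the pattern 0x3f800000, not the integer 1.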
Re: [PyCUDA] [PyCuda] pyCUDA Windows Installation
On Thursday 02 July 2009, faberx wrote:
> Dear all! You can install pycuda on windows xp! Please look at:
> http://wiki.tiker.net/PyCuda/Installation/Windows

Thanks for writing this up!

Andreas