from:"Andreas Klöckner"

Re: [PyCuda] Questions about copying values into cubins

2008-11-25 Thread Andreas Klöckner

On Wednesday 26 November 2008, Brad Zima wrote:
 Reading more of the documentation showed that only certain datatypes may be
 passed into the module, none of which are of int type. Is there something
 I'm missing here?  If not, is there a way to pass variables (i.e. image
 width and image height) into a SourceModule?

You're likely passing an int for the r_gpu parameter. The problem is that 
PyCuda is refusing to guess what type the Python 'int' may correspond to. You 
have to tell it.

There are two ways:

- Use numpy's sized integers in the direct invocation interface. (convenient)
- Use the prepared_call interface. [1] (recommended) In this case, you don't 
have to pass sized ints any more, since you've already specified arg types in 
prepare().
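
As a rough illustration of why the type has to be stated explicitly: a Python int has no fixed width, while a kernel argument occupies a definite number of bytes. The sketch below uses only the standard library's struct module (not PyCuda) to show that the same value packs into byte strings of different sizes depending on the C type chosen. The helper name pack_kernel_int is made up for this example.

```python
import struct

def pack_kernel_int(value, ctype="int"):
    """Pack a Python int as a fixed-width little-endian C integer.

    Hypothetical helper for illustration only; it is not part of PyCuda.
    """
    formats = {"short": "<h", "int": "<i", "long long": "<q"}
    return struct.pack(formats[ctype], value)

width = 640
as_int = pack_kernel_int(width, "int")
as_long_long = pack_kernel_int(width, "long long")
print(len(as_int), len(as_long_long))
```

With the direct invocation interface, wrapping the value as numpy.int32(width) makes the same choice explicit.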

I've updated the docs to make this clearer.

HTH,
Andreas

[1] http://is.gd/94wA


signature.asc
Description: This is a digitally signed message part.
___
PyCuda mailing list
PyCuda@tiker.net
http://tiker.net/mailman/listinfo/pycuda_tiker.net

Re: [PyCuda] PyCuda Glitch, 0.91.1

2008-11-27 Thread Andreas Klöckner

Hi all,

Matthew Goodman reported (below) that PyCuda fails with CUDA 2.1 because NVIDIA 
changed the whitespace in the cubin files that nvcc writes. First of all, a change 
like that shouldn't be able to take PyCuda down, so I've made sure that even 
if metadata extraction fails, PyCuda doesn't throw weird-looking exceptions. 
Next, I've fixed the actual bug along the lines of what Matt suggests. Since 
this bug may turn people away from PyCuda unnecessarily, I've released 0.91.1.
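
A minimal sketch of whitespace-tolerant, failure-safe metadata extraction, assuming the cubin text contains lines of the form "key = number". The function name echoes a helper visible in a later patch on this list, but the body here is a guess, not the code that shipped in 0.91.1.

```python
import re

def failsafe_extract(key, cubin_text):
    """Return the integer stored under `key` in cubin text, or None if absent."""
    match = re.search(r"%s\s*=\s*([0-9]+)" % re.escape(key), cubin_text)
    if match is None:
        return None
    return int(match.group(1))

old_style = "reg = 4"
new_style = "reg  = 4"  # two spaces, as written by the nvcc build in the report
print(failsafe_extract("reg", old_style), failsafe_extract("reg", new_style))
print(failsafe_extract("smem", new_style))
```

Returning None instead of calling .group() on a failed match is what keeps a format change from surfacing as an unrelated-looking AttributeError.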

Happy hacking,
Andreas

On Wednesday 26 November 2008, Matt G wrote:
 I found a silly little bug you might wanna know about:
 Some nvcc builds generate cubin files that your regex fails to account
 for:

 The line:
 reg  = 4
 (two spaces b/t 'reg' and '=') causes the:

 self.registers = int(re.search("reg = ([0-9]+)", cubin).group(1))

 to return a None and thusly crash ungracefully.  Adding a strategic + to
 the regex seems to fix things:
 self.registers = int(re.search("reg += ([0-9]+)", cubin).group(1))

 I attached my cubin file so you can take a looksee.

 Thanks for the awesome tools.
 I am gonna be taking on some big CFD chores only possible by your help!

 Thanks.
 -Matthew Goodman

 ps
 nvcc --version
 Copyright (c) 2005-2007 NVIDIA Corporation
 Built on Fri_Nov__7_06:20:13_PST_2008
 Cuda compilation tools, release 2.1, V0.2.1221

 NVIDIA 260 GTX









Re: [PyCuda] newbie building on OSX 10.5

2008-12-05 Thread Andreas Klöckner

Hey Randy,

this here is the real problem:

On Friday 05 December 2008, Randy Heiland wrote:
 cc1plus: error: unrecognized command line option -Wno-long-double

Somebody compiled your Python with -Wno-long-double, but your C++ compiler 
does not understand that (warning-related, anyway) option. Easiest workaround: 
Open up aksetup_helper.py, in hack_distutils(), add -Wno-long-double to the 
list handed to remove_prefixes().
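
The idea behind that workaround can be sketched as a simple filter over the compiler flags. The function below is a stand-in written for this example; it borrows the name remove_prefixes but is not the code in aksetup_helper.py, and the flag list is invented.

```python
def remove_prefixes(flags, bad_prefixes):
    """Return `flags` without any entry that starts with one of `bad_prefixes`."""
    return [
        flag for flag in flags
        if not any(flag.startswith(prefix) for prefix in bad_prefixes)
    ]

cflags = ["-O3", "-Wall", "-Wno-long-double", "-fPIC"]  # made-up example flags
print(remove_prefixes(cflags, ["-Wno-long-double"]))
```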

Good luck,
Andreas

PS: The ctags bit is irrelevant--that's just for my convenience when 
developing. I should probably remove that, anyway.

PPS: Yes, it's telling that there's a routine called hack_distutils().



Re: [PyCuda] test_driver.py problem

2008-12-14 Thread Andreas Klöckner

Hi Jyh-Shyong,

On Sunday 14 December 2008, Jyh-Shyong Ho wrote:
 I just installed pycuda-0.90.2 on my computer and I got the following
 error message when I ran the program test_drive.py:

 python test_drive.py

 Traceback (most recent call last):
   File "test_driver.py", line 2, in <module>
     import pycuda.driver as drv
   File "/usr/local/lib64/python2.5/site-packages/pycuda-0.90.2-py2.5-linux-x86_64.egg/pycuda/driver.py", line 1, in <module>
     from _driver import *
 ImportError: /usr/local/lib64/python2.5/site-packages/pycuda-0.90.2-py2.5-linux-x86_64.egg/pycuda/_driver.so: undefined symbol: _ZN5boost6python9converter8registry9push_backEPFPvP7_objectEPFvS5_PNS1_30rvalue_from_python_stage1_dataEENS0_9type_infoE

Likely you have system boost headers installed (like in /usr/include). These 
probably got picked up for the build and then ended up mismatching those of a 
separately installed newer boost. Just uninstall the system-wide boost dev 
packages using rpm.

If that's not the case, then check with

ldd /usr/local/lib64/python2.5/site-packages/pycuda-0.90.2-py2.5-linux-x86_64.egg/pycuda/_driver.so

that the right boost library is getting picked up and report back.

Also, why use an old version (0.90.2) of PyCuda? 0.91.1 is out; it fixes 
numerous bugs and has more features.

Andreas



Re: [PyCuda] install witch python 2.4 anyone ?

2008-12-22 Thread Andreas Klöckner

On Monday 22 December 2008, Jean-Christophe Penalva wrote:
   Hello,

   i try to install Pycuda on a host with python 2.4.

   All is good (boost, numpy), until i fall on a problem during the make of
 pycuda. There's a test on the version of python (for version >= 2.5 ...),
 and that break my compil :
 src/cpp/cuda.hpp: In member function 'void
 cuda::memcpy_2d::set_src_host(boost::python::api::object)':
 src/cpp/cuda.hpp:942: error: 'Py_ssize_t' was not declared in this scope
 src/cpp/cuda.hpp:942: error: expected `;' before 'len'
 src/cpp/cuda.hpp:942: error: 'len' was not declared in this scope

Hi Jean-Christophe,

I've ifdef'ed out the Py_ssize_t's for Python versions before 2.5 in git. Let me 
know if that does the trick or if further fixes are needed.

Andreas




Re: [PyCuda] getting access to the depth attribute in Memcpy3D object (small patch)

2009-01-15 Thread Andreas Klöckner

On Thursday 15 January 2009, Nicolas Pinto wrote:
 -  .def_readwrite("height", &cl::Depth)
 +  .def_readwrite("depth", &cl::Depth)

Applied.

Andreas



Re: [PyCuda] BUG curandom.rand

2009-01-16 Thread Andreas Klöckner

On Thursday 15 January 2009, Jozef Vesely wrote:
 Hello,

 These random data do not seem very random ...
 http://www.kolej.mff.cuni.cz/~vesej3am/transfer/plot.png
 Moreover they are in [0, 0.5] range instead of [0, 1]
 A working rand function would be very useful.

I see that there's an issue, but I'm not likely to have time to help you with 
it--sorry. :( If you trawl the NVIDIA CUDA forums, there are a number of 
working RNGs posted. Maybe you can get one to work or use it to fix the 
current one. In any case, I'd be happy to receive a patch.

Andreas



Re: [PyCuda] codepy

2009-02-04 Thread Andreas Klöckner

Hey Nicholas,

On Friday 30 January 2009, you wrote:
 Attached is a slightly hacky patch for pycuda to enable reading of
 source c files. Currently, includes must be in the same directory... I'm
 not sure what the best policy is for dependencies; gcc -P -xc++ seems to
 fail when encountering CUDA directives.

I like the sentiment of the patch, but I'd prefer if we could get away without 
the explicit CudaSourceFile, and instead default to automatic include 
processing. nvcc has an -M (or --generate-dependencies) flag. That could be 
used instead of makedepend, which is likely not around on Win32.
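
A sketch of how such dependency output could be consumed, assuming it follows the usual make rule layout ("target : prerequisites", with backslash continuations). The sample text is invented for illustration and is not captured nvcc output; the function name is hypothetical.

```python
def parse_make_dependencies(rule_text):
    """Return the prerequisite paths listed in make-style dependency rules."""
    # Join continuation lines that end in a backslash.
    joined = rule_text.replace("\\\n", " ")
    deps = []
    for line in joined.splitlines():
        if ":" not in line:
            continue
        _target, _, prerequisites = line.partition(":")
        deps.extend(prerequisites.split())
    return deps

sample = "kernel.o : kernel.cu \\\n    helpers.h \\\n    /usr/local/cuda/include/cuda.h\n"
print(parse_make_dependencies(sample))
```

Splitting on the first colon breaks for targets that themselves contain a colon (Windows drive letters, for instance), the same caveat the patch notes for its makedepend regex.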

Opinions?

Andreas

PS: Cc'ed the PyCuda mailing list. Your patch is reattached for the ML to see.
diff --git a/src/python/driver.py b/src/python/driver.py
index a56e7fb..c0936fa 100644
--- a/src/python/driver.py
+++ b/src/python/driver.py
@@ -418,14 +418,16 @@ def _get_nvcc_version(nvcc):
 
 
 
-def _do_compile(source, options, keep, nvcc, cache_dir):
+def _do_compile(source, options, keep, nvcc, cache_dir, file_root, include_files):
     from os.path import join
 
     if cache_dir:
-        import md5
-        checksum = md5.new()
+        import hashlib
+        checksum = hashlib.md5()
 
         checksum.update(source)
+        for fname in include_files:
+            checksum.update(open(fname, "r").read())
         for option in options: 
             checksum.update(option)
         checksum.update(_get_nvcc_version(nvcc))
@@ -439,8 +441,10 @@ def _do_compile(source, options, keep, nvcc, cache_dir):
             pass
 
     from tempfile import mkdtemp
+    import shutil
     file_dir = mkdtemp()
-    file_root = "kernel"
+    for fname in include_files:
+        shutil.copy(fname, file_dir)
 
     cu_file_name = file_root + ".cu"
     cu_file_path = join(file_dir, cu_file_name)
@@ -487,12 +491,39 @@ def _do_compile(source, options, keep, nvcc, cache_dir):
 
 
 
+class CudaSourceFile(object):
+    def __init__(self, filename, calcdeps = True, include_files = []):
+        import os
+        filename = os.path.realpath(filename)
+        self.text = "#line 1 \"%s\"\n" % (filename) + open(filename, "r").read()
+        self.filename = filename
+        self.basename_root = os.path.basename(filename).replace(".cu", "")
+        self.include_files = include_files
+        if calcdeps:
+            import subprocess, re
+            # note: possible problems with awkward filenames with ":"
+            mdout = subprocess.Popen(["makedepend", "-Y", "-f-", filename],
+                stdout=subprocess.PIPE).communicate()[0]
+            self.include_files += re.findall(r"^[\w\d \-_\./]+: (.+)$",
+                mdout, flags=re.M)
+
+    def __str__(self):
+        return self.text
+
 class SourceModule(object):
     def __init__(self, source, nvcc="nvcc",
             options=[], keep=False,
             no_extern_c=False, arch=None, code=None,
             cache_dir=None):
 
+        if isinstance(source, CudaSourceFile):
+            tmp_file_root = source.basename_root
+            include_files = source.include_files
+            source = str(source)
+        else:
+            tmp_file_root = "kernel"
+            include_files = []
+
         if not no_extern_c:
             source = 'extern "C" {\n%s\n}\n' % source
 
@@ -519,7 +550,8 @@ class SourceModule(object):
         if code is not None:
             options.extend(["-code", code])
 
-        cubin = _do_compile(source, options, keep, nvcc, cache_dir)
+        cubin = _do_compile(source, options, keep, nvcc, cache_dir,
+                tmp_file_root, include_files)
 
         def failsafe_extract(key, cubin):
             pattern = r"%s\s*=\s*([0-9]+)" % key



Re: [PyCuda] codepy

2009-02-04 Thread Andreas Klöckner

On Saturday 31 January 2009, Nicholas Tung wrote:
 Sure, no problem. Also, another question: how do I pass an offset from a
 DeviceAllocation to a kernel? I can't seem to pass int's where it wants
 pointers despite being able to cast DeviceAllocation to an int.

Two ways: wrap it in a numpy.intp--a pointer sized integer. Or use the 
prepared invocation interface, which handles this seamlessly.
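
The address arithmetic behind the first option can be sketched as follows. This assumes, as Nicholas mentions, that a DeviceAllocation converts to a plain integer; the base address below is a made-up value and the function name is hypothetical.

```python
def element_address(base_address, index, itemsize):
    """Return the device address of element `index` in an array at `base_address`."""
    return base_address + index * itemsize

base = 0x200000  # stand-in for int(device_allocation); made-up value
print(hex(element_address(base, 16, 4)))
```

The resulting integer is what would then be wrapped in numpy.intp before being handed to the kernel.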

Current git has test cases for this.

There's still an issue with offsets in memcpy invocations, but we'll leave 
that for another day. (patches welcome, though)

 Also, some minor but hopefully helpful doc fixes are attached.

Applied.

Andreas



Re: [PyCuda] opengl PBOs and pycuda

2009-02-10 Thread Andreas Klöckner

On Tuesday 10 February 2009, Peter Berrington wrote:
 I didn't see any response on the mailing list so I thought I'd ask again.
 Has anyone made any attempt at wrapping the cuda opengl interop functions
 in pycuda? I've never used boost before and I'm not sure how to proceed.
 I'd really like to be able to use pycuda to post process an opengl pixel
 buffer object but without the cudaGLRegisterBufferObject and
 cudaGLMapBufferObject functions I don't see how I can do this.

I predict that after reading this email, you'll be very happy.

Check out recent git and configure with --cuda-enable-gl=1 (or set 
CUDA_ENABLE_GL=True in siteconf.py, whichever style you prefer).

It compiles for me, but is otherwise entirely untested. I would like you to do 
three things in return:

- Test and report back with the result. (I have no way of doing so.)
- Write the simplest possible example that does something useful and submit it 
for inclusion.
- Fill out the documentation in doc/source/gl.rst and submit a patch.
Also see http://documen.tician.de/pycuda/gl.html.

 I saw there was a recent release of codepy; would codepy allow one include
 and call the C cuda opengl functions directly or am I misunderstanding how
 it works? 

codepy in principle would allow you to do that, but since PyCuda already 
builds a Boost.Python extension as part of its build, that would be a rather 
roundabout and non-portable (Linux-only) way of doing it.

 Thanks for your response.

You're welcome. :)

Andreas



Re: [PyCuda] opengl PBOs and pycuda

2009-02-11 Thread Andreas Klöckner

On Wednesday 11 February 2009, Peter Berrington wrote:
 Thanks a lot for working on that Andreas. I'd be more than happy to write a
 tutorial/example and documentation, but I wasn't able to get it working
 when I played around with it earlier today. I assume that I should first
 call import pycuda.gl.autoinit to set up a context, however that throws
 this error:
 pycuda._driver.LogicError: cuGLInit failed: invalid context

 If I instead use the standard pycuda autoinit and then try to call
 pycuda.gl.init() directly it instead fails with
 pycuda._driver.Error: cuGLInit failed: unknown

 I browsed around in the source but I'm afraid I couldn't see anything that
 looked obviously wrong, although there was apparently a misspelling in the
 gl/autoinit.py file: The function pycuda.gl.make_gl_context(device) is
 called but it looks like that module only has a make_context() function,
 not a make_gl_context function.

Fixed, thanks.

 Am I misunderstanding how to go about initializing pycuda? If not, is there
 any more information I can provide that might help you identify why
 cuGLInit function call fails? Thanks!

I've googled around a bit and written up my findings at

http://documen.tician.de/pycuda/gl.html

(Pull recent git to get the updated autoinit.)

Andreas



Re: [PyCuda] Pytools is not listed as a prerequisite on PyCuda's home page

2009-03-01 Thread Andreas Klöckner

Hi Peter,

On Sunday 01 March 2009, you wrote:
 PyCuda looks very good - I'm excited to use it. But I found that I am
 missing pytools, and I see no mention of it... anywhere. At least not in
 the documents or on your home page..

What kind of failure do you get? (Please answer this even if the text below 
solves the problem for you.)

Pytools is a rather behind-your-back dependency of PyCuda. If you don't have 
it already, the 'setup.py' script should automatically go and get it for you. 
If that doesn't work, you can get it from its Python package index page, here:

http://pypi.python.org/pypi/pytools

HTH,
Andreas





Re: [PyCuda] An error:

2009-03-18 Thread Andreas Klöckner

On Wednesday 18 March 2009, William King wrote:
 I'm glad to help. I'd like to try to get a test together that will
 compute some algorithm on the GPU and one on the CPU and compare the
 speeds.

 Do you think it would be possible to define the algorithm once, and
 choose where it is executed?

Well, if you use any of the CUDA features at all (which you likely want to), 
then that's going to be somewhat difficult.

Andreas



Re: [PyCuda] PyCuda starting examples:

2009-03-18 Thread Andreas Klöckner

On Wednesday 18 March 2009, William King wrote:
 Ok, for someone new to python (but very familiar with perl, java, php)
 and new to CUDA. I have been able to set up my test environment and get
 the test_driver.py and all to run properly.

 I would like to help document (while I learn) how to use PyCuda to build
 libraries of algorithms to run in a GPU environment. My eventual goal is
 to be able to integrate PyCuda into a distributed computing management
 system such as Boinc.

 I'm willing to document my steps if someone can help figure out the
 CUDA/PyCUDA/Python side of things.

Just post to the list if you have questions. More documentation is, of course, 
always welcome. Look at the doc/source/ directory and just add to/restructure 
tutorial.rst, or start an entirely new section.

Andreas




[PyCuda] Reduction

2009-04-08 Thread Andreas Klöckner

Hi all,

Exciting news! I've just committed code to do reductions into PyCUDA's git. 
Sums and inner products are available now. Norms, maxima, minima and many 
other things can be added with only a few lines of code. (Any takers? :)

Here's the code:
http://git.tiker.net/pycuda.git/blob/HEAD:/src/python/reduction.py
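
To see why norms, maxima and minima take only a few extra lines, it helps to view every such operation as an elementwise map followed by a fold from a neutral element. The CPU-only sketch below illustrates that pattern with the standard library; it is an analogy and does not use the interface of reduction.py. The name map_reduce is made up for this example.

```python
from functools import reduce

def map_reduce(neutral, reduce_fn, map_fn, *vectors):
    """Apply map_fn elementwise across vectors, then fold with reduce_fn from neutral."""
    mapped = (map_fn(*items) for items in zip(*vectors))
    return reduce(reduce_fn, mapped, neutral)

x = [1.0, -2.0, 3.0]
y = [4.0, 5.0, 6.0]

total = map_reduce(0.0, lambda a, b: a + b, lambda xi: xi, x)
inner = map_reduce(0.0, lambda a, b: a + b, lambda xi, yi: xi * yi, x, y)
max_abs = map_reduce(0.0, max, abs, x)
print(total, inner, max_abs)
```

Only the neutral element, the reduce step and the map step change between the sum, the inner product and the maximum norm.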

Relatedly: we're nearing a 0.93 release--there's more than enough new stuff in 
the repository. I'll push out a release candidate soon.

Andreas



[PyCuda] Double precision textures

2009-04-08 Thread Andreas Klöckner

Hi all,

Another piece of niceness just landed in PyCUDA git: Semi-transparent support 
for double-precision textures. This code papers over the complications and 
differences between floats and doubles as far as texturing is concerned. 
Here's how you use it.

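Device-side:
8< --------------------------------------------------------------------
#include <pycuda-helpers.hpp>

texture<fp_tex_{float,double}, N, cudaReadMode...> my_tex;

{float,double} x = fp_tex1Dfetch(my_tex, i);
{float,double} y = fp_tex2D(my_tex, i, j);
{float,double} z = fp_tex3D(my_tex, i, j, k);
8< --------------------------------------------------------------------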

(The proper include path for pycuda-helpers.hpp is automatically set by 
PyCUDA--no need to jump through extra hoops.)

Host-side, everything should be transparent if you use
http://documen.tician.de/pycuda/array.html#pycuda.gpuarray.GPUArray.bind_to_texref_ext
to bind a GPUArray to the texture reference. Note: only single-channel 
textures are supported by this abstraction for now. For multi-D or other 
binding paths, just set up the texture as if it consisted of int2's instead of 
doubles.
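
The int2 trick rests on a double being exactly two 32-bit words. A host-side sketch of that round trip, using only the struct module, might look like this. The little-endian (low word, high word) ordering is an assumption made for the example, and the function names are invented; the device-side helper in pycuda-helpers.hpp is not reproduced here.

```python
import struct

def double_to_int2(value):
    """Split a double into two 32-bit ints (low word, high word), little-endian."""
    return struct.unpack("<ii", struct.pack("<d", value))

def int2_to_double(low, high):
    """Reassemble a double from the two 32-bit halves produced above."""
    return struct.unpack("<d", struct.pack("<ii", low, high))[0]

low, high = double_to_int2(3.141592653589793)
print(int2_to_double(low, high) == 3.141592653589793)
```

Because the split is a pure reinterpretation of bits, no precision is lost in either direction.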

I'd like to thank Nathan Bell at Nvidia for help in figuring out how to get 
this to work.

The relevant bits in the code are here:
- http://git.tiker.net/pycuda.git/blob/HEAD:/src/cuda/pycuda-helpers.hpp
- Test: http://git.tiker.net/pycuda.git/blob/HEAD:/test/test_driver.py#l388

Andreas



Re: [PyCuda] Double precision textures

2009-04-09 Thread Andreas Klöckner

On Wednesday 08 April 2009, Andreas Klöckner wrote:
 Host-side, everything should be transparent if you use
 http://documen.tician.de/pycuda/array.html#pycuda.gpuarray.GPUArray.bind_to_texref_ext
 to bind a GPUArray to the texture reference. Note: only
 single-channel textures are supported by this abstraction for now. For
 multi-D or other binding paths, just set up the texture as if it consisted
 of int2's instead of doubles.

Quick update: The host-side functionality is now conditioned on specifying an 
allow_double_hack=True keyword argument, to provide forward compatibility in 
case the texture units ever pick up real double precision support.

Andreas



Re: [PyCuda] demo_struct.py warning

2009-04-11 Thread Andreas Klöckner

On Saturday 11 April 2009, Randy Heiland wrote:
 Just curious - what does the warning mean?

There was a stray __global__ in that code, which I presume was an attempt to 
get rid of the two advisories. Turns out that in certain situations there 
appears to be no way of telling nvcc what kind of pointer you're giving it. 
demo_struct hits one of these cases.

I've removed the __global__ in git; the advisories remain.

Andreas

 $ python demo_struct.py
 original arrays
 [ 1.  2.  3.]
 [ 0.  4.]
 kernel.cu(5): warning: invalid attribute for member
 DoubleOperation::ptr

 /tmp/tmpxft_0904_-7_kernel.cpp3.i(12): Advisory: Cannot
 tell what pointer points to, assuming global memory space
 /tmp/tmpxft_0904_-7_kernel.cpp3.i(12): Advisory: Cannot
 tell what pointer points to, assuming global memory space




Re: [PyCuda] The method how I installed pyCuda in windowsXP

2009-04-15 Thread Andreas Klöckner

Hi there,

here are a few comments on your truly heroic effort. :)

On Wednesday 15 April 2009, B3design.jp wrote:
 4.compiled Boost C++ libraries by mingw
 type commandPrompt
   cd c:\program files\boost\boost_1_38_0
   bjam.exe --toolset=gcc --build-type=complete stage

 I wait for huge time... ... ...

There are a few tricks on how to shrink the boost install time on this page 
here:

http://mathema.tician.de/software/install-boost

If you have a reasonably recent machine (dual-core or better), building boost 
should take no more than a few minutes from start to finish (in Linux--no idea 
about MinGW though).

 It seems that windows becomes the error because os.getuid() is not
 supported.
 I follow, and it is ... by installation of LINUX

Oh great. Is there anything that Windows gets right? :)

Well, thanks for reporting this. I've hacked up a stopgap measure that uses an 
md5sum of $HOME instead of a uid if getuid() is unavailable. This is in git. 
Let me know what I've broken in the process.
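
A minimal sketch of such a fallback, assuming the goal is simply a per-user string for naming a cache location. The function name user_tag is invented here, and this is not the code that went into git.

```python
import hashlib
import os

def user_tag():
    """Return a per-user string: the uid where available, else an MD5 of the home path."""
    try:
        return str(os.getuid())
    except AttributeError:
        # os.getuid() does not exist on Windows.
        home = os.path.expanduser("~")
        return hashlib.md5(home.encode("utf-8")).hexdigest()

print(user_tag())
```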

 I was able to do it...
 Oops! I forgot what I was going to do.

:)

Thank you very much for posting this!

Andreas



[PyCuda] Fwd: Re: [minor] pagelocked_empty args

2009-04-19 Thread Andreas Klöckner

Whoops--this should've gone to the list.

Andreas
---BeginMessage---
Hi Nicholas,

(first off sorry for misspelling your name earlier)

On Friday 03 April 2009, you wrote:
 The shape argument for numpy.empty() can be a scalar for creation of
 1-D arrays. If you refer to numpy documentation [in pycuda documentation]
 probably the api should be the same. Not a big issue obviously.

Fixed for pagelocked_empty() and all GPUArray constructors.
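
The kind of normalization involved can be sketched like this; the helper name is hypothetical and the actual change in PyCUDA may be written differently.

```python
def normalize_shape(shape):
    """Return `shape` as a tuple, accepting a bare integer for 1-D arrays."""
    if isinstance(shape, int):
        return (shape,)
    return tuple(shape)

print(normalize_shape(5), normalize_shape((2, 3)))
```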

Andreas


---End Message---



Re: [PyCuda] Bad file descriptor error with PyCuda 0.92

2009-04-25 Thread Andreas Klöckner

On Saturday 25 April 2009, you wrote:
 Hello,

 Thank you all for your help.
 Upon further investigation, it seems that it is not PyCuda's fault.
 Eric4.4.4 seems to be unhappy with PyCuda.  Eric4.4.3 works fine.  If you
 would still like to read about it, the details are below.

This is somewhat strange. If it turns out that this is something that PyCUDA 
has any influence on, please let me know. Would you mind exercising the 
subprocess module (without PyCUDA's help) from within Eric and see if that 
works?
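
One way to run that check is a script that launches a trivial child process and captures its output, with no PyCUDA involved. The sketch below spawns the current Python interpreter; the message it prints is arbitrary.

```python
import subprocess
import sys

# Launch a trivial child process and capture its output, with no PyCUDA involved.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('subprocess works')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
out, err = proc.communicate()
print(proc.returncode, out.decode().strip())
```

If this fails inside the IDE but succeeds from a plain shell, the problem lies with how the IDE sets up file descriptors rather than with PyCUDA.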

 By the way, when I build PyCuda 0.93, I get an error that
 -lboost_thread-gcc43-mt is not found in /usr/bin/ld.  I went to /usr/bin
 and indeed, there is no /ld  .  Where should this come from?

Maybe binutils is not installed? (kind of unlikely) If you type "which ld", 
what do you get?

Andreas



Re: [PyCuda] PyCuda Digest, Vol 10, Issue 14

2009-04-27 Thread Andreas Klöckner

On Monday 27 April 2009, Matt G wrote:
 Quick fix is using python2.5.  It looks like python2.6 is the default that
 python is hooked to under Jaunty.  They made some changes to subprocess I
 don't have time to dig into, but it is causing problems with some of my code
 as well.  Also notably, if you used easy_install, python2.6 probably won't
 find most of the libraries you installed in that fashion.

Is this Ubuntu-specific? I'm having no trouble with vanilla 2.6.

Andreas



Re: [PyCuda] pycuda on ubuntu 9.04?

2009-04-29 Thread Andreas Klöckner

On Wednesday 29 April 2009, Brent Pedersen wrote:
 hi, i've never tried cuda or pycuda before, but i'm trying now on a
 clean system with ubuntu 9.04.
 i downloaded the install script for linux-64 ubuntu 8.04 here:
 http://www.nvidia.com/object/cuda_get.html

 will that work for 9.04? must one wait for the 9.04 version to appear?
 or are sources available?

 for now, i get cuda and pycuda to install but when i try it out, i see
 the output below.
 any ideas on what i can try?
 thanks,
 -brent

 $ python -c "import pycuda.autoinit"
 Traceback (most recent call last):
   File "<string>", line 1, in <module>
   File "/usr/lib/python2.5/site-packages/pycuda-0.92-py2.5-linux-x86_64.egg/pycuda/autoinit.py", line 5, in <module>
     cuda.init()
 Boost.Python.ArgumentError: Python argument types in
     pycuda._driver.init()
 did not match C++ signature:
     init(unsigned int flags=0)

Something seems fishy here: That snippet says python2.5 in various places, 
but I seem to recall that the default Python version in 9.04 should be 2.6. 
Can you shed some light here?

This looks like an issue with Boost.Python--the flags argument default 
assignment isn't recognized somehow.

Also, can you send the output of

ldd /usr/lib/python2.5/site-packages/pycuda-0.92-py2.5-linux-x86_64.egg/pycuda/_internal.so

What version of boost are you using?

Andreas



Re: [PyCuda] Odd results from PyCUDA port of options pricing example

2009-04-29 Thread Andreas Klöckner

I see one potential issue:

  b_gpu = cuda.mem_alloc(len(h_OptionData))

You probably mean len(h_OptionData)*numpy.dtype(numpy.float32).itemsize,
which is a long way of saying len(h_OptionData)*4.

Word of advice: Use gpuarrays. Less foot-shooting potential.
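
The element-count versus byte-count distinction can be shown without a GPU. The sketch below uses the standard library's array module as a stand-in for a float32 host buffer; the data and the helper name nbytes are invented for illustration.

```python
from array import array

def nbytes(values):
    """Return the buffer size in bytes for an array.array of fixed-size items."""
    return len(values) * values.itemsize

option_data = array("f", [1.0, 2.0, 3.0])  # 'f' is a C float
print(len(option_data), nbytes(option_data))
```

Allocating len(...) bytes for such a buffer reserves only a fraction of the space a later copy will write, which is the kind of silent corruption the advice above is meant to avoid.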

Andreas

On Wednesday 29 April 2009, Raefer Gabriel wrote:
 Hi all,

 I am trying to port the binomial options example from the CUDA SDK to
 pyCuda as a learning  exercise primarily, and I seem to be getting
 incorrect results and am having trouble tracking down the problem.  I am
 new to CUDA and pyCuda.

 Source code is here (no external dependencies other than pycuda - I'm
 running Python 2.6 here on Ubuntu 9.04, and CUDA and pyCuda pass all
 included tests as working fine):
 http://www.alaricuscapital.com/pycuda-binomial.txt

 I know that the basic logic in the kernel is correct, since it is copied
 from the binomialOptions example in the SDK.  However, I had to make a few
 minor tweaks to it - namely, instead of using static arrays, I am passing
 in the options input data as array arguments to the kernel, and I copied in
 some #defines from the header to make it compile smoothly.

 I have verified that this data I'm trying to pass is being received
 properly in the kernel, by changing the kernel's return values to match
 each of the input variables - so I am at least getting data to the
 function!

 And I verified my comparison function binomialOptionFromProcessed against
 a known-good implementation of European Call binomial option pricing from
 the pyFinancials library - they produce identical results for the same
 number of steps, so I know I didn't munge that up.

 However, they are clearly not producing consistent results.

 So the two possibilities seem to be that I'm doing something wrong in the
 pyCuda portion of my code that is mucking up my data down at the bottom of
 the binomialOptionsGPU function, or that in tweaking the kernel from the
 SDK I have broken something in the kernel (either because I replaced the
 static arrays, or something else).

 I was hoping for some guidance on this, at least to help me rule out stupid
 mistakes with how I am invoking pyCUDA so I can better focus my debugging
 efforts.

 Thanks in advance for any help!

 Raefer Gabriel





Re: [PyCuda] Fwd: pycuda on ubuntu 9.04?

2009-04-29 Thread Andreas Klöckner

On Wednesday 29 April 2009, Brent Pedersen wrote:
 i dont have _internal.so maybe that's the prob, i have this:

Ah, sorry. _driver.so is correct. My bad.


 $ ldd /usr/lib/python2.5/site-packages/pycuda-0.92-py2.5-linux-x86_64.egg/pycuda/_driver.so
 linux-vdso.so.1 =>  (0x7fffdc3fe000)
 libboost_python-gcc42-mt-1_34_1-py26.so.1.34.1 => /usr/lib/libboost_python-gcc42-mt-1_34_1-py26.so.1.34.1 (0x7ff8d3e4e000)
 libcuda.so.1 => /usr/lib/libcuda.so.1 (0x7ff8d39ad000)
 libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x7ff8d369f000)
 libm.so.6 => /lib/libm.so.6 (0x7ff8d341a000)
 libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x7ff8d3202000)
 libpthread.so.0 => /lib/libpthread.so.0 (0x7ff8d2fe5000)
 libc.so.6 => /lib/libc.so.6 (0x7ff8d2c73000)
 libutil.so.1 => /lib/libutil.so.1 (0x7ff8d2a7)
 libdl.so.2 => /lib/libdl.so.2 (0x7ff8d286b000)
 librt.so.1 => /lib/librt.so.1 (0x7ff8d2663000)
 libz.so.1 => /lib/libz.so.1 (0x7ff8d244b000)
 /lib64/ld-linux-x86-64.so.2 (0x7ff8d4367000)

Here we go. Your Boost.Python library is compiled (and linked!) against Python 
2.6 (see how it says "py26"?). It apparently doesn't like being used within a 
Python 2.5 interpreter. I'm honestly surprised you just got a spurious error--
it could've easily been a segfault.
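
Spotting this kind of mismatch can be automated by reading the "-pyXY" tag out of the library name. The sketch below assumes the naming convention seen in the ldd output above, which not every Boost build follows; the function name is hypothetical.

```python
import re

def boost_python_version(ldd_output):
    """Return (major, minor) from a libboost_python '-pyXY' tag, or None if no tag."""
    match = re.search(r"libboost_python[^\s]*-py(\d)(\d+)\.so", ldd_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

line = ("libboost_python-gcc42-mt-1_34_1-py26.so.1.34.1 => "
        "/usr/lib/libboost_python-gcc42-mt-1_34_1-py26.so.1.34.1")
interpreter = (2, 5)  # the interpreter version from the egg path in the report
linked = boost_python_version(line)
print(linked, linked == interpreter)
```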

Andreas



Re: [PyCuda] Complete recipe for PyCUDA?

2009-05-08 Thread Andreas Klöckner

On Friday 08 May 2009, Andrew Wagner wrote:
 Hello-

 Is there a recipe out there for getting PyCUDA (including
 dependencies!) up and running that is known to just work?

 I'm hopeful that if I can get PyCUDA installed properly, it will
 become my primary research tool.  I used MATLAB exclusively for years
 and loved it.  Once that became impossible for performance reasons I
 switched to C++ and CUDA.  Now my code runs a lot faster, but I'm only
 about 1/20th as productive as I was with MATLAB.  Since I'm hoping
 this will be my primary research tool,  I'm willing to do a clean
 install of any OS (though I prefer anything but windows, and have a
 slight preference for mac or debian).

 Even if there is no well-tested recipe, I'd appreciate guidance on
 what platforms, package managers, library versions, etc... are best
 supported.  I already spent a couple days tinkering with this, and I
 plan to devote up to a week more, starting in about one week.  I've
 been using cuda on OS X for about a semester, but I'm pretty new to
 python.

Using

http://documen.tician.de/pycuda/install.html

on Debian Lenny or Ubuntu 8.10 should just work, as in, copy and paste the 
commands, wait, and you're done. I use Debian myself.

On that note, any news with respect to Ubuntu 9.04? Is that working ok now? Or 
are there still issues?

Andreas



Re: [PyCuda] Complete recipe for PyCUDA?

2009-05-08 Thread Andreas Klöckner

On Freitag 08 Mai 2009, Andrew Wagner wrote:
 Thanks!  I pulled out my second graphics card and now I'm trying a
 fresh install of Debian Lenny.

 What version of CUDA do you recommend, and how did you install it?
 The installation directions say you developed against CUDA 2.0 beta;
 should I go with that?

Oh dear no! 2.2 (which was released yesterday) should work, even though I 
haven't tried it. 2.1 is pretty solid in any case.

Btw: I'll add the new stuff in 2.2 soon and then I'll finally start the .93 
release candidate process.

Andreas



Re: [PyCuda] Call for Testers: 0.93rc1

2009-05-10 Thread Andreas Klöckner

On Sonntag 10 Mai 2009, Vincent Favre-Nicolin wrote:
 On dimanche 10 mai 2009, Andreas Klöckner wrote:
  Try the command
 
  $ ldd /usr/local/lib/python2.6/dist-packages/pycuda-0.93rc1-py2.6-linux-
  x86_64.egg/pycuda/_driver.so
 
  and see if you're inadvertently linking against a CUDA 2.1 library. That
  symbol *should* be there in 2.2.

I have only one cuda version installed.
 However it may be that I don't have the correct nVidia module for 2.2 -
 right now I'm using 180.44 - I guess I have to switch to the new version,
 i.e. 185.18.08 'beta'
   As 180.44 worked all right with pycuda 0.93 beta I assumed this was all
 right but I was probably wrong then - I assume only the new driver
 libcuda.so has the cuMemHostAlloc symbol. Sorry for the noise.

Thanks (for your answer and for pycuda - great tool !),
   Vincent

0.93 works against older versions of CUDA, too (down to 1.1). In your case, 
your compile picked up the *header* file for 2.2, but only the driver library 
for 2.1. That's why you saw the mismatch. (Part of the problem is that cuda.h 
gets distributed with both the compiler *and* the driver.) Having non-matching 
versions of compiler and driver always leads to fun. :)

Andreas



Re: [PyCuda] Passing struct with arrays into kernel

2009-05-10 Thread Andreas Klöckner

You're a bit confused about whether you're passing in a *pointer* to an array 
or the actual array data. The C struct says "pointer to", your packing code 
says "inlined array".

I'd suggest checking out numpy record arrays.
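
To make the mismatch concrete, here is a small sketch using only Python's struct module. It assumes device pointers have the same size as host pointers, and N = 10 mirrors the example below; the variable names are illustrative.

```python
import struct

N = 10

# Layout the kernel expects for
#     struct results { unsigned int n; float *A; float *B; };
# '@' = native alignment, 'I' = unsigned int, 'P' = pointer.
pointer_layout = struct.calcsize("@IPP")

# Layout the packing code actually produces: n, then the raw contents
# of A (zeros) and B (ones), back to back with no padding.
inline_bytes = struct.pack("=I%df%df" % (N, N), N,
                           *([0.0] * N + [1.0] * N))
inline_layout = len(inline_bytes)

print("struct with pointers: %d bytes" % pointer_layout)
print("inlined arrays:       %d bytes" % inline_layout)
```

With the inlined packing, the kernel ends up interpreting float data as the addresses A and B, which would explain the launch failures. If the C struct is kept as is, A and B need their own device allocations, and the struct must hold their device addresses rather than their contents.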

Andreas

On Sonntag 10 Mai 2009, Per B. Sederberg wrote:
 Hi Folks:

 I'm working on simulating a simple neural network model on the GPU.
 In my case, I should see benefits from performing many simulations of
 the simple model at once across threads instead of parallelizing
 individual simulations because the neural network is so small.

 I'd like to pass a struct with arrays containing parameters and
 initialization information for the neural network and also a place to
 put results.  This is only to keep the code clean (otherwise I'll be
 passing in handfuls of parameters to the kernel.)  I have had full
 success passing in separate parameters, but have failed to pass the
 struct, getting launch failed errors at various stages of the process
 (sometimes when allocating memory and sometimes with trying to read it
 off the device.)

 I've included a simplified example below.  I realize the class to
 handle talking to the C struct is a bit crazy, but if it worked I
 could clean it up into a more general class.

 Is there any clue as to what is wrong or is there a better way to
 accomplish what I'm trying to do?  I'm pretty new to pycuda and cuda,
 so I won't be offended at all if you give me drastically different
 suggestions of what to do or if you point out a ridiculous error that
 I'm making ;)

 Thanks,
 Per

 PS- I'm using a git clone of pycuda from about a week ago and version
 2.1 of CUDA libs on a GTX285.

 struct_test.py (also attached, but in case no attachments are allowed):
 --

 import pycuda.driver as cuda
 import pycuda.autoinit
 from pycuda.compiler import SourceModule

 import numpy as np

 mod = SourceModule("""

 struct results
 {
   unsigned int n;
   float *A;
   float *B;
 };

 __global__ void struct_test(results *res)
 {
   unsigned int i;
   for (i=0; i<res->n; i++)
   {
     res->A[i] = res->B[i] + 1;
   }
 }

 """)


 cu_struct = mod.get_function("struct_test")

 class Results(object):
     def __init__(self, n=10):
         self._cptr = None
         self.n = np.uint32(n)
         self.A = np.zeros(self.n,dtype=np.float32)
         self.B = np.ones(self.n,dtype=np.float32)
     def send_to_gpu(self):
         if self._cptr is None:
             self._cptr = cuda.mem_alloc(self.nbytes())
         cuda.memcpy_htod(self._cptr, self.pack())
     def get_from_gpu(self):
         if not self._cptr is None:
             tempstr = np.array([' ']*self.nbytes())
             cuda.memcpy_dtoh(tempstr,self._cptr)
             ind = np.array([0,self.n.nbytes])
             self.n = np.fromstring(tempstr[ind[0]:ind[1]],
                 dtype=self.n.dtype).reshape(self.n.shape)
             ind[0] += self.n.nbytes
             ind[1] += self.A.nbytes
             self.A = np.fromstring(tempstr[ind[0]:ind[1]],
                 dtype=self.A.dtype).reshape(self.A.dtype)
             ind[0] += self.A.nbytes
             ind[1] += self.B.nbytes
             self.B = np.fromstring(tempstr[ind[0]:ind[1]],
                 dtype=self.B.dtype).reshape(self.B.dtype)
     def pack(self):
         return self.n.tostring() + self.A.tostring() + self.B.tostring()
     def nbytes(self):
         return self.n.nbytes + self.A.nbytes + self.B.nbytes

 res = Results(10)
 res.send_to_gpu()
 cu_struct(res._cptr, block=(1,1,1))
 res.get_from_gpu()

 print res.A
 print res.B
 print res.n





Re: [PyCuda] Complete recipe for PyCUDA?

2009-05-15 Thread Andreas Klöckner

Hi Andrew,

glad things worked out for you! Please send support requests to the list in 
the future, though. That'll get you replies sooner, and make sure that there's 
a record of what you did. I've cc'ed the list on this reply.

Andreas

On Freitag 15 Mai 2009, you wrote:
 I managed to get pycuda 0.92 working on Ubuntu 8.10 amd64 on a Mac Pro.
 I used the 2.2 cuda drivers from Nvidia, and everything else from apt.
 Here is my siteconf.py:

 BOOST_INC_DIR = []
 BOOST_LIB_DIR = []
 BOOST_PYTHON_LIBNAME = ['boost_python-gcc42-mt-d-1_34_1-py25']
 CUDA_TRACE = False
 CUDADRV_LIB_DIR = []
 CUDADRV_LIBNAME = ['cuda']
 CXXFLAGS = []
 LDFLAGS = []

 Thanks Andreas!  Now the real fun begins!

 On Fri, May 8, 2009 at 4:42 PM, Andreas Klöckner

 li...@informa.tiker.net wrote:
  On Freitag 08 Mai 2009, Andrew Wagner wrote:
  Thanks!  I pulled out my second graphics card and now I'm trying a
  fresh install of Debian Lenny.
 
  What version of CUDA do you recommend, and how did you install it?
  The installation directions say you developed against CUDA 2.0 beta;
  should I go with that?
 
  Oh dear no! 2.2 (which was released yesterday) should work, even though I
  haven't tried it. 2.1 is pretty solid in any case.
 
  Btw: I'll add the new stuff in 2.2 soon and then I'll finally start the
  .93 release candidate process.
 
  Andreas





Re: [PyCuda] non-2.2 compile errors

2009-05-15 Thread Andreas Klöckner

Hi Nicholas,

On Freitag 15 Mai 2009, you wrote:
 I think the CU_DEVICE_ATTRIBUTE_COMPUTE_MODE is new in 2.2 so it
 shouldn't be blindly included (also CU_COMPUTEMODE_DEFAULT,
 CU_COMPUTEMODE_EXCLUSIVE, CU_COMPUTEMODE_PROHIBITED).

I don't think they are--these here are the only hits for git grep 
CU_COMPUTEMODE:

#if CUDA_VERSION >= 2020
  py::enum_<CUcomputemode>("compute_mode")
    .value("DEFAULT", CU_COMPUTEMODE_DEFAULT)
    .value("EXCLUSIVE", CU_COMPUTEMODE_EXCLUSIVE)
    .value("PROHIBITED", CU_COMPUTEMODE_PROHIBITED)
    ;
#endif

Also, please go through the list.

Andreas




Re: [PyCuda] rc2 compile fails in wrap_cudadrv.cpp on OS X

2009-05-17 Thread Andreas Klöckner

Hi Bill,

sorry for the delay. What you reported is indeed a bug. I relied on a non-
standard member of std::vector in some code that was newly added. Should be 
fixed in git. Please test and confirm it's fixed.

Thanks,
Andreas

On Freitag 15 Mai 2009, Bill Punch wrote:
 I installed the CUDA 2.2 drivers yesterday and downloaded both the rc2
 and the latest git version. Compilation dies in the same place, line
 202, with the error stating there is no member 'data' for options (cast
 to a CUjit_option). In looking at cuda.h, where CUjit_option is defined,
 I do not see a data member, but that was just a quick look. Any ideas
 what is wrong? siteconf.py and full error below.

 The error:
 

 python setup.py build

 running build
 running build_py
 running build_ext
 building '_driver' extension
 gcc -isysroot /Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing
 -Wno-long-double -no-cpp-precomp -mno-fused-madd -fno-common -dynamic
 -O3 -DNDEBUG -Isrc/cpp -I/usr/local/include/boost-1_37
 -I/usr/local/cuda/include
 -I/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/include
 -I/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5 -c
 src/wrapper/wrap_cudadrv.cpp -o
 build/temp.macosx-10.3-i386-2.5/src/wrapper/wrap_cudadrv.o -arch i386
 src/wrapper/wrap_cudadrv.cpp: In function
 ‘cuda::module*<unnamed>::module_from_buffer(boost::python::api::object,
 boost::python::api::object, boost::python::api::object)’:
 src/wrapper/wrap_cudadrv.cpp:202: error: ‘class
 std::vector<CUjit_option, std::allocator<CUjit_option> >’ has no member
 named ‘data’
 src/wrapper/wrap_cudadrv.cpp:203: error: ‘class std::vector<void*,
 std::allocator<void*> >’ has no member named ‘data’
 error: command 'gcc' failed with exit status 1

 siteconf.py
 -
 BOOST_INC_DIR = ['/usr/local/include/boost-1_37']
 BOOST_LIB_DIR = ['/usr/local/lib']
 BOOST_COMPILER = 'gcc43'
 BOOST_PYTHON_LIBNAME = ['boost_python-xgcc40-mt-1_37']
 BOOST_THREAD_LIBNAME = ['boost_thread-xgcc40-mt']
 CUDA_TRACE = False
 CUDA_ENABLE_GL = False
 CUDADRV_LIB_DIR = ['/usr/local/cuda/lib']
 CUDADRV_LIBNAME = ['cuda']
 CUDA_ROOT = '/usr/local/cuda'
 CXXFLAGS = ['-arch', 'i386']
 LDFLAGS = ['-arch', 'i386']





Re: [PyCuda] Why are Function.lmem, Function.smem, and Function.registers deprecated?

2009-05-19 Thread Andreas Klöckner

On Dienstag 19 Mai 2009, Bryan Catanzaro wrote:
 I was browsing the documentation and saw the note that
 pycuda.driver.Function.registers, etc. are deprecated and will be
 removed in PyCuda 0.94.  That makes me a little sad, as that
 information is very useful to one of the projects I'm working on.  The
 implication in the documentation is that this is Cuda 2.2's fault.
 But I'm a little confused as to why - the .cubin files produced by my
 nvcc 2.2 compiler still have that information, so I must be missing
 something important here...  What changed to make these useful
 attributes deprecated?

No, it's different--things are getting better, not worse! :)

CUDA 2.2 introduces an official API to find these values:
http://is.gd/Bscw

So, if you're running 0.93 against 2.2, you'll get a deprecation warning for 
using .registers etc. The goal is to keep with the Zen of Python:

There should be one-- and preferably only one --obvious way to do it.

Since CUDA 2.2 brought us a second way, the first one gets deprecated. If this 
meets enough resistance, I guess I could be convinced to keep .registers et al 
around. I'd rather not though. On 2.2, all .registers does now is call the new 
API anyway.
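
As a rough illustration of that pattern (an old attribute that warns and then forwards to the new query), here is a self-contained sketch. FakeFunction and query_num_regs are made-up stand-ins, not PyCUDA's actual class or method names.

```python
import warnings


class FakeFunction(object):
    """Stand-in for a kernel handle; not PyCUDA's real Function class."""

    def __init__(self, num_regs):
        self._num_regs = num_regs

    def query_num_regs(self):
        # Hypothetical placeholder for the official CUDA 2.2 query.
        return self._num_regs

    @property
    def registers(self):
        # Old spelling: warn, then forward to the new way.
        warnings.warn(
            "registers is deprecated; use the official query instead",
            DeprecationWarning, stacklevel=2)
        return self.query_num_regs()
```

Existing code keeps working and gets the same number back, but every use of the old name leaves a trace that points at the replacement.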

I've added a note to the docs about what the new way is.

Andreas



Re: [PyCuda] OS X 10.5.7

2009-05-19 Thread Andreas Klöckner

On Mittwoch 20 Mai 2009, Eli Bressert wrote:
 CUDA_ROOT = '/user/local/cuda'
   ^^ Typo?

Andreas



Re: [PyCuda] problem making pycuda 0.92

2009-05-20 Thread Andreas Klöckner

Try adding

CXXFLAGS = ['-DBOOST_PYTHON_NO_PY_SIGNATURES']

to your siteconf.py. That may help work around the ICE.

Andreas

On Mittwoch 20 Mai 2009, Hua Wong wrote:
 I think Boost was well installed (no skip at all so...)
 Cuda toolkit is installed
 I don't know if I am doing something wrong in the configuration phase...

 $ python configure.py
 --boost-inc-dir=/work/hwong/apps/include/boost-1_39/
 --boost-lib-dir=/work/hwong/apps/lib/
 --boost-python-libname=boost_python-gcc42-mt --cuda-root=/usr/local/cuda/
 $ make

 ctags -R src || true
 /usr/bin/python setup.py build
 running build
 running build_py
 running build_ext
 building '_driver' extension
 gcc -pthread -fno-strict-aliasing -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
 -mtune=generic -D_GNU_SOURCE -fPIC -O3 -DNDEBUG -fPIC -Isrc/cpp
 -I/work/hwong/apps/include/boost-1_39/ -I/usr/local/cuda/include
 -I/work/hwong/site-packages/numpy-1.3.0-py2.4-linux-x86_64.egg/numpy/core/include
 -I/usr/include/python2.4 -c src/wrapper/mempool.cpp -o
 build/temp.linux-x86_64-2.4/src/wrapper/mempool.o
 src/cpp/mempool.hpp: In instantiation of
 ‘pycuda::memory_pool<<unnamed>::device_allocator>’:
 src/wrapper/mempool.cpp:75: instantiated from
 ‘<unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator>’
 src/cpp/mempool.hpp:284: instantiated from
 ‘pycuda::pooled_allocation<<unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator> >’
 src/wrapper/mempool.cpp:90: instantiated from here
 src/cpp/mempool.hpp:48: warning: ‘class
 pycuda::memory_pool<<unnamed>::device_allocator>’ has virtual functions
 but non-virtual destructor
 src/wrapper/mempool.cpp: In instantiation of
 ‘<unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator>’:
 src/cpp/mempool.hpp:284: instantiated from
 ‘pycuda::pooled_allocation<<unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator> >’
 src/wrapper/mempool.cpp:90: instantiated from here
 src/wrapper/mempool.cpp:75: warning: ‘class
 <unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator>’
 has virtual functions but non-virtual destructor
 src/cpp/mempool.hpp: In instantiation of
 ‘pycuda::memory_pool<<unnamed>::host_allocator>’:
 src/cpp/mempool.hpp:284: instantiated from
 ‘pycuda::pooled_allocation<pycuda::memory_pool<<unnamed>::host_allocator> >’
 src/wrapper/mempool.cpp:128: instantiated from here
 src/cpp/mempool.hpp:48: warning: ‘class
 pycuda::memory_pool<<unnamed>::host_allocator>’ has virtual functions
 but non-virtual destructor
 /work/hwong/apps/include/boost-1_39/boost/type_traits/is_polymorphic.hpp:
 In instantiation of
 ‘boost::detail::is_polymorphic_imp1<<unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator> >::d1’:
 /work/hwong/apps/include/boost-1_39/boost/type_traits/is_polymorphic.hpp:62:
 instantiated from ‘const bool
 boost::detail::is_polymorphic_imp1<<unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator> >::value’
 /work/hwong/apps/include/boost-1_39/boost/type_traits/is_polymorphic.hpp:97:
 instantiated from ‘const bool
 boost::detail::is_polymorphic_imp<<unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator> >::value’
 /work/hwong/apps/include/boost-1_39/boost/type_traits/is_polymorphic.hpp:102:
 instantiated from
 ‘boost::is_polymorphic<<unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator> >’
 /work/hwong/apps/include/boost-1_39/boost/mpl/if.hpp:67: instantiated
 from
 ‘boost::mpl::if_<boost::is_polymorphic<<unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator> >,
 boost::python::objects::polymorphic_id_generator<<unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator> >,
 boost::python::objects::non_polymorphic_id_generator<<unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator> > >’
 /work/hwong/apps/include/boost-1_39/boost/python/object/inheritance.hpp:65:
 instantiated from
 ‘boost::python::objects::dynamic_id_generator<<unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator> >’
 /work/hwong/apps/include/boost-1_39/boost/python/object/inheritance.hpp:72:
 instantiated from ‘void boost::python::objects::register_dynamic_id(T*)
 [with T =
 <unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator>]’
 /work/hwong/apps/include/boost-1_39/boost/python/object/class_metadata.hpp:97:
 instantiated from ‘void
 boost::python::objects::register_shared_ptr_from_python_and_casts(T*,
 Bases) [with T =
 <unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator>,
 Bases = boost::python::bases<mpl_::void_, mpl_::void_, mpl_::void_,
 mpl_::void_, mpl_::void_, mpl_::void_, mpl_::void_, mpl_::void_,
 mpl_::void_, mpl_::void_>]’
 /work/hwong/apps/include/boost-1_39/boost/python/object/class_metadata.hpp:225:
 instantiated from ‘static void
 boost::python::objects::class_metadata<T, X1, X2, X3>::register_aux2(T2*,
 Callback) [with T2 =
 <unnamed>::context_dependent_memory_pool<<unnamed>::device_allocator>,
 Callback = boost::integral_constant<bool, false>, T =

Re: [PyCuda] PyCUDA and OpenCL integration?

2009-05-21 Thread Andreas Klöckner

On Donnerstag 21 Mai 2009, Paul Rigor (uci) wrote:
 Hi Andreas,
 Do you foresee PyCUDA extending support to the OpenCL standard?

It would probably happen as a separate package, but the thought of having 
something for CL in the same spirit as PyCUDA has certainly crossed my mind. I 
have a little bit of code, too:

http://git.tiker.net/pyopencl.git
git clone http://git.tiker.net/trees/pyopencl.git

Help is more than welcome. :)

Andreas



Re: [PyCuda] error

2009-05-23 Thread Andreas Klöckner

On Samstag 23 Mai 2009, Eli Bressert wrote:
 Hi,

 I was trying to use the cumath functions and ran into an error. I'm on
 OS X 10.5.7 with a GeForce 8800GT graphics card and using
 PyCUDA-0.93rc2. Am I doing something wrong or is there a bug? I did
 not have this issue with PyCUDA -0.92.

A bug. Should be fixed in recent git master and release-0.93.

Thanks for the report,
Andreas



Re: [PyCuda] pycuda.gpuarray.sum() method

2009-05-23 Thread Andreas Klöckner

On Samstag 23 Mai 2009, Thomas Robitaille wrote:
 Hello,

 I am trying to sum a GPU array along one axis only. This can be done
 in numpy:

 a = np.random.random((100,200))
 b = np.sum(a,axis=1)

 However, there is no axis= keyword for the pycuda.gpuarray.sum()
 method. Would this be simple to implement? If not, is there an easy
 way to write a kernel function to do this?

GPUArrays are still somewhat painfully unaware of the fact that they're 
actually multi-dimensional beings--I'm not sure they even store information 
about multi-axes strides, leading to rather interesting results from 
to_gpu(ary).get() if ary is not in C order. This part of it is not 
difficult to fix. Just add a few tests and make sure they pass.

For the single-axis sum, you can certainly use pycuda/pycuda/reduction.py as a 
pattern, but the partial reduction you're requesting is rather more 
complicated than the full reduction implemented there. Good luck, and I think 
many people here would be grateful for any code you might be able to share.
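
As a reference for what such a partial reduction has to compute, here is a plain-Python sketch over a flat, C-ordered buffer. It is only meant as something to test a kernel against; sum_axis1 and sum_axis0 are illustrative names, not PyCUDA functions.

```python
def sum_axis1(flat, rows, cols):
    """Sum each row of a rows x cols array stored flat in C order."""
    if len(flat) != rows * cols:
        raise ValueError("buffer size does not match shape")
    out = []
    for r in range(rows):
        start = r * cols  # in C order, row r starts at offset r * cols
        out.append(sum(flat[start:start + cols]))
    return out


def sum_axis0(flat, rows, cols):
    """Sum each column: its elements sit `cols` entries apart."""
    if len(flat) != rows * cols:
        raise ValueError("buffer size does not match shape")
    return [sum(flat[c::cols]) for c in range(cols)]
```

The two cases differ in memory access pattern: rows are contiguous, columns are strided. That difference is a large part of why a per-axis reduction is harder to write well on the GPU than the full one.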

Andreas




Re: [PyCuda] FW: Why are Function.lmem, Function.smem, and Function.registers deprecated?

2009-05-24 Thread Andreas Klöckner

On Mittwoch 20 Mai 2009, Bryan Catanzaro wrote:
 Would it be possible to change both the Device.get_attribute() and the
 Function.get_attribute() calls to make them Python attributes instead
 of using function-and-flag? It seems like this could avoid the
 inconsistency between the two.

 I agree with Ian - aesthetically it doesn't seem optimal to echo the C
 API exactly in PyCUDA, when Python gives us much richer possibilities.

Here's something that somehow feels like a second-best solution, but it at 
least maintains consistency: Everything that can be accessed as

  something.get_attribute(some.scope.ATTR_NAME)

can now *also* be accessed as

  something.attr_name.

I don't think I want to kill the get_attribute interface entirely. This would 
still mean that

  Function.registers

would be deprecated, in favor of

  Function.num_regs

The point is that the name of the attribute and the name of the magic number 
would be consistent with each other and therefore could be implemented 
automatically--with no further maintenance burden.
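
A minimal sketch of how that automatic mapping could work, assuming nothing beyond plain Python. FakeDevice, device_attribute and the numeric values here are made up for illustration and are not PyCUDA's real classes or constants.

```python
class device_attribute(object):
    # Made-up stand-ins for the driver's enumeration values.
    MAX_THREADS_PER_BLOCK = 1
    WARP_SIZE = 10


class FakeDevice(object):
    """Illustrative only; not PyCUDA's Device class."""

    _attr_scope = device_attribute
    _values = {1: 512, 10: 32}

    def get_attribute(self, attr):
        return self._values[attr]

    def __getattr__(self, name):
        # Only reached when normal lookup fails. Map attr_name to
        # ATTR_NAME in the enumeration scope and forward the query.
        if name.startswith("_") or name != name.lower():
            raise AttributeError(name)
        try:
            const = getattr(self._attr_scope, name.upper())
        except AttributeError:
            raise AttributeError(name)
        return self.get_attribute(const)


dev = FakeDevice()
assert dev.warp_size == dev.get_attribute(device_attribute.WARP_SIZE)
```

Because the attribute name is derived from the constant's name, a newly added constant would be reachable both ways without any extra code.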

Opinions on this one?

Andreas



Re: [PyCuda] Where do I define the --prefix installation option?

2009-05-26 Thread Andreas Klöckner

On Dienstag 26 Mai 2009, Hua Wong wrote:
 Sorry to bother again, I managed to pass the 'make' phase but I am now
 stuck in the 'make install' one.

 I guess it is using setup.py to install all the files that are made
 during the Make process.

 But I don't have root privileges. And when I type python setup.py
 --help, it says that the --prefix option is ignored.

 Where do I specify a custom directory for installation?

Also check out Ian Bicking's virtualenv. That's been a great help ever since I 
discovered it.

Andreas




[PyCuda] 0.93rc3

2009-05-30 Thread Andreas Klöckner

Hi all,

Since there were a few issues reported with 0.93rc2, I've just rolled 0.93rc3. 
Here's to hoping this version will also be 0.93. To make sure we get a solid 
0.93 out, please test:

  http://pypi.python.org/pypi/pycuda/0.93rc3

Thanks for all your dedication in reporting issues in these release 
candidates. Rock on. :)

Andreas



Re: [PyCuda] FW: Why are Function.lmem, Function.smem, and Function.registers deprecated?

2009-05-30 Thread Andreas Klöckner

On Sonntag 24 Mai 2009, Andreas Klöckner wrote:
 Here's something that somehow feels like a second-best solution, but it at
 least maintains consistency: Everything that can be accessed as

   something.get_attribute(some.scope.ATTR_NAME)

 can now *also* be accessed as

   something.attr_name.

This behavior is now in 0.93rc3, the 0.93 release branch, and master.

Andreas



Re: [PyCuda] Ongoing install puzzle...

2009-05-31 Thread Andreas Klöckner

On Sonntag 31 Mai 2009, Vince Fulco wrote:
 get the error, ImportError:
 /usr/lib64/python2.5/site-packages/pycuda-0.93rc3-py2.5-linux-x86_64.egg/py
cuda/_driver.so: undefined symbol: cuMemHostAlloc.

This means you're picking up a libcuda.so from CUDA 2.1. Try running

$ ldd /usr/lib64/python2.5/site-packages/pycuda-0.93rc3-py2.5-linux-
x86_64.egg/pycuda/_driver.so

and make sure that the libcuda.so that gets linked in is the one you're 
expecting. (Probably not.) The problem is that Nvidia cleverly decided to 
distribute libcuda.so (the low-level driver interface) with both the toolkit 
and the driver. Ideally, both should match, so it doesn't matter. In your 
case, it seems you're picking up a stale one--maybe you haven't upgraded your 
GPU driver to 2.2?
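
If you'd like a quick check from Python as well, something along these lines might help. It is only a sketch: it asks the dynamic loader for a library by name and looks for one symbol, and library_exports is an illustrative helper, not part of PyCUDA.

```python
import ctypes


def library_exports(libname, symbol):
    """Return True or False depending on whether `libname` exports
    `symbol`; return None if the library cannot be loaded at all."""
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        return None
    # Attribute lookup on a CDLL resolves the symbol and raises
    # AttributeError if it is missing, which hasattr() turns into False.
    return hasattr(lib, symbol)


print(library_exports("libcuda.so.1", "cuMemHostAlloc"))
```

A result of False would match the undefined-symbol error above, and None means no library of that name could be loaded. The loader may resolve a different libcuda for _driver.so if search paths differ, so the ldd output remains the more reliable check.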

Andreas




Re: [PyCuda] PyCUDA on Windows XP: failed with exit status 2

2009-06-02 Thread Andreas Klöckner

On Dienstag 02 Juni 2009, Marcel Krause wrote:
 Hello list,

 I'm trying to install PyCUDA on WinXP SP 3. Most of the problems were
 solved with a little editing:
 * updating the DEFAULT_VERSION in ez_setup.py

This'll be unnecessary once 0.93 is out. Thanks for the reminder, though.

 * let setup.py search for nvcc.exe instead of nvcc

Good point. This is now done by default. That change will be in 0.93 and is 
also in git master.

 But now I'm stuck: setup.py install gives me vc.exe failed with
 exit status 2. Does anyone have any idea what that could mean?

From a distance, this looks like a compiler bug. In my book, a compiler should 
either succeed or fail with an error message. VC doesn't seem to do either in 
this instance. Any VC gurus here? What version of VC are you using?

 I post the other messages at the bottom of this mail, and also
 my siteconf.py. I'm not sure whether I have the right paths in
 there - can anyone tell me examples of files that I could search
 to make sure I have the right path configured, and find the paths
 that I don't yet have? (Like CUDADRV_LIB_DIR.)

Your BOOST_PYTHON_LIBNAME is likely wrong--it refers to gcc, and you're 
compiling with msvc. But that's likely not the root of your current problem.

 configure.py runs without errors, but does not insert any paths
 at all into the siteconf.py, so I have to find them all manually.
 Not quite easy, because I have a great many directories that are
 named lib, include and so on. :)

Hmm, configure.py was designed as just a commandline way of filling out 
siteconf.py--it wasn't meant to really detect anything. I realize that the 
name breeds expectations that it doesn't meet. I'll think of something. :)

Andreas



Re: [PyCuda] PyCUDA on Windows XP: failed with exit status 2

2009-06-03 Thread Andreas Klöckner

On Mittwoch 03 Juni 2009, Marcel Krause wrote:
 Hello Andreas, hello list,

   Your BOOST_PYTHON_LIBNAME is likely wrong--it refers to gcc,
   and you're compiling with msvc.

 thanks, I fixed it to boost_python-vc90-mt. There's only a file
 libboost_python-vc90-mt.lib in the path I have as BOOST_LIB_DIR,
 so is the lib added automatically before the filename?

It is on Unix. Not sure about Windows.

 MSVC did indeed give more error messages, it was my fault that I
 used a wrong setting to redirect its output to a file. Here's the
 full error report: http://paste.nn-d.de/914 It's in German since
 I use a German version of MSVC, but maybe it helps nonetheless.

Oh my, that's all in Boost. That's good news in principle, since it's all 
supposed to work--the Boost docs say it should. You're using VC 9, I take it?

http://www.boost.org/doc/libs/1_39_0/libs/python/doc/v2/platforms.html
(outdated, ick)

I think we're best advised trying to track down that syntax error first. Is 
there a way of having MSVC print the include path that got it to the point 
where an error occurred?

 For the siteconf.py, I'd appreciate a method to test its validity,
 e.g. try to find important files using the paths therein, and if
 they are not found, warn me and tell me where it had expected them
 so I can search for the file and correct the path. I could try and
 write such a verifier script if someone tells me which files are
 expected in which paths.

Oh, sweet. I'll take you up on that offer.

BOOST_INC_DIR/boost/python.hpp
BOOST_LIB_DIR/(lib)?BOOST_PYTHON_LIBNAME(.so|.dylib|?Windows?)
BOOST_LIB_DIR/(lib)?BOOST_THREAD_LIBNAME(.so|.dylib|?Windows?)
CUDA_ROOT/bin/nvcc(.exe)?
CUDA_INC_DIR/cuda.h
CUDADRV_LIB_DIR/(lib)?CUDADRV_LIBNAME(.so|.dylib|?Windows?)
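
To get such a verifier started, here is a rough sketch of the checks listed above. It assumes the siteconf.py settings have already been read into a dict, treats the *_DIR and *_LIBNAME settings as lists and CUDA_ROOT as a string, and only tries Unix-style library names; the Windows naming is left open, as in the list. check_siteconf is an illustrative name.

```python
import os


def check_siteconf(conf):
    """Return a list of (setting, tried_paths) for files not found."""
    missing = []

    def need_file(setting, dirs, names):
        if not dirs or not names:
            return  # empty setting: the compiler's default paths apply
        candidates = [os.path.join(d, n) for d in dirs for n in names]
        if not any(os.path.isfile(c) for c in candidates):
            missing.append((setting, candidates))

    def lib_names(libnames):
        # (lib)?NAME(.so|.dylib); versioned names like .so.1 are not tried
        return [prefix + name + ext
                for name in libnames
                for prefix in ("lib", "")
                for ext in (".so", ".dylib")]

    need_file("BOOST_INC_DIR", conf.get("BOOST_INC_DIR", []),
              [os.path.join("boost", "python.hpp")])
    for key in ("BOOST_PYTHON_LIBNAME", "BOOST_THREAD_LIBNAME"):
        need_file(key, conf.get("BOOST_LIB_DIR", []),
                  lib_names(conf.get(key, [])))
    cuda_root = conf.get("CUDA_ROOT")
    if cuda_root:
        need_file("CUDA_ROOT", [os.path.join(cuda_root, "bin")],
                  ["nvcc", "nvcc.exe"])
    need_file("CUDA_INC_DIR", conf.get("CUDA_INC_DIR", []), ["cuda.h"])
    need_file("CUDADRV_LIBNAME", conf.get("CUDADRV_LIB_DIR", []),
              lib_names(conf.get("CUDADRV_LIBNAME", [])))
    return missing
```

Each entry in the result names the setting and the paths that were tried, which gives a user something concrete to search for. Empty settings are skipped rather than reported, since they usually mean "use the system default".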

Andreas



Re: [PyCuda] Installing on Fedora 10

2009-06-05 Thread Andreas Klöckner

On Freitag 05 Juni 2009, Chris Heuser wrote:
 The boost warning worked. But I am confused, since I have version 1.37 of
 boost installed. I will install 1.39, but 1.37 should work, shouldn't it?

My guess here is that you installed 1.37 manually, and you also have an old 
set (maybe 1.34) installed through Fedora's package management. Apparently, 
the system-wide set gets picked up instead of the manually-compiled one.

Andreas



Re: [PyCuda] diskdict + sqlite problem?

2009-06-08 Thread Andreas Klöckner

On Montag 08 Juni 2009, Adlar Kim wrote:
 Hi,

 After pycuda installation I ran some examples provided. Every time I
 run them, I get the following message:

 /usr/lib/python2.4/site-packages/pytools-9-py2.4.egg/pytools/
 diskdict.py:119: UserWarning: DiskDict will memory-only: a usable
 version of sqlite was not found.
   warnings.warn(DiskDict will memory-only: 

 Any ideas what is wrong here? Thanks in advance.

You can safely disregard those--or upgrade to 0.93rc3:
http://pypi.python.org/pypi/pycuda/0.93rc3

Andreas




Re: [PyCuda] pycuda-0.93rc3 build problem

2009-06-09 Thread Andreas Klöckner

On Dienstag 09 Juni 2009, Adlar Kim wrote:
 Hi,

 I'm getting the following error when I try to build pycuda-0.93rc.
 Could anybody help? Thanks.

I think I fixed [1] this a couple days ago. I just rolled 0.93rc4, which you 
can get at [2]. AFAICT, your problem is specific to CUDA 2.1. Please try the 
RC and report back.

Sorry all for the barrage of release candidates... 

Andreas

[1] 
http://git.tiker.net/pycuda.git/commitdiff/b98a3a3598da370d316788b48ba5a69607907f4b
[2] http://pypi.python.org/pypi/pycuda/0.93rc4



Re: [PyCuda] PyCUDA on mac os 10.5.6

2009-06-12 Thread Andreas Klöckner

Hi Massimo,

I'm mostly clueless about Macs, but I did notice that you built PyCUDA for the 
32-bit ABI. Maybe Boost got built against the 64-bit one?

 CXXFLAGS = ['-arch','i386']
 LDFLAGS = ['-arch','i386']

(Btw, my suspicion is that these shouldn't be needed because PyCUDA (or rather 
distutils) automatically picks up all the switches that were used to build 
Python, anyway.)

On Linux, similar issues are often caused by clashes with system-wide Boost 
libraries.

:-? Anyone from the Mac crowd have an idea?

Andreas




Re: [PyCUDA] PyCuda Problem

2009-06-12 Thread Andreas Klöckner

First off, if you have support questions, please go via the PyCUDA mailing 
list. I've CC'ed them on this reply.

On Freitag 12 Juni 2009, you wrote:
 Hello Andreas, I'm Luiz from Brazil.

BOOST_COMPILER was introduced in the 0.93 release cycle--the instructions 
you're following do not match the version you're using, which is 0.92. I'd 
suggest you try version 0.93rc4, from here:

http://pypi.python.org/pypi/pycuda/0.93rc4

Andreas



Re: [PyCUDA] [PyCuda] PyCUDA on mac os 10.5.6

2009-06-13 Thread Andreas Klöckner

 
  In the siteconfig.py I used the 1_39 one.  I guess I could try to use
  the other one. I also have 2 version of Python 2.5 and 2.6. However, the
  current one is 2.5 which is the only that worked with Numpy.
 
  Also my $DYLD_LIBRARY_PATH has:
  /usr/local/cuda/lib:/usr/local/lib:
 
  Thanks,
  Massimo
 
  On Fri, Jun 12, 2009 at 7:45 AM, Randy Heiland heil...@indiana.edu wrote:
  I leave those blank and things work fine for me (OSX 10.5).
 
  -Randy
 
  On Jun 12, 2009, at 8:27 AM, Andreas Klöckner wrote:
 
   Hi Massimo,
 
  I'm mostly clueless about Macs, but I did notice that you built PyCUDA
  for the
  32-bit ABI. Maybe Boost got built against the 64-bit one?
 
   CXXFLAGS = ['-arch','i386']
 
  LDFLAGS = ['-arch','i386']
 
  (Btw, my suspicion is that these shouldn't be needed because PyCUDA
  (or rather
  distutils) automatically picks up all the switches that were used to
  build
  Python, anyway.)
 
  On Linux, similar issues are often caused by clashes with system-wide
  Boost
  libraries.
 
  :-? Anyone from the Mac crowd have an idea?
 
  Andreas
 





Re: [PyCUDA] Porting nvidia's separable convolution example to pycuda: C++ templates, loop unrolling

2009-06-14 Thread Andreas Klöckner

On Samstag 13 Juni 2009, Andrew Wagner wrote:
 Thanks again, Nicolas!  I suspected that was the case...

 If this is still the case in the latest pycuda, the documentation for
 get_global() at

 http://documen.tician.de/pycuda/driver.html#code-on-the-device-modules-and-functions

 needs to be corrected.

Fixed.


 I have attached a short demo for using constant memory that does
 this, and seems to work at least for one easy case.  I release it
 under whatever license the other pycuda demo scripts are under, so it
 can be included in the next release if Andreas sees fit.

 Unfortunately, this means something harder to debug is causing my
 garbage results...

Now in test_driver.

Thanks!
Andreas



Re: [PyCUDA] [PyCuda] pycuda-0.93rc3 build problem

2009-06-15 Thread Andreas Klöckner

On Dienstag 09 Juni 2009, Adlar Kim wrote:
 Hi,

 I'm getting the following error when I try to build pycuda-0.93rc.
 Could anybody help? Thanks.

Did 0.93rc4 work for you? If so, that would be one data point to help me 
figure out if we're ready for 0.93.

Thanks,
Andreas



Re: [PyCUDA] [PyCuda] pycuda-0.93rc3 build problem

2009-06-15 Thread Andreas Klöckner

I can't make much of the error you're getting. From what I'm seeing, it seems 
that gcc is complaining about a function "a-circumflex" near some macro 
expansions.

Here are some suggestions nonetheless, based on the other bits of output:

- Definitely shave the last "boost" off your BOOST_INC_DIR.

- It's strange that the boost headers would be in /usr/local, but the matching 
libraries would be in /usr/lib64.

- You appear to be using gcc 4.1. This here may apply to you: 
http://is.gd/12CmW
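
To illustrate the first point: judging by the "Checked locations" output 
quoted below, the build looks for boost/python.hpp underneath each 
BOOST_INC_DIR entry, so the entry should be the directory that contains 
boost/, not boost/ itself. A small sketch (checked_header is a hypothetical 
helper for illustration, not part of PyCUDA's setup code):

```python
import posixpath


def checked_header(inc_dir):
    # Mimics the path reported under "Checked locations" in the build output:
    # the include directory followed by boost/python.hpp.
    return posixpath.join(inc_dir, "boost", "python.hpp")


# Entry from the quoted siteconf.py: the path ends up containing boost/boost.
print(checked_header("/usr/local/include/boost-1_39/boost"))

# Entry with the trailing "boost" removed, as suggested above.
BOOST_INC_DIR = ["/usr/local/include/boost-1_39"]
print(checked_header(BOOST_INC_DIR[0]))
```

Only the BOOST_INC_DIR line belongs in siteconf.py; the rest merely shows why 
the original value could not be found.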

Andreas

On Montag 15 Juni 2009, Adlar Kim wrote:
 no, it did not. here's what I'm getting:


 siteconf.py
 ===

 BOOST_INC_DIR = ['/usr/local/include/boost-1_39/boost']
 BOOST_LIB_DIR = ['/usr/lib64']
 BOOST_COMPILER = 'gcc41'
 BOOST_PYTHON_LIBNAME = ['boost_python-gcc41-mt']
 BOOST_THREAD_LIBNAME = ['boost_thread-gcc41-mt']
 CUDA_TRACE = False
 CUDA_ENABLE_GL = False
 CUDADRV_LIB_DIR = ['/usr/local/cuda/lib']
 CUDADRV_LIBNAME = ['cuda']
 CXXFLAGS = []
 LDFLAGS = []


 Error
 =

 *** Cannot find Boost headers. Checked locations:
/usr/local/include/boost-1_39/boost/boost/python.hpp
 *** Cannot find Boost Python library. Checked locations:
/usr/lib64/libboost_python-gcc41-mt.so
/usr/lib64/libboost_python-gcc41-mt.dylib
/usr/lib64/libboost_python-gcc41-mt.lib
/usr/lib64/boost_python-gcc41-mt.so
/usr/lib64/boost_python-gcc41-mt.dylib
/usr/lib64/boost_python-gcc41-mt.lib
 *** Cannot find Boost Thread library. Checked locations:
/usr/lib64/libboost_thread-gcc41-mt.so
/usr/lib64/libboost_thread-gcc41-mt.dylib
/usr/lib64/libboost_thread-gcc41-mt.lib
/usr/lib64/boost_thread-gcc41-mt.so
/usr/lib64/boost_thread-gcc41-mt.dylib
/usr/lib64/boost_thread-gcc41-mt.lib
/usr/lib64/boost_thread-gcc41-mt.so
/usr/lib64/boost_thread-gcc41-mt.dylib
/usr/lib64/boost_thread-gcc41-mt.lib
 *** Cannot find CUDA driver library. Checked locations:
/usr/local/cuda/lib/libcuda.so
/usr/local/cuda/lib/libcuda.dylib
/usr/local/cuda/lib/libcuda.lib
/usr/local/cuda/lib/cuda.so
/usr/local/cuda/lib/cuda.dylib
/usr/local/cuda/lib/cuda.lib
/usr/local/cuda/lib/cuda.so
/usr/local/cuda/lib/cuda.dylib
/usr/local/cuda/lib/cuda.lib
/usr/local/cuda/lib/cuda.so
/usr/local/cuda/lib/cuda.dylib
/usr/local/cuda/lib/cuda.lib
 *** Note that this may not be a problem as this component is often
 installed system-wide.
 running build
 running build_py
 running build_ext
 building '_driver' extension
 creating build/temp.linux-x86_64-2.4
 creating build/temp.linux-x86_64-2.4/src
 creating build/temp.linux-x86_64-2.4/src/cpp
 creating build/temp.linux-x86_64-2.4/src/wrapper
 gcc -pthread -fno-strict-aliasing -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -
 fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -
 mtune=generic -D_GNU_SOURCE -fPIC -O3 -DNDEBUG -fPIC -Isrc/cpp -I/usr/
 local/include/boost-1_39/boost -I/usr/local/cuda/include -I/usr/lib/
 python2.4/site-packages/numpy-1.3.0-py2.4-linux-x86_64.egg/numpy/core/
 include -I/usr/include/python2.4 -c src/cpp/cuda.cpp -o build/
 temp.linux-x86_64-2.4/src/cpp/cuda.o
 gcc -pthread -fno-strict-aliasing -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -
 fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -
 mtune=generic -D_GNU_SOURCE -fPIC -O3 -DNDEBUG -fPIC -Isrc/cpp -I/usr/
 local/include/boost-1_39/boost -I/usr/local/cuda/include -I/usr/lib/
 python2.4/site-packages/numpy-1.3.0-py2.4-linux-x86_64.egg/numpy/core/
 include -I/usr/include/python2.4 -c src/cpp/bitlog.cpp -o build/
 temp.linux-x86_64-2.4/src/cpp/bitlog.o
 gcc -pthread -fno-strict-aliasing -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -
 fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -
 mtune=generic -D_GNU_SOURCE -fPIC -O3 -DNDEBUG -fPIC -Isrc/cpp -I/usr/
 local/include/boost-1_39/boost -I/usr/local/cuda/include -I/usr/lib/
 python2.4/site-packages/numpy-1.3.0-py2.4-linux-x86_64.egg/numpy/core/
 include -I/usr/include/python2.4 -c src/wrapper/wrap_cudadrv.cpp -o
 build/temp.linux-x86_64-2.4/src/wrapper/wrap_cudadrv.o
 src/wrapper/wrap_cudadrv.cpp: In function â:
 src/wrapper/wrap_cudadrv.cpp:165: error: ârnExâ was not declared in
 this scope
 src/wrapper/wrap_cudadrv.cpp: In function â:
 src/wrapper/wrap_cudadrv.cpp:166: error: ârnExâ was not declared in
 this scope
 src/wrapper/wrap_cudadrv.cpp: In function â:
 src/wrapper/wrap_cudadrv.cpp:167: error: ârnExâ was not declared in
 this scope
 src/wrapper/wrap_cudadrv.cpp: In function â:
 src/wrapper/wrap_cudadrv.cpp:310: error: ârnExâ was not declared in
 this scope
 src/wrapper/wrap_cudadrv.cpp: In function â:
 src/wrapper/wrap_cudadrv.cpp:357: error: ârnExâ was not declared in
 this scope
 error: command 'gcc' failed with exit status 1

 On Jun 15, 2009, at 12:13 PM, Andreas Klöckner wrote:
  On Dienstag 09 Juni 2009, Adlar Kim wrote:
  Hi,
 
  I'm getting the following error when I try to build pycuda-0.93rc.
  Could anybody help? Thanks.
 
  Did 0.93rc4 work for you? If so, that would

Re: [PyCUDA] Porting nvidia's separable convolution example to pycuda: C++ templates, loop unrolling

2009-06-15 Thread Andreas Klöckner

On Montag 15 Juni 2009, you wrote:
 I've attached a slightly
 cleaned up, standalone version with NVIDIA's copyright notice
 restored.

I'm assuming you meant for this to end up in PyCUDA's examples folder. That's 
where it is now, in any case. :) Let me know if that wasn't the intention.

Thanks for the contribution,
Andreas



Re: [PyCUDA] Buggy install on Mac OS X

2009-06-17 Thread Andreas Klöckner

On Donnerstag 18 Juni 2009, ashleigh baumgardner wrote:
 I did add the lib location to my .profile and that didn't fix all of my
 problems but it put me on the right track. At this point PyCUDA does
 compile, by which I mean make install does not give any errors.  But when
 I try to run the test programs it hangs.  It does not respond to control C
 so I think the problem is in a system call possibly in the driver code
 itself. Has anybody else experienced this? Is there anything I can check on
 or look for to see what is causing it?  I am using python2.5 and boost as
 installed by Mac ports on a MacBook OSX version 10.5.

What are you running when you get the hang? test_driver.py? In addition, 
PyCUDA has CUDA API tracing, enabled by CUDA_TRACE = True in your siteconf.py 
and recompiling (rm -Rf build; python setup.py install). This may help 
pinpoint what's at fault.

Andreas



Re: [PyCUDA] PyCUDA on Ubuntu 8.10 Intrepid

2009-06-18 Thread Andreas Klöckner

On Donnerstag 18 Juni 2009, Marcel Krause wrote:
 Hello Andreas, hello list,

   Nope--CUDA_ROOT's not a list. :)

 thanks, that did it. In my first try, I got

   /usr/bin/ld: cannot find -llibboost_python-gcc43-mt-1_39

 Then I removed the prefix lib from BOOST_PYTHON_LIBNAME and
 BOOST_THREAD_LIBNAME and tried again. Now there run many lines,

 and then:
   /usr/lib/python2.5/site-packages/pycuda-0.93rc4-py2.5-linux-
   i686.egg/pycuda/tools.py.BACKUP.23020.py, line 317
   HEAD:pycuda/tools.py
   ^
   SyntaxError: invalid syntax

 which seems like an error, but it finishes with

Those are leftovers from a (failed?) git merge, it appears. While I think it 
did get copied to your install directory, it's entirely harmless.

 which sounds a bit like success.

It is. :)

 Also, since the second try, I am unable to reproduce the first
 message about /usr/bin/ld.

The one that you fixed by removing lib? Or which other message are you 
referring to?

 Also, the setup script could warn me about the lib prefix on
 platforms where it is unusual to use it. (Maybe it is unusual
 on all platforms - I don't know.)

Don't know of one where it is usual. (Anybody?) Would you mind writing a patch 
to setup.py to do so?
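
As a starting point for such a patch, here is a minimal sketch of the check. 
The function name and the warning text are made up for illustration, and how 
it would be wired into setup.py is left open; it only flags library names 
that begin with "lib", since the linker adds that prefix itself when given 
-l<name>.

```python
import warnings


def warn_on_lib_prefix(option_name, lib_names):
    """Warn about library names that start with 'lib'; return the offenders."""
    offenders = [name for name in lib_names if name.startswith("lib")]
    for name in offenders:
        warnings.warn(
            "%s entry %r starts with 'lib'; the linker adds that prefix "
            "itself, so try %r instead" % (option_name, name, name[3:]))
    return offenders


# The value from the report above triggers the warning ...
warn_on_lib_prefix("BOOST_PYTHON_LIBNAME", ["libboost_python-gcc43-mt-1_39"])
# ... and the corrected value does not.
warn_on_lib_prefix("BOOST_PYTHON_LIBNAME", ["boost_python-gcc43-mt-1_39"])
```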

Andreas



Re: [PyCUDA] PyCUDA on Ubuntu 8.10 Intrepid

2009-06-19 Thread Andreas Klöckner

On Freitag 19 Juni 2009, Marcel Krause wrote:
 Hello Andreas, hello list,

  Also, since the second try, I am unable to reproduce the first
  message about /usr/bin/ld.
 
  The one that you fixed by removing lib? Or which other message are
  you referring to?

 Yes, that one. I'll try and write a lib prefix warning patch.

Cool.

   ImportError: /usr/lib/python2.5/site-packages/pycuda-0.93rc4-py2.5-
   linux-i686.egg/pycuda/_driver.so: undefined symbol:
   _ZNK5boost6python6detail17exception_handlerclERKNS_
   9function0IvSaINS_13function_base

 The lines above ImportError: are the same as before.
 I ran setup.py install again, didn't change anything.
 I deleted the .egg directory mentioned in the ImportError and ran
 setup.py install (I tried to force it to recompile everything),
 still the same error. :(

http://wiki.tiker.net/PyCuda/FrequentlyAskedQuestions#How_do_I_make_PyCUDA_rebuild_itself_from_scratch.3F
http://wiki.tiker.net/PyCuda/FrequentlyAskedQuestions#When_I_run_my_first_code.2C_I_get_an_ImportError.3F

:) [I just added those]

(or, if the URLs got mangled, http://is.gd/16hfI and http://is.gd/16hb4.)

Andreas



Re: [PyCUDA] casting arguments to memset to unsigned int ?

2009-07-01 Thread Andreas Klöckner

On Dienstag 30 Juni 2009, Michael Rule wrote:
 Ok, fixed that ( needed to use uint not uint32. Numpy basic types
 don't seem to be anywhere near the surface of Google and are hard to
 find ). I now have another related question :

 the documentation for the pycuda prepare function states
 setting up the argument types as arg_types. arg_types is expected to
 be an iterable containing type characters understood by the struct
 module or numpy.dtype objects.

 Since I am still baffled by numpy datatypes ( which definitely don't
 seem to include pointers), how do I tell CUDA that this is a pointer
 to a float and this is a pointer to an integer using this function
 ?

numpy.intp = pointer-sized integer. There's no checking of the pointed-to type 
as yet.
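
A sketch of what this means in practice: the prepare documentation quoted 
above also accepts struct-module type characters, and in the struct module 
"P" stands for a native pointer-sized value, i.e. the same size as 
numpy.intp. The kernel signature and the addresses below are invented for 
illustration, and only the standard library is used, so nothing here touches 
the GPU.

```python
import ctypes
import struct

# "P" is the struct-module type character for a native pointer-sized value.
POINTER_SIZE = struct.calcsize("P")
print(POINTER_SIZE == ctypes.sizeof(ctypes.c_void_p))

# Hypothetical kernel signature: (float *dest, int *src, int n).
# Both pointers use the same code, because the pointed-to type is not checked.
arg_types = "PPi"


def pack_args(dest_ptr, src_ptr, n):
    """Pack two addresses and an int as described by arg_types."""
    return struct.pack(arg_types, dest_ptr, src_ptr, n)


packed = pack_args(0x1000, 0x2000, 16)
print(len(packed) == struct.calcsize(arg_types))
```

With numpy dtypes, the equivalent argument list would presumably be 
[numpy.intp, numpy.intp, numpy.int32].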

Would you mind adding FAQ items for these two?
http://wiki.tiker.net/PyCuda/FrequentlyAskedQuestions

Andreas



Re: [PyCUDA] [PyCuda] pyCUDA Windows Installation

2009-07-02 Thread Andreas Klöckner

On Donnerstag 02 Juli 2009, faberx wrote:
 Dear all! You can install pycuda on windows xp! Please look at:
 http://wiki.tiker.net/PyCuda/Installation/Windows

Thanks for writing this up!

Andreas




Re: [PyCUDA] casting arguments to memset to unsigned int ?

2009-07-03 Thread Andreas Klöckner

On Mittwoch 01 Juli 2009, Andreas Klöckner wrote:
 Would you mind adding FAQ items for these two?
 http://wiki.tiker.net/PyCuda/FrequentlyAskedQuestions

Thanks for writing the FAQ item! FYI--I've slightly reworked and expanded it.

http://is.gd/1mMDg

Andreas




Re: [PyCUDA] NVIDIA CUDA Visual Profiler?

2009-07-31 Thread Andreas Klöckner

On Freitag 31 Juli 2009, Ahmed Fasih wrote:
 Hi, I'm very surprised that google isn't turning up something about
 this topic because I thought it's been previously discussed, so my
 apologies if it has.

 I'm trying the NVIDIA CUDA Visual Profiler (v 2.2.05) in Windows XP
 with a fairly recent PyCUDA git, on CUDA 2.2
 (pycuda.driver.get_driver_version() returns 2020).

 I provide the Visual Profiler with a Windows batch file that calls
 python my_pycuda_script.py -some -flags, but the Visual Profiler
 (after running the script 4 times) only reports two methods,
 memcopy. All other counters are zero (so it doesn't display them in
 the table). Manipulating the counters enabled doesn't change this.

 Any assistance would be much appreciated. My application runs only
 ~10% faster on a Tesla C1060 than a G80 Quadro (despite having twice
 as many MPs) so I'm hoping the profiler will help me understand why.

On Linux, I've had good success with just using the profiler from the command 
line:

http://webapp.dam.brown.edu/wiki/SciComp/CudaProfiling

Every one of my attempts to achieve the same thing using the visual profiler 
has ended in tears so far. I'm not sure if the command line way of doing 
things works in Windows, but I'd imagine so.
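
For reference, a rough sketch of the command line approach, written so that 
it should work the same way from a Windows batch setup. It assumes the 
profiler of this CUDA generation is switched on through the CUDA_PROFILE and 
CUDA_PROFILE_LOG environment variables; please check the variable names 
against the profiler documentation shipped with your toolkit. profiling_env 
is a made-up helper name.

```python
import os


def profiling_env(log_path, base_env=None):
    """Return a copy of the environment with command line profiling enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_PROFILE"] = "1"           # assumption: turns the profiler on
    env["CUDA_PROFILE_LOG"] = log_path  # assumption: where the log is written
    return env


# Usage (not executed here), with the script name from the question above:
#   import subprocess, sys
#   subprocess.call([sys.executable, "my_pycuda_script.py"],
#                   env=profiling_env("cuda_profile.log"))
```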

Once you figure out what's up, please add an FAQ entry!

Andreas



Re: [PyCUDA] a way to probe what globals, especially constant arrays and texture refs are defined in a kernel?

2009-08-12 Thread Andreas Klöckner

On Mittwoch 12 August 2009, Andrew Wagner wrote:
 Hi-

 Is there a way to get a list of the global variables, especially
 constant arrays and texture refs that are defined in a kernel?

 I'm generating a pycuda.driver.Module from a template, and the storage
 of various kernel inputs depends on the template parameters.

 It would be convenient for code using a kernel generated this way to
 have some way of figuring out what global variables are defined in the
 kernel, and whether they are globals, constants, or texrefs.

 Maybe in a future version of pycuda it would be nice to replace (or
 provide an alternative to) the accessor functions:

 pycuda.driver.Module.get_global
 pycuda.driver.Module.get_function
 pycuda.driver.Module.get_texref

 that consists of having member variables:

 pycuda.driver.Module.globals
 pycuda.driver.Module.functions
 pycuda.driver.Module.constants
 pycuda.driver.Module.texrefs

 that are already initialized to dictionaries with the name of the
 variable as the key, and the handle  (or maybe a (handle, size) tuple)
 as the value.

 or maybe have a single member variable pycuda.driver.Module.globals
 that is a dictionary with variable names as keys, and a (type, handle,
 size) tuple or something similar.

 If I at least have the name of the variable I think I can deduce if
 the variable is defined as a __constant__ array  by wrapping
 pycuda.driver.Module.get_global in a try: statement, but that's rather
 un-pythonic

 Or perhaps I'm misunderstanding something and the Module.get_*
 functions are forced on us by the CUDA  API?

Sorry, no way to do that by the CUDA API. One could potentially parse the 
CUBIN file, but that's rather fragile and not something that PyCUDA engages 
in, so far.
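
If the candidate names are known in advance, the try-based workaround 
mentioned above can at least be wrapped up once. The following is only a 
sketch: probe_globals is a hypothetical helper, FakeModule stands in for 
pycuda.driver.Module so the example runs without a GPU, and it assumes that 
get_global raises some exception for a name the module does not define.

```python
def probe_globals(module, candidate_names):
    """Return {name: get_global(name)} for the candidates the module defines."""
    found = {}
    for name in candidate_names:
        try:
            found[name] = module.get_global(name)
        except Exception:  # assumption: a missing name raises an exception
            continue
    return found


class FakeModule(object):
    """Stand-in for pycuda.driver.Module, for illustration only."""

    def __init__(self, table):
        self._table = table

    def get_global(self, name):
        if name not in self._table:
            raise LookupError("not found: %s" % name)
        return self._table[name]  # (address, size in bytes), made-up values


mod = FakeModule({"coeffs": (4096, 64)})
print(probe_globals(mod, ["coeffs", "lookup_table"]))
```

This cannot discover names, and it cannot tell a __constant__ array from an 
ordinary global; it only reports which of the given names resolve.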

Andreas



Re: [PyCUDA] [PyCuda] Installation wiki updated with Windows Vista 64 bit install with Visual Studio 2008

2009-08-12 Thread Andreas Klöckner

On Mittwoch 12 August 2009, wtftc wrote:
 I've updated the wiki with my configuration of Windows Vista 64 bit with
 Visual Studio 2008. To use it, your entire python stack must also be 64
 bit. The build was x64, also known as amd64.

Thanks!

 I also have a question: is it possible to statically compile all embedded
 kernels in my code with pycuda? Deploying a program with pycuda widely is hard
 because it requires the cuda and c++ build tools, which are heavy. It would
 be nice to have an option to generate a library at build time that could
 then be packaged and installed without having to do the heavy c lifting.

That's very possible--this would amount to preseeding the PyCUDA compiler 
cache. I'd certainly merge a patch that implements this. The PyCUDA compiler 
cache logic is ~200 lines, so this should be easy to add.

Andreas



Re: [PyCUDA] non-egg install broken with 0.93

2009-08-12 Thread Andreas Klöckner

On Mittwoch 12 August 2009, Ryan May wrote:
 Hi,

 I was trying to install pycuda today from source (in advance of the Scipy
 tutorial!), and have noticed a problem if I don't use eggs to install.  If
 I use:

 python setup.py install --single-version-externally-managed --root=/

 I get the pycuda header file installed under usr/include/cuda.  This breaks
 the logic in compiler._find_pycuda_include_path().

 I personally avoid eggs, so this creates a problem.  Also, many linux
 package managers (including Gentoo, my own distro) avoid eggs, and I know
 for a fact that Gentoo uses this same method to install packages.  I've
 hacked my own compiler.py to work, but I'm not sure what a good solution
 really would be.  Gentoo has 0.92 packaged, but I don't think the header
 was used in that version and thus didn't present any problems.

I've committed a (somewhat hacky) fix: Automatically check /usr and /usr/local 
on Linux. Let me know if that works for you. 
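
The idea behind the fix, as a sketch rather than the committed code: walk a 
list of candidate directories and take the first one that contains the 
header. The function name, the header file name, and the exact candidate 
directories below are assumptions made for illustration.

```python
import os


def find_header_dir(candidate_dirs, header_name):
    """Return the first directory that contains header_name, or None."""
    for directory in candidate_dirs:
        if os.path.exists(os.path.join(directory, header_name)):
            return directory
    return None


# Hypothetical candidates derived from the report above (header installed
# under usr/include/cuda) plus the /usr/local counterpart.
candidates = ["/usr/include/cuda", "/usr/local/include/cuda"]
print(find_header_dir(candidates, "pycuda-helpers.hpp"))  # name is assumed
```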

Andreas



Re: [PyCUDA] Installing PyCuda 0.93 on CentOs 5.3

2009-08-17 Thread Andreas Klöckner

On Montag 17 August 2009, Christian Quaia wrote:
 Hi.
  I've been trying to install pycuda on my centos 5.3 box, but I haven't had
 much success. I managed to install boost (1.39) as per instructions, but
 when I build pycuda I get the following error:

 building '_driver' extension
 gcc -pthread -fno-strict-aliasing -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
 -mtune=generic -D_GNU_SOURCE -fPIC -O3 -DNDEBUG -fPIC -Isrc/cpp
 -I/usr/include/boost-1_39 -I/usr/local/cuda/include
 -I/usr/lib64/python2.4/site-packages/numpy/core/include
 -I/usr/include/python2.4 -c src/wrapper/wrap_cudadrv.cpp -o
 build/temp.linux-x86_64-2.4/src/wrapper/wrap_cudadrv.o

 src/wrapper/wrap_cudadrv.cpp: In function
 ‘intunnamed::function_get_lmem(const cuda::function)’:
 src/wrapper/wrap_cudadrv.cpp:165: error: ‘PyErr_WarnEx’ was not
 declared in this scope
 ...
 error: command 'gcc' failed with exit status 1


 I looked around for a solution, but I couldn't find any. I'm using
 Python 2.4.3,
 and my default gcc version is 4.1.2 (although I had to compile boost
 using gcc43)

Try the git version; PyErr_WarnEx is not referenced there any more.

Andreas



Re: [PyCUDA] Installing PyCuda 0.93 on CentOs 5.3

2009-08-17 Thread Andreas Klöckner

That says that you're linking against the boost you built with gcc 4.3--
rebuild boost with 4.1; that should get you a step further.

Andreas

On Dienstag 18 August 2009, Christian Quaia wrote:
 Thanks Andreas.

 Sorry, I should have tried that before... Now the build and install
 work. However, when I run the tests I get another error:

 Traceback (most recent call last):
   File ./test/test_driver.py, line 475, in ?
     import pycuda.autoinit
   File /usr/lib64/python2.4/site-packages/pycuda-0.94beta-py2.4-linux-x86_64.egg/pycuda/autoinit.py, line 1, in ?
     import pycuda.driver as cuda
   File /usr/lib64/python2.4/site-packages/pycuda-0.94beta-py2.4-linux-x86_64.egg/pycuda/driver.py, line 1, in ?
     from _driver import *
 ImportError: /usr/lib64/libboost_python-gcc43-mt-1_39.so.1.39.0: undefined symbol: _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l


 Thanks again for your help,
 Christian


 On Mon, Aug 17, 2009 at 5:41 PM, Andreas Klöckner li...@informa.tiker.net wrote:
  On Montag 17 August 2009, Christian Quaia wrote:
  Thanks Andreas.
 I got the git version, and things went a bit better, but I still have
  a problem.
 
  When I build pycuda now I get this error:
 
  /usr/include/boost-1_39/boost/type_traits/detail/cv_traits_impl.hpp:37:
  internal compiler error: in make_rtl_for_nonlocal_decl, at
  cp/decl.c:5067
 
 
  This is due to a bug in gcc 4.1.2, which was fixed in later versions.
  For this very reason I
  had to compile boost using gcc43, which is also installed on my
  machine in parallel to gcc.
  Is there a simple way, like for boost, to force pycuda to be built
  using gcc43 as a compiler?
 
  Not easily, as Python should be built with a matching compiler.
 
  But see this here for gcc 4.1 help:
  http://is.gd/2lCd7
 
  Andreas
 




Re: [PyCUDA] problem with pycuda._driver.pyd

2009-09-17 Thread Andreas Klöckner

On Donnerstag 17 September 2009, mailboxalpha wrote:
  I looked for DLL files used in _driver.pyd and one of them is named
  nvcuda.dll.  There is no such file on my machine.  Perhaps that is the DLL
  file that could not be found. The required boost dlls have been copied to
  windows\system32 directory and the boost lib  directory has been added to
  the system path.

Can you please check which DLL the CUDA examples require? There will be one 
that has the runtime interface, which will likely be called something with 
cudart. You don't care about that one. Instead, that DLL in turn requires 
the driver interface, and that's the DLL name we need.

Andreas



[PyCUDA] Nvidia GTC: PyCUDA talk, Meetup

2009-09-23 Thread Andreas Klöckner

Hi all,

If you are attending Nvidia's GPU Technology Conference next week, there are 
two things I'd like to point out:

- I'll be giving a talk about PyCUDA on Friday, October 2 at 2pm, where I'll 
both introduce PyCUDA and talk about some exciting new developments. The talk 
will be 50 minutes in length, and I'd be happy to see you there.

- Also, I'd like to propose a PyCUDA meetup on Thursday, October 1 at noon 
(i.e. lunchtime). I'll be hanging out by the Terrace seminar room around that 
time. I'm looking forward to meeting some of you in person.

See you next week,
Andreas



Re: [PyCUDA] Nvidia GTC: PyCUDA talk, Meetup

2009-09-23 Thread Andreas Klöckner

On Mittwoch 23 September 2009, Andreas Klöckner wrote:
 Hi all,
 
 If you are attending Nvidia's GPU Technology Conference next week, there
  are two things I'd like to point out:
 
 - I'll be giving a talk about PyCUDA on Friday, October 2 at 2pm, where
  I'll both introduce PyCUDA and talk about some exciting new developments.
  The talk will be 50 minutes in length, and I'd be happy to see you there.

Whoops--that's wrong. I just realized the talk is at 1pm on Friday. Sorry for 
the noise.

 - Also, I'd like to propose a PyCUDA meetup on Thursday, October 1 at noon.
 (ie. lunchtime) I'll be hanging out by the Terrace seminar room around
  that time. I'm looking forward to meeting some of you in person.

Andreas



Re: [PyCUDA] 2d scattered data gridding

2009-10-06 Thread Andreas Klöckner

Hi Roberto,

On Freitag 02 Oktober 2009, Roberto Vidmar wrote:
   I wonder if it is possible to use PyCUDA to grid on a two dimensional
 regular grid xyz scattered data. Our datasets are usually quite large
 (some millions of points) .
 
 Many thanks for any help in this topic.

As usual, if something is possible with CUDA in general, it's also possible 
with PyCUDA. In this specific case, I'm not sure what you mean by gridding--
making a grid-based histogram, binning, or perhaps something entirely 
different? Nonetheless, it seems likely that what you want can be (and likely 
has been) done with CUDA.

Andreas



[PyCUDA] On Python 2.6.3 / distribute / setuptools

2009-10-15 Thread Andreas Klöckner

Hi all,

--
This is relevant to you if you are using Python 2.6.3 and you are getting 
errors of the sort:

/usr/local/lib/python2.6/dist-packages/setuptools-0.6c9-
py2.6.egg/setuptools/command/build_ext.py,
line 85, in get_ext_filename
KeyError: '_cl'
--

It seems Python 2.6.3 broke every C/C++ extension on the planet that was 
shipped using setuptools (which includes PyCUDA, PyOpenCL, and many more of my 
packages.) Thanks for your patience as I've worked through this mess, and to 
both Allan and Christine, I'm sorry you've had to deal with this, and thanks 
to Allan for pointing me in the right direction.

To make a long story short, I've switched my packages (including PyCUDA) to 
use distribute instead of setuptools. All these changes are now in git. I'm 
not sure this will help if a 2.6.3 user already has setuptools installed, but 
I hope it will at least not make any other case worse. All in all, this seems 
like the least bad option given that I expect distribute to be the way of the 
future.

Before I unleash this change full-scale, I would like it to get some testing. 
For this purpose, I've created a PyCUDA release candidate package, here:

http://pypi.python.org/pypi/pycuda/0.93.1rc1

PLEASE TEST THIS, and speak up if you do--both positive and negative comments 
are much appreciated.

Andreas

PS: Once I have reasonable confirmation that this works for PyCUDA, I'll also 
release updated versions of PyOpenCL, meshpy, boostmpi, pyublas, etc. The 
relevant changes are *already in git* if you'd like to try them now.

PPS: Deciding in favor of distribute and against the promised setuptools 
update was based on two factors:
- Primarily, distribute makes a fix for the 2.6.3 issue available *now*.
- Secondarily, I personally disliked the behavior of PJE, the author of 
setuptools, in response to the current mess.



Re: [PyCUDA] [Hedge] On Python 2.6.3 / distribute / setuptools

2009-10-15 Thread Andreas Klöckner

On Donnerstag 15 Oktober 2009, Andreas Klöckner wrote:
 Hi all,
 
 --
 This is relevant to you if you are using Python 2.6.3 and you are getting
 errors of the sort:
 
 /usr/local/lib/python2.6/dist-packages/setuptools-0.6c9-
 py2.6.egg/setuptools/command/build_ext.py,
 line 85, in get_ext_filename
 KeyError: '_cl'
 --

A quick addition: If you are already encountering this error, you need to 
*remove* setuptools before the fix will work for you.

That means that if you do "import setuptools" on the Python shell and it 
succeeds, type "setuptools.__file__" to see where it is installed and get rid 
of it, then start over. (After the fix has worked, it will say something with 
"distribute" in the path for setuptools.__file__. That's fine.)
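
The same check can be scripted. This is just a convenience sketch; 
module_location is a made-up helper name, and it only reports where a module 
would be imported from, it does not remove anything.

```python
import importlib


def module_location(name):
    """Return the file a module is imported from, or None if it is absent."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__file__", None)


location = module_location("setuptools")
if location is None:
    print("setuptools is not importable")
elif "distribute" in location:
    print("setuptools is provided by distribute: %s" % location)
else:
    print("plain setuptools found at: %s" % location)
```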

Andreas



Re: [PyCUDA] [Hedge] On Python 2.6.3 / distribute / setuptools

2009-10-15 Thread Andreas Klöckner

On Thursday 15 October 2009, Darcoux Christine wrote:
 Hi Andreas
 
  0.93.1rc1 seems to work for me, except that I had to download a
  distribute_setup.py file. Could you include that file in the source
  tarball? I think this is the usual way to work with Distribute, since
  the documentation says:

Whoops. Good point. This should be fixed in 0.93.1rc2, here:
http://pypi.python.org/pypi/pycuda/0.93.1rc2

  I ran the examples/demo.py with success, but the transpose demo
  crashed. I assume this is not related to the use of Distribute, but
  here is the trace :
 
   File "transpose.py", line 205, in <module>
     run_benchmark()
   File "transpose.py", line 165, in run_benchmark
     target = gpuarray.empty((size, size), dtype=source.dtype)
   File "/usr/lib/python2.6/site-packages/pycuda/gpuarray.py", line 81, in __init__
     self.gpudata = self.allocator(self.size * self.dtype.itemsize)
 pycuda._driver.MemoryError: cuMemAlloc failed: out of memory

Well, I'm not sure transpose.py adapts to the amount of memory you have. It 
works on my ~900M card. If you'd like to prepare a fix, I'd certainly merge 
it.
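
One way such a fix might look, as a rough sketch only: derive the matrix size from the free device memory instead of hard-coding it. max_square_size and all of its parameters are hypothetical names, not taken from transpose.py. The free byte count would come from something like pycuda.driver.mem_get_info(), which is deliberately not called here so that the snippet runs without a GPU:

```python
import math


def max_square_size(free_bytes, itemsize, n_arrays=2, multiple=16,
                    safety_percent=80):
    """Largest side length (a multiple of `multiple`) such that `n_arrays`
    square arrays fit into `safety_percent` percent of `free_bytes`."""
    budget_bytes = free_bytes * safety_percent // 100
    elements_per_array = budget_bytes // (n_arrays * itemsize)
    side = math.isqrt(elements_per_array)
    return side - side % multiple


# Example: roughly 900 MiB free, float32 source and target arrays.
print(max_square_size(900 * 1024**2, 4))
```

The safety margin is an assumption; the driver and the display may hold memory that a simple byte count does not reveal.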

Andreas



Re: [PyCUDA] Has anyone made a working parallel prefix sum/scan with pycuda?

2009-10-20 Thread Andreas Klöckner

On Monday 19 October 2009, Michael Rule wrote:
 I'm convinced that I need a prefix scan that gives me access to the
 resultant prefix scanned array.
 
 so, for example, using addition, I would like a function that takes :
 1 1 1 1 1 1 1 1 1 1
 to
 0 1 2 3 4 5 6 7 8 9
 
 It seems like this data should be generated as an intermediate step in
 executing a ReductionKernel. I have not been able to figure out how this
 data is accessed by browsing the GPUArray documentation. Am I missing
 something obvious?

Parallel Prefix Scan is presently not implemented in PyCUDA. While reduction 
is related, the scan is actually a somewhat different animal. PScan would be a 
most welcome addition to PyCUDA, however. Mark Harris has written a good 
introduction on how to implement it:

http://is.gd/4rXq0

If you decide to follow Mark's guide, almost half your work is already done 
for you--reduction occurs as part of the prefix scan, so you'll be able to 
recycle a fair bit of code.
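
To make the relationship between reduction and scan concrete, here is a sequential, CPU-only sketch of the work-efficient scheme usually attributed to Blelloch. I am assuming this is the approach the guide describes; exclusive_scan is a hypothetical name and nothing here is PyCUDA code. On a GPU, each inner loop would be one parallel step:

```python
def exclusive_scan(values):
    """Exclusive prefix sum via an up-sweep (reduction) and a down-sweep."""
    n = 1
    while n < len(values):
        n *= 2
    data = list(values) + [0] * (n - len(values))  # pad to a power of two

    # Up-sweep: the ordinary reduction tree; data[n - 1] ends up as the total.
    stride = 1
    while stride < n:
        for i in range(2 * stride - 1, n, 2 * stride):
            data[i] += data[i - stride]
        stride *= 2

    # Down-sweep: clear the root, then push partial sums back down the tree.
    data[n - 1] = 0
    stride = n // 2
    while stride >= 1:
        for i in range(2 * stride - 1, n, 2 * stride):
            left = data[i - stride]
            data[i - stride] = data[i]
            data[i] += left
        stride //= 2

    return data[:len(values)]


print(exclusive_scan([1] * 10))
```

The up-sweep is the part that can be recycled from an existing reduction; the down-sweep is the new work.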

HTH,
Andreas



Re: [PyCUDA] Pitch Linear Memory Textures

2009-10-26 Thread Andreas Klöckner

On Monday 26 October 2009, Robert Manning wrote:
 PyCUDA users,
I've been trying to find how to use pitch 2D linear memory textures
 in pyCUDA and have been unsuccessful.  I've seen C code that uses
 cudaBindTexture2D and similar functions but it is not accessible (to
 my knowledge) by pyCUDA.  Any suggestions?
 
 Thanks,
 Bob Manning

See test_2d_texture() in test/test_driver.py.

HTH,
Andreas





Re: [PyCUDA] pycuda patch for 'flat' eggs

2009-11-04 Thread Andreas Klöckner

Hi Maarten,

On Tuesday 03 November 2009, you wrote:
 i've been using your pycuda package to play with, and I really like
 it! much more productive than compiling etc..
 I have pycuda installed with --single-version-externally-managed and a
 different prefix. This causes pycuda not to find the header files.
 I've attached the diff and new compiler.py file to fix this.

Merged in release-0.93 and master.

Thanks for the patch,
Andreas

PS: Please direct stuff like this to the mailing list next time.



Re: [PyCUDA] Install PyCUDA on opensuse 11.1(x86_64)

2009-11-13 Thread Andreas Klöckner

On Thursday 12 November 2009, Jonghwan Rhee wrote:
 Hi there,
 
 I have tried to install pycuda on opensuse 11.1. However, when I did
 build, the following error occurred.

What version of Boost do you have, how and where was it installed?

Andreas



Re: [PyCUDA] Install PyCUDA on opensuse 11.1(x86_64)

2009-11-13 Thread Andreas Klöckner

Check out virtualenv.

Andreas

PS: *Please* try to keep the replies on-list. Thanks.

On Friday 13 November 2009, you wrote:
 Hi Andreas,
 
 Thanks for your help. It worked well. But another problem occurred when I
  do make install as follows.
 
 ctags -R src || true
 /usr/bin/python setup.py install
 Extracting in /tmp/tmpKkbJcX
 Now working in /tmp/tmpKkbJcX/distribute-0.6.4
 Building a Distribute egg in /home/jrhee/pycuda
 /home/jrhee/pycuda/setuptools-0.6c9-py2.6.egg-info already exists
 /home/jrhee/pool/include/boost-1_36 /boost/  python .hpp
 *** Cannot find Boost headers. Checked locations:
/home/jrhee/pool/include/boost-1_36/boost/python.hpp
 /home/jrhee/pool/lib / lib boost_python .so
 /home/jrhee/pool/lib / lib boost_python .dylib
 /home/jrhee/pool/lib / lib boost_python .lib
 /home/jrhee/pool/lib /  boost_python .so
 /home/jrhee/pool/lib /  boost_python .dylib
 /home/jrhee/pool/lib /  boost_python .lib
 *** Cannot find Boost Python library. Checked locations:
/home/jrhee/pool/lib/libboost_python.so
/home/jrhee/pool/lib/libboost_python.dylib
/home/jrhee/pool/lib/libboost_python.lib
/home/jrhee/pool/lib/boost_python.so
/home/jrhee/pool/lib/boost_python.dylib
/home/jrhee/pool/lib/boost_python.lib
 /home/jrhee/pool/lib / lib boost_thread .so
 /home/jrhee/pool/lib / lib boost_thread .dylib
 /home/jrhee/pool/lib / lib boost_thread .lib
 /usr/local/cuda /bin/  nvcc
 /usr/local/cuda/include /  cuda .h
 /usr/local/cuda/lib / lib cuda .so
 /usr/local/cuda/lib / lib cuda .dylib
 /usr/local/cuda/lib / lib cuda .lib
 /usr/local/cuda/lib /  cuda .so
 /usr/local/cuda/lib /  cuda .dylib
 /usr/local/cuda/lib /  cuda .lib
 /usr/local/cuda/lib /  cuda .so
 /usr/local/cuda/lib /  cuda .dylib
 /usr/local/cuda/lib /  cuda .lib
 /usr/local/cuda/lib /  cuda .so
 /usr/local/cuda/lib /  cuda .dylib
 /usr/local/cuda/lib /  cuda .lib
 *** Cannot find CUDA driver library. Checked locations:
/usr/local/cuda/lib/libcuda.so
/usr/local/cuda/lib/libcuda.dylib
/usr/local/cuda/lib/libcuda.lib
/usr/local/cuda/lib/cuda.so
/usr/local/cuda/lib/cuda.dylib
/usr/local/cuda/lib/cuda.lib
/usr/local/cuda/lib/cuda.so
/usr/local/cuda/lib/cuda.dylib
/usr/local/cuda/lib/cuda.lib
/usr/local/cuda/lib/cuda.so
/usr/local/cuda/lib/cuda.dylib
/usr/local/cuda/lib/cuda.lib
 *** Note that this may not be a problem as this component is often
  installed system-wide.
 running install
 Checking .pth file support in /usr/local/lib64/python2.6/site-packages/
 error: can't create or remove files in install directory
 
 The following error occurred while trying to add or remove files in the
 installation directory:
 
 [Errno 2] No such file or directory:
 '/usr/local/lib64/python2.6/site-packages/test-easy-install-24814.pth'
 
 The installation directory you specified (via --install-dir, --prefix, or
 the distutils default setting) was:
 
 /usr/local/lib64/python2.6/site-packages/
 
 This directory does not currently exist.  Please create it and try again,
  or choose a different installation directory (using the -d or
  --install-dir option).
 
 make: *** [install] Error 1
 
 
 Jong
 
 On Sat, Nov 14, 2009 at 12:51 PM, Andreas Klöckner
 li...@informa.tiker.net wrote:
  On Friday 13 November 2009, Jonghwan Rhee wrote:
   Hi Andreas,
  
   Its version is boost 1.36 and it was installed at /usr/include/boost/
   by YaST package repositories.
 
  According to http://packages.opensuse-community.org, it appears that
  opensuse
  uses non-standard names for the Boost libraries. Stick
 
  BOOST_PYTHON_LIBNAME=boost_python
  BOOST_THREAD_LIBNAME=boost_thread
 
  into your siteconf.py. That should get you a step further.
 
  HTH,
  Andreas
 
  PS: Please take care to keep replies on the mailing list.
 

Re: [PyCUDA] autotuning

2009-11-20 Thread Andreas Klöckner

On Friday 20 November 2009, James Bergstra wrote:
 Now that we're taking more advantage of PyCUDA's and CodePy's ability
 to generate really precise special-case code... I'm finding that we
 wind up with a lot of ambiguities about *which* generator should
 handle a given special case.  The right choice for a particular input
 structure is platform-dependent--a function of cache sizes, access
 latencies, transfer bandwidth, register counts, number of processors,
 etc, etc.  The wrong choice can carry a big performance penalty.
 
 FFTW and ATLAS get around this by self-tuning algorithms, which I
 don't understand in detail, but which generally work by trying a lot
 of generators on a lot of special cases, and then using the database
 of timings to make good choices quickly at runtime.

What has worked well for me is to try a big bunch of kernels right before 
their intended use and cache which one was fast for this special case only. 
The main delay is the compilation of all these kernels, the trial runs are all 
very quick, thanks to the GPU. There's just enough caching at each level to 
make this efficient.
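
In outline, that try-and-cache approach might look like the sketch below. pick_fastest, its arguments and the cache layout are hypothetical, not PyCUDA API. For real kernels each candidate call would also need a device synchronization before the clock is read, which is omitted here so the snippet runs on the CPU:

```python
import time


def pick_fastest(candidates, args, cache, key, timer=time.perf_counter):
    """Return the fastest callable in `candidates` for `args`.

    The winner is stored in `cache` under `key`, so the trial runs happen
    only once per special case.
    """
    if key in cache:
        return cache[key]
    best, best_time = None, float("inf")
    for func in candidates:
        start = timer()
        func(*args)
        elapsed = timer() - start
        if elapsed < best_time:
            best, best_time = func, elapsed
    cache[key] = best
    return best


def variant_a(n):
    return sum(range(n))


def variant_b(n):
    return n * (n - 1) // 2


cache = {}
winner = pick_fastest([variant_a, variant_b], (100000,), cache, key="n=100000")
print(winner.__name__)
```

A single trial per candidate is the simplest possible policy; repeating each trial a few times would make the choice less noisy.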

 It seems like this automatic-tuning is even more important for GPU
 implementations than for CPU ones.  

That certainly echoes one claim from the PyCUDA article. :) 

 Are there libraries to help with this?

First of all, since it's a thorny (and unsolved) problem, PyCUDA doesn't try 
to get involved in it. Support it--yes, involved--no. That said, I'm not aware 
of libraries that make autotuning significantly easier. Nicolas mentioned that 
he's eyeing some machine learning techniques like the ones in Milepost gcc. 
Nicolas, care to comment? Aside from that, Cray's grouped, attributed 
orthogonal search [1] sounds useful.

[1] http://iwapt.org/2009/slides/Adrian_Tate.pdf

Andreas



Re: [PyCUDA] trouble with pycuda-0.93.1rc2 - test_driver.py, etc.

2009-11-20 Thread Andreas Klöckner

On Friday 20 November 2009, Janet Jacobsen wrote:
 Hi, Andreas.  I ran
 
 ldd /usr/common/usg/python/2.6.4/lib/python2.6/site-packages
 /pycuda-0.93.1rc2-py2.6-linux-x86_64.egg/pycuda/_driver.so
 
 and got
 
  libboost_python.so.1.40.0 => /usr/common/usg/boost/1_40_0/pool/lib/libboost_python.so.1.40.0 (0x2b259cac6000)
  libboost_thread.so.1.40.0 => /usr/common/usg/boost/1_40_0/pool/lib/libboost_thread.so.1.40.0 (0x2b259cd15000)
  libcuda.so.1 => /usr/lib64/libcuda.so.1 (0x2b259cf43000)
  libstdc++.so.6 => /usr/common/usg/gcc/4.4.2/lib64/libstdc++.so.6 (0x2b259d3df000)
  libm.so.6 => /lib64/libm.so.6 (0x2b259d6eb000)
  libgcc_s.so.1 => /usr/common/usg/gcc/4.4.2/lib64/libgcc_s.so.1 (0x2b259d96e000)
  libpthread.so.0 => /lib64/libpthread.so.0 (0x2b259db85000)
  libc.so.6 => /lib64/libc.so.6 (0x2b259dda)
  libutil.so.1 => /lib64/libutil.so.1 (0x2b259e0f6000)
  libdl.so.2 => /lib64/libdl.so.2 (0x2b259e2fa000)
  librt.so.1 => /lib64/librt.so.1 (0x2b259e4fe000)
  libz.so.1 => /usr/lib64/libz.so.1 (0x2b259e707000)
  /lib64/ld-linux-x86-64.so.2 (0x00316ec0)
 
 Does this help?

Hmm, yes and no. I'm starting to believe that Boost built itself thinking that 
you have a UCS4 Python, while your actual build is UCS2. To confirm that 
latter point, run the equivalent of

objdump -T /users/kloeckner/mach/x86_64/pool/lib/libpython2.6.so.1.0|grep UCS

That should tell you what the UCS'iness of your custom Python is. Then run

objdump -T /usr/common/usg/boost/1_40_0/pool/lib/libboost_python.so.1.40.0 | 
grep UCS

to establish Boost's UCS'iness. As I said, I'm suspecting that the two might 
disagree. You might want to try that against your system Python 2.4, too. 
Maybe Boost cleverly found that one and picked it up. In any case, the switch 
to look for is Py_UNICODE_SIZE in pyconfig.h.
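
The interpreter can also report its own setting. A minimal sketch: on Python 2 builds like the ones discussed here, sys.maxunicode is 0xFFFF for a UCS2 (narrow) build and 0x10FFFF for a UCS4 (wide) build. unicode_width_bytes is a hypothetical helper name, and on Python 3.3 or later the answer is always the wide value:

```python
import sys


def unicode_width_bytes(maxunicode=sys.maxunicode):
    """Map sys.maxunicode to bytes per code unit: 2 for UCS2, 4 for UCS4."""
    return 2 if maxunicode == 0xFFFF else 4


print(unicode_width_bytes())
```

This only covers the Python side; Boost's view still has to be read off its exported symbols as described above.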

 P.S. Sorry if this should be off-list, but email sent to
 li...@monster.tiker.net is returned to me.

On-list is the right place IMO--this creates a searchable record of problems 
and solutions. Thanks for asking though. (Btw: where'd you get that email 
address? While tiker.net is my domain, I don't recall using or having created 
that address.)

HTH,
Andreas



Re: [PyCUDA] Build Problems on SuSE 11.0, x86_64 - cannot find -llibboost_python-mt

2009-11-21 Thread Andreas Klöckner

On Saturday 21 November 2009, Wolfgang Rosner wrote:
 Hello, Andreas Klöckner,
 
 I can't get pycuda to build on my box.
 I tried at least 10 times, try to reproduce details now.
 In the archive, there was a similar thread with a SuSE 11.1 64 bit box ,
 posted by Jonghwan Rhee
 http://article.gmane.org/gmane.comp.python.cuda/955
 but that did not really provide a solution.
 
 cannot find -llibboost_python-mt
 although the file is in place

First, all that configure.py does is edit siteconf.py--no need to rerun it 
once siteconf.py is in place.

Second, -lsomething implicitly looks for libsomething.so. No need to specify 
the lib prefix.

HTH,
Andreas



Re: [PyCUDA] Build Problems on SuSE 11.0, x86_64 - cannot find -llibboost_python-mt

2009-11-21 Thread Andreas Klöckner

On Saturday 21 November 2009, Wolfgang Rosner wrote:
 OK, for me it works now, but people might be even more (and earlier) happy
  if the pytools issue had been mentioned in the setup wiki.

Pytools should be installed automatically along with 'python setup.py 
install'. If it didn't: do you have any idea why?

 If you like and give me an wiki account, I'd go to share my experience.

No account required. (though you can create one yourself) Please do share.

Andreas




Re: [PyCUDA] Build Problems on SuSE 11.0, now Pytools not found

2009-11-21 Thread Andreas Klöckner

On Saturday 21 November 2009, Wolfgang Rosner wrote:
  Pytools should be installed automatically along with 'python setup.py
  install'. If it didn't: do you have any idea why?
 
 not sure.
 
 could it be that I ran make install
 instead of python setup.py install ?

make install invokes python setup.py install. That shouldn't have been it.

 (sorry, I'm just getting used to Python; I preferred Perl in an earlier life)
 
 first I thought it was due to the different python path structures.
 
 standard on SuSE is
 /usr/lib64/python2.5/site-packages/
 
 but the egg-laying machine seems to put stuff to
 /usr/local/lib64/python2.5/site-packages
 instead
 (maybe I could have reconfigured this, anyway)

Weird. Curious about the reasoning behind this.

 However, gl_interop.py did not run until I did
 export PYTHONPATH=/usr/local/lib64/python2.5/site-packages/
 (was PYTHONPATH= before)
 
 maybe this is since there is still the old python-opengl-2.0.1.09-224.1
 /usr/lib64/python2.5/site-packages/OpenGL/GL/ARB/
  ...with-no-vertex-buffer-in-there in the way which is caught before.
 
 But to figure it out I'm definitely lacking sufficient python experience.

There is an easy trick to find out what file path actually gets imported:

>>> import pytools
>>> pytools.__file__
'/home/andreas/research/software/pytools/pytools/__init__.py'

 hm, might give it a try.
 I think the best I could offer would be to prepare a separate SuSE page
 with my experience.

Sure--just add a subpage under
http://wiki.tiker.net/PyCuda/Installation/Linux
(like the one for Ubuntu)

 It all comes down to different ways and places where stuff is stored.
 But I think my approach is not the best one; in hindsight it would have been
  better to configure new stuff so that it meets SuSE structure. Maybe.
 Well, but this might break other dependencies?
 Smells like big 'Baustelle'...
 
 So if your expectation of quality on your wiki is not too high, I'll post
  my experience there.

That's the whole point of a Wiki: information of questionable quality that 
people improve as they use it. It's a knowledge retention tool.

Andreas



Re: [PyCUDA] Install issue with Ubuntu 9.10

2009-11-22 Thread Andreas Klöckner

On Sunday 22 November 2009, you wrote:
 Now this error:
 
 python test_driver.py
 Traceback (most recent call last):
   File "test_driver.py", line 481, in <module>
     from py.test.cmdline import main
 ImportError: No module named cmdline

Again, py.test should have been installed automatically for you.

easy_install -U py

should do that for you. (Also see http://is.gd/4VBoW)

Andreas

PS: Please keep your replies on-list.



Re: [PyCUDA] CUDA 3.0 64-bit host code

2009-11-23 Thread Andreas Klöckner

Hey Bryan,

On Monday 23 November 2009, Bryan Catanzaro wrote:
 I built 64-bit versions of Boost and PyCUDA on Mac OS X Snow Leopard, as
  well as the 64-bit Python interpreter supplied by Apple, as well as the
  CUDA 3.0 beta.  Everything built fine, but when I ran pycuda.autoinit, I
  got an interesting CUDA error, which PyCUDA reported as "pointer is
  64-bit". I'm wondering - is it impossible to use a 64-bit host program
  with a 32-bit GPU program under CUDA 3.0?

First, I'm not sure I fully understand what's going on. You can indeed compile 
GPU code to match a 32-bit ABI on a 64-bit machine (nvcc --machine 32 ...). Is 
that what you're doing? If so, why? (Normally, nvcc will default to your 
host's ABI. By and large, this changes struct alignment rules and pointer 
widths.)

If you're not doing anything special to get 32-bit GPU code, then your GPU 
code should end up matching your host ABI. Or maybe nvcc draws the wrong 
conclusions or is a fat binary or something and we need to actually specify 
the --machine flag.
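
For comparison, the pointer width of the host interpreter itself can be checked from the standard library. host_pointer_bits is a hypothetical helper name; the result says nothing about what nvcc actually produced, only what the Python process uses:

```python
import struct


def host_pointer_bits():
    """Width in bits of a native pointer ('P') in the running interpreter."""
    return struct.calcsize("P") * 8


print(host_pointer_bits())
```

If this prints 64 while the GPU code was built with --machine 32, the two sides disagree on pointer widths.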

I also remember wondering what the error message referred to when I added it. 
I'm totally not sure. Which routine throws it?

Andreas



Re: [PyCUDA] bad argument to internal function

2009-11-25 Thread Andreas Klöckner

On Wednesday 25 November 2009, Ken Seehart wrote:
 Something like this came up for someone else in May:
 http://www.mail-archive.com/pycuda@tiker.net/msg00361.html
 
 *SystemError: ../Objects/longobject.c:336: bad argument to internal
 function*
   buf = struct.pack(format, *arg_data)
 
 I fixed it by hacking *_add_functionality* in *driver.py*.

As before when Bryan reported the bug, I can't seem to reproduce it. 
(Likewise, I can't reproduce the issue described in the corresponding numpy 
ticket [1].)

[1] http://projects.scipy.org/numpy/ticket/1110

What architecture are you running on? What version of Python? What version of 
numpy? (Can't reproduce on x86_64+2.5.4+1.3.0 and x86_64+2.6.1+{1.2.1 and 
1.3.0}.)

In principle, I'm not opposed to merging this fix, but I'd like some more 
information first.

Bryan: are you still encountering this? Any further information?

Andreas



Re: [PyCUDA] Question about gpuarray

2009-12-04 Thread Andreas Klöckner

On Friday 04 December 2009, Bryan Catanzaro wrote:
 Thanks for the explanation.  In that case, do you have objections to
  removing the assertion that if a GPUArray is created and given a
  preexisting buffer, that the new array must be a view of another array? 
  In my situation, I don't think this assertion is true:  I would like to
  transfer ownership of a gpu buffer (created outside of PyCUDA by some host
  code) to a particular GPUArray.  This means I instantiate a GPUArray with
  gpudata=the pointer created by the host code, but base should still be
  None, since this new GPUArray is not a view of any other array, and so
  this GPUArray should have sole ownership of the buffer being given at
  initialization.

If I understand you correctly, then whatever you assign to .gpudata already 
establishes the lifetime dependency, right? In that case, yes, the assert 
should go away.

Andreas

PS: Please keep the PyCUDA list cc'ed, unless there's good reason not to.



Re: [PyCUDA] Runtime problem: cuDeviceGetAttribute failed: not found

2009-12-08 Thread Andreas Klöckner

On Tuesday 08 December 2009, you wrote:
 It's the first time I've installed the drivers, so I don't have multiple
 versions. Anyway how can I know the versions of the headers used in the
 compilation?

pycuda.driver.get_version()
pycuda.driver.get_driver_version()
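
A small helper for reading the second number, as a sketch under one assumption: that the driver version comes back in the integer encoding the CUDA headers use (1000 * major + 10 * minor, so 2030 means 2.3). decode_cuda_version is a hypothetical name, not part of PyCUDA:

```python
def decode_cuda_version(encoded):
    """Split a CUDA-style integer version (1000*major + 10*minor) into a tuple."""
    return (encoded // 1000, (encoded % 1000) // 10)


print(decode_cuda_version(2030))
```

Comparing the decoded driver version with the version of the headers used at build time shows whether the two match.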

Andreas



Re: [PyCUDA] cuModuleGetFunction failed: not found

2009-12-14 Thread Andreas Klöckner

On Monday 14 December 2009, Robert Cloud wrote:
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 pycuda._driver.LogicError: cuModuleGetFunction failed: not found

The problem is that nvcc compiles code as C++ by default, which means it uses 
name mangling [1].

If you don't want to use PyCUDA's just-in-time compilation facilities [2], 
then just add an 'extern "C"' to your declarations.
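
As an illustration of what that wrapping looks like when the source is kept in a Python string: wrap_extern_c and the doublify kernel are hypothetical examples, and the snippet only builds the text without compiling anything:

```python
def wrap_extern_c(source):
    """Wrap CUDA C source so the compiler emits unmangled (C linkage) names."""
    return 'extern "C" {\n' + source + "\n}\n"


kernel_source = """
__global__ void doublify(float *a)
{
    a[threadIdx.x] *= 2;
}
"""

print(wrap_extern_c(kernel_source))
```

With C linkage, the symbol in the compiled module is plain doublify, which is the name a lookup by function name expects.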

Andreas

[1] http://en.wikipedia.org/wiki/Name_mangling
[2] http://documen.tician.de/pycuda/driver.html#module-pycuda.compiler



Re: [PyCUDA] Passing struct into kernel

2009-12-18 Thread Andreas Klöckner

On Friday 18 December 2009, Dan Piponi wrote:
 I have no problem making a struct in global memory and passing in a
 pointer to it. But arguments to kernels get stored in faster memory
 than global memory don't they?

Right. You can pack a struct into a string using Python's struct module, and 
since strings support the buffer protocol, they can be passed to PyCUDA 
routines.

HTH,
Andreas



Re: [PyCUDA] Passing struct into kernel

2009-12-19 Thread Andreas Klöckner

On Friday 18 December 2009, Dan Piponi wrote:
 go.prepare("s", ...)   # tell PyCuda we're passing in a string buffer
 go.prepared_call(grid, struct.pack("i", 12345))   # pack integer into string buffer

struct_typestr = "i"
sz = struct.calcsize(struct_typestr)
go.prepare("%ds" % sz, ...)   # tell PyCuda we're passing in a string buffer
go.prepared_call(grid, struct.pack(struct_typestr, 12345))

codepy.cgen.GenerableStruct can help you generate type strings and C source 
code from a single source and will also help you make packed instances.
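
The packing side can be tried without a GPU. A minimal sketch using only the standard struct module; the two-field layout (an int followed by a float) is a made-up example, and whether the device compiler lays the struct out identically should be checked rather than assumed:

```python
import struct

# Hypothetical layout: struct { int n; float scale; } with native alignment.
struct_typestr = "if"
size = struct.calcsize(struct_typestr)

# Pack one instance into a bytes/string buffer of exactly `size` bytes.
packed = struct.pack(struct_typestr, 12345, 1.5)

# Format string of the kind passed to prepare() in the lines above.
arg_format = "%ds" % size

print(size, arg_format, len(packed))
```

Using calcsize rather than a hand-counted size keeps the argument format in step with the type string when fields are added.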

HTH,
Andreas



Re: [PyCUDA] CompileError

2009-12-21 Thread Andreas Klöckner

On Monday 21 December 2009, Ewan Maxwell wrote:
 test_driver.py works (all 16 tests), and the example code I mentioned works too.
 However, other tests still fail for the same reason (nvcc fatal~)

That's strange. If all of them have the same path, why would some fail and 
some succeed?

Andreas



Re: [PyCUDA] PyCUDA Digest, Vol 18, Issue 12

2009-12-21 Thread Andreas Klöckner

On Monday 21 December 2009, oli...@olivernowak.com wrote:
 like i said.
 
 a week.

Can you comment on what made this difficult? Can we do anything to make this 
easier, e.g. by catching common errors and providing more helpful messages?

Andreas



Re: [PyCUDA] PyCuda installation instructions for Gentoo Linux

2009-12-22 Thread Andreas Klöckner

On Tuesday 22 December 2009, Justin Riley wrote:
 Hi All,
 
 I've added a page for installing PyCuda on Gentoo Linux to the PyCuda wiki.

Sweet, thanks!

Andreas



Re: [PyCUDA] pycuda.driver.Context.synchronize() delay time is a function of the count and kind of sram accesses?

2010-01-02 Thread Andreas Klöckner

Couple points:

* bytewise smem write may be slow?
* sync before and after timed operation, otherwise you time who knows what
* or, even better, use events.
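
The second point, sketched in a GPU-free form: timed is a hypothetical helper, and its synchronize argument stands for whatever blocks until the device is idle (pycuda.driver.Context.synchronize in the program below). The stand-in operation and no-op synchronize only exist so the snippet runs anywhere:

```python
import time


def timed(operation, synchronize, timer=time.perf_counter):
    """Time `operation`, synchronizing before and after the measurement."""
    synchronize()      # drain previously queued work so it is not billed here
    start = timer()
    operation()        # e.g. a kernel launch, which may return immediately
    synchronize()      # wait until the timed work has actually finished
    return timer() - start


elapsed = timed(lambda: sum(range(100000)), lambda: None)
print(elapsed >= 0.0)
```

Without the first synchronize, the measurement also includes whatever was still queued from earlier launches.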

HTH,
Andreas


On Saturday 02 January 2010, Hampton G. Miller wrote:
 I have noticed something which seems odd and which I hope you will look at
 and then tell me if it is something unique to PyCUDA or else is something
 which should be brought to the attention of Nvidia.  (Or, that I am just a
 simpleton!)
 
 Looking at my test results, below, and referring to my attached Python
 program with comments, it seems to me that the amount of time taken by
 pycuda.driver.Context.synchronize() is strongly a function of the count and
 type of sram accesses.  This seems odd to me.  Do you agree?
 
 For example, it takes over 13 seconds to sync after doing nothing more than
 writing zeros to (almost) all of the sram bytes for a 512x512 grid!
 
 Regards, Hampton
 
 
 PyCUDA 0.93 running on Mint 7 Linux
 
 Using device GeForce 9800 GT
       gridDim_x  gridDim_y blockDim_x blockDim_y blockDim_z         A         B         C         D         E         F         G         H
   0:          1          1          1          1          1  0.001050  0.000120  0.000442      0.72  0.000257      0.69      0.68      0.69
   1:          1          1        512          1          1  0.000828      0.72  0.000441      0.73  0.000257      0.70      0.69      0.69
   2:          1        100        512          1          1  0.007309  0.000167  0.003026  0.000106  0.001546      0.72      0.72      0.72
   3:        100          1        512          1          1  0.005985      0.77  0.003016      0.71  0.001543      0.73      0.72      0.71
   4:        100        100        512          1          1  0.526857  0.000303  0.263423  0.000302  0.131828  0.000304  0.000311  0.000210
   5:          1        256        512          1          1  0.014104  0.000167  0.007073      0.75  0.003572      0.76      0.76      0.73
   6:        256          1        512          1          1  0.014087      0.81  0.007069      0.77  0.003570      0.93      0.77      0.73
   7:        256        256        512          1          1  3.447902  0.001038  1.724391  0.001039  0.862664  0.001041  0.001586  0.000957
   8:          1        512        512          1          1  0.027301      0.61  0.013667      0.46  0.006857      0.45      0.50      0.44
   9:        512          1        512          1          1  0.027314  0.000125  0.013669      0.47  0.006855      0.45      0.49      0.44
  10:        512        512        512          1          1 13.789054  0.003796  6.896283  0.003800  3.449923  0.003794  0.006229  0.003898
 31.298553 secs total
 
 #!/usr/bin/env python
 
 # nvidia_example.py -
 
 import sys
 import os
 import time
 import numpy
 
 import pycuda.autoinit
 import pycuda.driver as cuda
 from   pycuda.compiler import SourceModule
 
 
 gridDim_x  = 1
 gridDim_y  = 1
 
 blockDim_x = 1
 blockDim_y = 1
 blockDim_z = 1
 
 gridBlockList = [
 (1,  1,   1,1,1), (  1,1, 512,1,1),
 (1,100, 512,1,1), (100,1, 512,1,1), (100,100, 512,1,1),
 (1,256, 512,1,1), (256,1, 512,1,1), (256,256, 512,1,1),
 (1,512, 512,1,1), (512,1, 512,1,1), (512,512, 512,1,1) ]
 
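 # ======================================================================
 
 cuda.init()
 device = pycuda.tools.get_default_device()
 print "Using device", device.name()
 
 dev_dataRecords = cuda.mem_alloc( 1024 * 15 )
 
 # ----------------------------------------------------------------------
 
 krnl = SourceModule("""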
 __global__ void worker_0 ( char * src )
 {
 __shared__ char dst[ (1024 * 15) ];
 int i;
 
 if ( threadIdx.x == 0 )
 {// Case A:
 for( i=0; i<sizeof(dst); ++i )// Count = sizeof(dst)
 dst[ i ] = 0;// Type = indexed by i
 
 dst[ 0 ] = dst[ 1 ];// (Gag "set but never used" warning message from compiler)
 };
 }
 
 __global__ void worker_1 ( char * src )
 {
 __shared__ char dst[ (1024 * 15) ];
 int i;
 
 if ( threadIdx.x == 0 )
 {// Case B:
 for( i=0; i<sizeof(dst); ++i )// Count = sizeof(dst)
 dst[ 0 ] = 0;// Type = always the same element, 0
 
 dst[ 0 ] = dst[ 1 ];
 };
 }
 
 __global__ void worker_2 ( char * src )
 {
 __shared__ char dst[ (1024 * 15) ];
 int i;
 
 if ( threadIdx.x == 0 )
 {// Case C:
 for( i=0; i<(sizeof(dst)/2); ++i )// Count = sizeof(dst)/2
 dst[ i ] = 0;// Type = indexed by i
 
 dst[ 0 ] = dst[ 1 ];
 };
 }
 
 __global__ void worker_3 ( char *

Re: [PyCUDA] FFT code gets error only when Python exit

2010-01-06 Thread Andreas Klöckner

On Wednesday 06 January 2010, Ying Wai (Daniel) Fan wrote:
  Now in your situation there's a failure when reactivating the context to
  detach from it, probably because the runtime is meddling about. The only
  reason why cuCtxPushCurrent would throw an invalid value, is, IMO, if
  that context is already somewhere in the context stack. So it's likely
  that the runtime reactivated the context. In current git, a failure of
  the PushCurrent call should not cause a failure any more--it will print a
  warning instead.
 
 I believe different contexts can't share variables that are on the GPUs.

True.

 I can use GPUArray objects as arguments to my fft functions, and these
 objects still exist after fft. So I think fft is using the same context
 as pycuda.

Right. I take it that if runtime functions execute when a driver context
exists, they'll reuse that context.

 I made the change indicated in the attached diff file, such that
 context.synchronize() and context.detach() would print out the context
 stack size, and detach() would also print out whether current context is
 active. With this I verify that the stack size is 1 before and after
 running fft code and the context does not change.

I should clarify here. CUDA operates one context stack, and PyCUDA has
another one. CUDA's isn't sufficient because it will not let the same
context be activated twice. PyCUDA on the other hand needs exactly this
functionality, to ensure that cleanup can happen whenever it is needed.
Hence PyCUDA maintains its own context stack, and keeps CUDA's stack
at most one deep. You are looking at the PyCUDA stack.
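
A toy model of that arrangement, purely for illustration: ToyDriver and ToyContextStack are made-up classes that mimic the described behavior and are not PyCUDA's actual implementation. The driver stand-in refuses to hold the same context twice, while the outer stack allows it and keeps the driver stack at most one deep:

```python
class ToyDriver:
    """Stand-in for the CUDA stack: refuses to hold a context twice."""

    def __init__(self):
        self.stack = []

    def push(self, ctx):
        if ctx in self.stack:
            raise ValueError("invalid value: context already on the stack")
        self.stack.append(ctx)

    def pop(self):
        return self.stack.pop()


class ToyContextStack:
    """Outer stack that may contain the same context more than once."""

    def __init__(self, driver):
        self.driver = driver
        self.stack = []

    def push(self, ctx):
        if self.stack:
            self.driver.pop()      # deactivate the current context first
        self.driver.push(ctx)
        self.stack.append(ctx)

    def pop(self):
        ctx = self.stack.pop()
        self.driver.pop()
        if self.stack:
            self.driver.push(self.stack[-1])   # reactivate the previous one
        return ctx


driver = ToyDriver()
contexts = ToyContextStack(driver)
for name in ("c1", "c2", "c1"):
    contexts.push(name)
print(contexts.stack, driver.stack)
```

In this model the failure mode discussed here corresponds to driver.pop() silently leaving the context in place, so that the later driver.push() of the same context raises.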

 My guess is that CUFFT makes some change to the current context, such
 that once this context is popped, it is automatically destroyed.

I disagree. I think CUDA somehow lets us pop the context, does not
report an error, but also does not actually pop the context (since the
runtime is still talking to it). Then, when PyCUDA tries to push the
context back onto CUDA's stack to detach from it, that fails. I've filed
a bug report with Nvidia, we'll see what they say.

 If my
 guess is correct, then calling context.detach() would destroy the
 context, since its usage count drops to 0, and it could circumvent the
 warning message when the context destructor is called.

Pop only removes the context from the context stack (and hence
deactivates it), but retains a reference to it. It should not cause
anything to be destroyed.

 I don't want
 people using my package to see a warning message when Python exits, so I am
 not using autoinit in my package, but creating a context explicitly.

Sorry for this mess--I hope we can sort it out somehow.

 The following is kind of unrelated. I have done some experiments with
 contexts. I think context.pop() always pops the top context from the
 stack, regardless of whether that context is really at the top of the stack.
 E.g. I create two contexts c1 and c2 and then I can do c1.pop() twice
 without getting an error.

This points to a doc and behavior bug in PyCUDA. Context.pop() should
have been static. It effectively was, but not quite. Fixed in git.

 cuComplex.h has existed since CUDA 2.1 and it hasn't changed in subsequent
 versions. cuComplex.h is used by cufft.h and cublas.h. I can't find any
 documentation for it. A quick search on Google shows that JCuda seems to
 be using it.
 http://www.jcuda.org/jcuda/jcublas/doc/jcuda/jcublas/cuComplex.html

Hmm. A quick poke comes up with an error message:

8< --
kernel.cu(7): error: no operator * matches these operands
operand types are: cuComplex * cuComplex
8< --

Code attached. This might not be what we're looking for.
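
For reference, the arithmetic such an operator would have to perform is small. The following is a plain-Python sketch, assuming a complex value stored as a pair with a real field x and an imaginary field y; the names Pair and complex_mul are hypothetical and do not come from cuComplex.h.

```python
from collections import namedtuple

# Hypothetical stand-in for a struct with a real part x and imaginary part y.
Pair = namedtuple("Pair", ["x", "y"])


def complex_mul(a, b):
    """(a.x + i a.y) * (b.x + i b.y), written out component by component."""
    return Pair(a.x * b.x - a.y * b.y, a.x * b.y + a.y * b.x)


product = complex_mul(Pair(1.0, 2.0), Pair(3.0, 4.0))
reference = complex(1.0, 2.0) * complex(3.0, 4.0)
```

Here product and reference agree on both components, so a header that overloads * for the pair type would only need these two lines of arithmetic per multiplication.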

 Maybe we can simply use complex.h from GNU C library. A quick search on
 my Ubuntu machine locates the following files:
 /usr/include/complex.h, which includes
 /usr/include/c++/4.4/complex.h, which then includes
 /usr/include/c++/4.4/ccomplex, which in turn includes
 /usr/include/c++/4.4/complex, which includes overloading of operators
 for complex number.

Two words: Windows portability. :) Aside from that, this is unlikely to
work, as the system-wide complex header depends on I/O being available
and all kinds of other system-dependent funniness.

 Good luck on your PhD.

Likewise!

Andreas
import pycuda.driver as drv
import pycuda.tools
import pycuda.autoinit
import numpy
import numpy.linalg as la
from pycuda.compiler import SourceModule

mod = SourceModule("""
#include <cuComplex.h>
__global__ void multiply_them(cuComplex *dest, cuComplex *a, cuComplex *b)
{
  const int i = threadIdx.x;
  // nvcc rejects the next line: no operator * is defined for cuComplex
  dest[i] = a[i] * b[i];
}
""")

multiply_them = mod.get_function("multiply_them")

a = numpy.random.randn(400).astype(numpy.complex64)
b = numpy.random.randn(400).astype(numpy.complex64)

dest = numpy.zeros_like(a)
multiply_them(
        drv.Out(dest), drv.In(a), drv.In(b),
        block=(400,1,1))

print(dest-a*b)



Re: [PyCUDA] FFT code gets error only when Python exit

2010-01-07 Thread Andreas Klöckner

On Thursday 07 January 2010, Ying Wai (Daniel) Fan wrote:
 I changed cuComplex_mod.h a bit to force the use of complex.h. Looks
 like the route of using GNU C library does not work. Complex arithmetic
 operations are regarded as host functions by CUDA and host functions
 cannot be called from device. I got the following errors:

So I guess shipping the hacked STLport header is our only option then,
right? (Which is not bad--it's decent code.)

Andreas



Re: [PyCUDA] RuntimeError: could not find path to PyCUDA's C header files on Mac OS X Leopard, CUDA 2.3, pyCuda 0.94beta, python 2.5

2010-01-07 Thread Andreas Klöckner

On Thursday 07 January 2010, Ian Ozsvald wrote:
 I'm having a devil of a time getting pyCUDA to work on my MacBook and I
 can't get past this error:
 host47:examples ian$ python demo.py
 Traceback (most recent call last):
   File "demo.py", line 22, in <module>
     )
   File "/Library/Python/2.5/site-packages/pycuda/compiler.py", line 203, in __init__
     arch, code, cache_dir, include_dirs)
   File "/Library/Python/2.5/site-packages/pycuda/compiler.py", line 188, in compile
     include_dirs = include_dirs + [_find_pycuda_include_path()]
   File "/Library/Python/2.5/site-packages/pycuda/compiler.py", line 149, in _find_pycuda_include_path
     raise RuntimeError("could not find path to PyCUDA's C header files")
 RuntimeError: could not find path to PyCUDA's C header files
 
 Below I have version info and the full build process leading up to the
 error...Any pointers would be *hugely* appreciated.  If someone could
 explain what's happening here and what pyCUDA is looking for, that might
 point me in the right direction.
 
 Am I missing something silly?  I spent yesterday banging my head against
  the 'make' process until I found a spurious '=' in the ./configuration.py
  arguments (entirely my fault), maybe I've missed something silly here too?

See if you can find pycuda-helpers.hpp somewhere under
/Library/Python/2.5/site-packages/. We may need to adapt
_find_pycuda_include_path(). It's quite interesting to see where all this
stuff can end up...
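
If searching by hand is tedious, a few lines of Python will do it. This is just a sketch using the standard library, not what _find_pycuda_include_path() does internally; the helper name find_header is made up for this example.

```python
import os


def find_header(root, name="pycuda-helpers.hpp"):
    """Return every directory under root that contains a file called name."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            hits.append(dirpath)
    return sorted(hits)
```

Calling find_header("/Library/Python/2.5/site-packages") returns a list of candidate include directories, or an empty list if the header was not installed there at all.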

Andreas



Re: [PyCUDA] 0.94 woes

2010-01-07 Thread Andreas Klöckner

On Thursday 07 January 2010, Nicholas S-A wrote:
 Well, if somebody writes code that uses arch=sm_13 for some reason,
 then somebody who doesn't have a 200 series card tries to use it, then
 the error message which comes up is pretty cryptic. Just trying to make
 it more understandable. It is an error that they should not have made in
 the first place -- so perhaps it should be a LogicError?

Downgraded to a warning, merged.
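
For anyone following along, the idea is roughly the following. This is a sketch under assumptions, not the merged patch: parse_arch and check_arch are hypothetical names, and the device capability is passed in as a (major, minor) tuple rather than queried from a real device.

```python
import warnings


def parse_arch(arch):
    """Turn a string such as "sm_13" into a (major, minor) tuple."""
    digits = arch.split("_", 1)[1]
    return int(digits[:-1]), int(digits[-1])


def check_arch(arch, device_capability):
    """Warn (rather than raise) if arch exceeds what the device supports."""
    requested = parse_arch(arch)
    if requested > tuple(device_capability):
        warnings.warn(
            "requested arch %s, but device only supports compute capability %d.%d"
            % (arch, device_capability[0], device_capability[1]))
        return False
    return True
```

A warning keeps code written for arch=sm_13 from dying with a cryptic message on older hardware, while still telling the user what went wrong.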

Thanks for the patch,
Andreas

PS: Please do keep the list cc'd.


