Re: [PyCUDA] mac64 PyCUDA
Hi Art, Min, Bryan,

On Mon, 20 Sep 2010 15:25:24 -0700, Art wrote:
> Thanks for posting the fork. I used your modification to compiler.py (my
> original one was incorrect) and I built a 64-bit only version of pycuda and
> all tests under tests/ passed for the first time. I also was able to call
> cublas and cufft using something similar to parret [1].

Thanks very much for getting to the bottom of this pesky problem! (Or that's
at least what it seems like to me--right?) I've pulled both Min's and Bryan's
fixes into PyCUDA's git.

Thanks again,
Andreas

___
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
Re: [PyCUDA] mac64 PyCUDA
On Mon, Sep 20, 2010 at 2:08 PM, MinRK wrote:
> Okay, with a tiny tweak to compiler.compile, I have UB pycuda working in
> both 64- and 32-bit. All I did was tell compile() to add '-m64' to options
> if it detects 64-bit mode, in the same way as Bryan's trick.
>
> I pushed a branch with the patch to my GitHub:
> http://github.com/minrk/pycuda, as well as a siteconf that works for the
> build. There are still a few failures in test_gpuarray for 64-bit, but I
> don't know what causes them.
>
> On Mon, Sep 20, 2010 at 11:30, Art wrote:
>> On Mon, Sep 20, 2010 at 8:34 AM, Bryan Catanzaro <catan...@eecs.berkeley.edu> wrote:
>>> I think it should be changed to check to see if the Python interpreter is
>>> currently running in 32-bit mode, and then compile to match:
>>>
>>>     if 'darwin' in sys.platform and sys.maxint == 2147483647:
>>>         # The Python interpreter is running in 32 bit mode on OS X
>>>         if "-arch" not in conf["CXXFLAGS"]:
>>>             conf["CXXFLAGS"].extend(['-arch', 'i386', '-m32'])
>>>         if "-arch" not in conf["LDFLAGS"]:
>>>             conf["LDFLAGS"].extend(['-arch', 'i386', '-m32'])
>>>
>>> Some people (myself included) have to run Python in 32-bit mode on 64-bit
>>> OS X for various compatibility reasons (currently including libcudart.dylib,
>>> which is only shipped as a 32-bit library). Since the Python which Apple
>>> ships is compiled as a fat binary with both 32- and 64-bit versions, we can't
>>> know a priori what the right compiler flags are.
>>>
>>> - bryan
>>
>> I have driver version 3.1.14 and:
>>
>> $ file /usr/local/cuda/lib/libcudart.dylib
>> /usr/local/cuda/lib/libcudart.dylib: Mach-O universal binary with 2 architectures
>> /usr/local/cuda/lib/libcudart.dylib (for architecture x86_64): Mach-O 64-bit dynamically linked shared library x86_64
>> /usr/local/cuda/lib/libcudart.dylib (for architecture i386): Mach-O dynamically linked shared library i386
>>
>> Doesn't that mean it's UB?
>
> You are right, that's UB; I guess the change was made at 3.1, not 3.2. In
> 3.0, they introduced UB for libcuda, but left all other libraries as i386.
>
>> I can build and test cudamat [1] (which uses ctypes to call libcudart and
>> libcublas) fine in 64-bit MacPorts Python, though I haven't otherwise used
>> it. I had to make the following small change in its Makefile:
>>
>> nvcc -O -m 64 -L/usr/local/cuda/lib --ptxas-options=-v --compiler-options '-fPIC' -o libcudamat.so --shared cudamat.cu cudamat_kernels.cu -lcublas
>>
>> from:
>>
>> nvcc -O --ptxas-options=-v --compiler-options '-fPIC' -o libcudamat.so --shared cudamat.cu cudamat_kernels.cu -lcublas
>>
>> I built pycuda as 64-bit and changed pycuda/compiler.py to pass --machine 64
>> to nvcc and got examples/demo.py to run, but the other examples and tests
>> had failures and would eventually hang my machine. I don't know enough to
>> fix this myself, but I can try suggestions. Would using 3.2RC make a
>> difference?
>
> If the rest of the libraries are UB, as cudart is at 3.1, then 3.2 shouldn't
> make a difference from 3.1.x. I just knew that the switch was somewhere
> between 3.0 and 3.2, and it appears to have been at 3.1.
>
> -MinRK
>
>> [1] http://code.google.com/p/cudamat/
>>
>> cheers,
>> art

Thanks for posting the fork. I used your modification to compiler.py (my
original one was incorrect), and I built a 64-bit only version of pycuda; all
tests under tests/ passed for the first time. I also was able to call cublas
and cufft using something similar to parret [1].

This is the siteconf.py I used, hacked from someone's earlier efforts on this
list:

BOOST_INC_DIR = ['/opt/local/include']
BOOST_LIB_DIR = ['/opt/local/lib']
BOOST_COMPILER = 'gcc-mp-4.4'  # not sure
USE_SHIPPED_BOOST = False
BOOST_PYTHON_LIBNAME = ['boost_python-mt']
BOOST_THREAD_LIBNAME = ['boost_thread-mt']
CUDA_TRACE = False
CUDA_ENABLE_GL = False
CUDADRV_LIB_DIR = []
CUDADRV_LIBNAME = ['cuda']
CXXFLAGS = ['-arch', 'x86_64', '-m64', '-isysroot', '/Developer/SDKs/MacOSX10.6.sdk']
LDFLAGS = ['-arch', 'x86_64', '-m64', '-isysroot', '/Developer/SDKs/MacOSX10.6.sdk']

[1] http://www.mathcs.emory.edu/~yfan/PARRET/doc/index.html
Re: [PyCUDA] mac64 PyCUDA
Okay, with a tiny tweak to compiler.compile, I have UB pycuda working in both
64- and 32-bit. All I did was tell compile() to add '-m64' to options if it
detects 64-bit mode, in the same way as Bryan's trick.

I pushed a branch with the patch to my GitHub: http://github.com/minrk/pycuda,
as well as a siteconf that works for the build. There are still a few failures
in test_gpuarray for 64-bit, but I don't know what causes them.

On Mon, Sep 20, 2010 at 11:30, Art wrote:
> On Mon, Sep 20, 2010 at 8:34 AM, Bryan Catanzaro <catan...@eecs.berkeley.edu> wrote:
>> I think it should be changed to check to see if the Python interpreter is
>> currently running in 32-bit mode, and then compile to match:
>>
>>     if 'darwin' in sys.platform and sys.maxint == 2147483647:
>>         # The Python interpreter is running in 32 bit mode on OS X
>>         if "-arch" not in conf["CXXFLAGS"]:
>>             conf["CXXFLAGS"].extend(['-arch', 'i386', '-m32'])
>>         if "-arch" not in conf["LDFLAGS"]:
>>             conf["LDFLAGS"].extend(['-arch', 'i386', '-m32'])
>>
>> Some people (myself included) have to run Python in 32-bit mode on 64-bit
>> OS X for various compatibility reasons (currently including libcudart.dylib,
>> which is only shipped as a 32-bit library). Since the Python which Apple
>> ships is compiled as a fat binary with both 32- and 64-bit versions, we can't
>> know a priori what the right compiler flags are.
>>
>> - bryan
>
> I have driver version 3.1.14 and:
>
> $ file /usr/local/cuda/lib/libcudart.dylib
> /usr/local/cuda/lib/libcudart.dylib: Mach-O universal binary with 2 architectures
> /usr/local/cuda/lib/libcudart.dylib (for architecture x86_64): Mach-O 64-bit dynamically linked shared library x86_64
> /usr/local/cuda/lib/libcudart.dylib (for architecture i386): Mach-O dynamically linked shared library i386
>
> Doesn't that mean it's UB?

You are right, that's UB; I guess the change was made at 3.1, not 3.2. In 3.0,
they introduced UB for libcuda, but left all other libraries as i386.

> I can build and test cudamat [1] (which uses ctypes to call libcudart and
> libcublas) fine in 64-bit MacPorts Python, though I haven't otherwise used
> it. I had to make the following small change in its Makefile:
>
> nvcc -O -m 64 -L/usr/local/cuda/lib --ptxas-options=-v --compiler-options '-fPIC' -o libcudamat.so --shared cudamat.cu cudamat_kernels.cu -lcublas
>
> from:
>
> nvcc -O --ptxas-options=-v --compiler-options '-fPIC' -o libcudamat.so --shared cudamat.cu cudamat_kernels.cu -lcublas
>
> I built pycuda as 64-bit and changed pycuda/compiler.py to pass --machine 64
> to nvcc and got examples/demo.py to run, but the other examples and tests had
> failures and would eventually hang my machine. I don't know enough to fix
> this myself, but I can try suggestions. Would using 3.2RC make a difference?

If the rest of the libraries are UB, as cudart is at 3.1, then 3.2 shouldn't
make a difference from 3.1.x. I just knew that the switch was somewhere
between 3.0 and 3.2, and it appears to have been at 3.1.

-MinRK

> [1] http://code.google.com/p/cudamat/
>
> cheers,
> art
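[Editorial sketch] The tweak described above boils down to making nvcc's target bitness match the running interpreter's. A minimal illustration of that logic follows; the helper name `nvcc_arch_options` is made up here for clarity, and the real change lives inside `pycuda/compiler.py`, not in a standalone function.

```python
import struct

def nvcc_arch_options(options=None):
    """Return a copy of `options` with '-m64' or '-m32' appended so that
    nvcc targets the same bitness as the running Python interpreter.

    Hypothetical helper illustrating the tweak described above; this is
    not PyCUDA's actual API.
    """
    options = list(options or [])
    # The size of a C pointer reveals the interpreter's bitness:
    # 8 bytes -> 64-bit, 4 bytes -> 32-bit.
    bits = struct.calcsize("P") * 8
    flag = "-m64" if bits == 64 else "-m32"
    if flag not in options:
        options.append(flag)
    return options

# e.g. nvcc_arch_options(["-O2"]) yields ["-O2", "-m64"] on a 64-bit build
```

The point of keying on the interpreter rather than the OS is exactly Bryan's fat-binary observation: on OS X the same machine can run Python in either mode, so only a runtime check gives the right flag.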
Re: [PyCUDA] weird behavior with complex input to kernel
On Fri, 17 Sep 2010 14:54:11 -0400, Yiyin Zhou wrote:
> Hi,
> I was trying to pass some complex-valued numbers to a kernel, but somehow it
> messed up. Here is an example with GPUArray that can be reproduced on several
> of our Linux servers:
>
> (initialize...)
>
> import pycuda.gpuarray as gpuarray
> import numpy as np
> d_A = gpuarray.empty((1,128), np.complex64)
> d_A.fill(1+2j)
> d_A
>
> The result is correct.
>
> d_A.fill(np.complex64(1+2j))
> d_A
>
> The imaginary parts of the resulting array are all zeros.
>
> It's not necessarily a problem with complex64; in some kernels complex64 is
> correct, but complex128 is not. What could be the cause for that?

Not my fault: http://projects.scipy.org/numpy/ticket/1617 (just reported)

Andreas
Re: [PyCUDA] bug in __iadd__ and __isub__ in gpuarray
Hi Nicolas,

On Mon, 20 Sep 2010 18:21:28 +0200, Nicolas Barbey wrote:
> Thanks for this great package, which eases the use of CUDA so much!
> I think I found a small bug in the current git version of the code.
> If you run the following code in ipython:
>
> >>> from pycuda import autoinit
> >>> from pycuda import driver, compiler, gpuarray, tools
> >>> a = gpuarray.ones(16, dtype=float32)
> >>> a += 1

Thanks for the patch/report. Fixed in pycuda and pyopencl git. I didn't use
your patch, because 'return self.__iadd__(-other)' results in two kernel
invocations vs. one for the way sub is currently implemented.

Andreas
Re: [PyCUDA] mac64 PyCUDA
On Mon, Sep 20, 2010 at 8:34 AM, Bryan Catanzaro wrote:
> I think it should be changed to check to see if the Python interpreter is
> currently running in 32-bit mode, and then compile to match:
>
>     if 'darwin' in sys.platform and sys.maxint == 2147483647:
>         # The Python interpreter is running in 32 bit mode on OS X
>         if "-arch" not in conf["CXXFLAGS"]:
>             conf["CXXFLAGS"].extend(['-arch', 'i386', '-m32'])
>         if "-arch" not in conf["LDFLAGS"]:
>             conf["LDFLAGS"].extend(['-arch', 'i386', '-m32'])
>
> Some people (myself included) have to run Python in 32-bit mode on 64-bit
> OS X for various compatibility reasons (currently including libcudart.dylib,
> which is only shipped as a 32-bit library). Since the Python which Apple
> ships is compiled as a fat binary with both 32- and 64-bit versions, we can't
> know a priori what the right compiler flags are.
>
> - bryan

I have driver version 3.1.14 and:

$ file /usr/local/cuda/lib/libcudart.dylib
/usr/local/cuda/lib/libcudart.dylib: Mach-O universal binary with 2 architectures
/usr/local/cuda/lib/libcudart.dylib (for architecture x86_64): Mach-O 64-bit dynamically linked shared library x86_64
/usr/local/cuda/lib/libcudart.dylib (for architecture i386): Mach-O dynamically linked shared library i386

Doesn't that mean it's UB?

I can build and test cudamat [1] (which uses ctypes to call libcudart and
libcublas) fine in 64-bit MacPorts Python, though I haven't otherwise used it.
I had to make the following small change in its Makefile:

nvcc -O -m 64 -L/usr/local/cuda/lib --ptxas-options=-v --compiler-options '-fPIC' -o libcudamat.so --shared cudamat.cu cudamat_kernels.cu -lcublas

from:

nvcc -O --ptxas-options=-v --compiler-options '-fPIC' -o libcudamat.so --shared cudamat.cu cudamat_kernels.cu -lcublas

I built pycuda as 64-bit and changed pycuda/compiler.py to pass --machine 64
to nvcc and got examples/demo.py to run, but the other examples and tests had
failures and would eventually hang my machine. I don't know enough to fix this
myself, but I can try suggestions. Would using 3.2RC make a difference?

[1] http://code.google.com/p/cudamat/

cheers,
art
Re: [PyCUDA] SparseSolve.py example
Thanks Andreas - Yes, unfortunately the pattern does change (the code I run
grows and shrinks mountain glaciers), so I think I'm out of luck with GPU
processing here. I'll consider some other options. PyCUDA, nevertheless, is a
great package - thanks everyone for your hard work developing and maintaining
it.

Cheers,

-----Original Message-----
From: Andreas Kloeckner [mailto:li...@informa.tiker.net]
Sent: Sunday, September 19, 2010 6:07 PM
To: Brian Menounos; pycuda@tiker.net
Subject: RE: [PyCUDA] SparseSolve.py example

Hi Brian,

On Tue, 7 Sep 2010 19:56:43 +, Brian Menounos wrote:
> Hi Andreas - I realize you're pretty busy answering emails of late, so
> answer when you can...

Yeah, sorry. Pretty swamped ATM. I hope things will clear out a bit during
the fall semester, but so far I don't see when that would be happening...

Btw, I cc'd the list on my reply. Hope you don't mind. Please keep them in
the loop (for archival purposes) unless you're discussing something
confidential.

> I've attached your SparseSolve.py example, tweaked to deal with two pickled
> numpy arrays (1D and 2D), in order to try out pycuda's conjugate gradient
> (cg) function.
>
> I'm typically building sparse matrices and doing iterative cg calls as part
> of a numerical model for mountain glaciation. I was hoping to speed up the
> cg function within scipy by sending the task to my GPU. However, what is
> clear is that much time is spent assembling the packets (your
> PacketedSpMV() function) before execution of solve_pkt_with_cg().
>
> I need to execute cg for each time step of my model (typically 1-1.yr steps
> for 10,000 yr integration), and this is the part of the model where most
> time is spent. Any speedup here would be ideal.
>
> However, the performance is about 20 times slower than if run on a single
> CPU using scipy's cg function. I knew there would be some overhead for
> reading/writing to the GPU, but I wasn't expecting this much time in packet
> assembly. Am I wasting my time trying to do this on a GPU? I apologize in
> advance for my deficit in GPU/parallel coding!

Does the sparsity structure of the matrix change? If not, you could simply
scatter the new entries into the existing data structure, which would be
pretty fast (but would still require a little additional code on top of
what's there).

If your structure *does* change and can't be predicted/generalized over
somehow, then the present code is simply not for you. It spends a significant
amount of (CPU!) time building, partitioning and transferring, under the
assumption that this only happens during preprocessing. The actual CG and
matrix-vector products are tuned to be fast.

If you'd want to accommodate changing sparsity patterns, you'd have to
GPU-ify assembly, but I don't think even cusp [1] does that.

Andreas

[1] http://code.google.com/p/cusp-library/
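[Editorial sketch] The "scatter the new entries into the existing data structure" idea above hinges on a fixed sparsity pattern: the expensive build/partition/transfer step runs once, and each time step only overwrites numeric values. The toy CSR container below illustrates the idea on the CPU; it is purely illustrative and is not PyCUDA's PacketedSpMV API.

```python
from bisect import bisect_left

class FixedPatternCSR:
    """Toy CSR matrix whose sparsity pattern never changes after construction."""
    def __init__(self, indptr, indices, data):
        self.indptr = indptr      # row start offsets, len == nrows + 1
        self.indices = indices    # column index of each stored entry (sorted per row)
        self.data = data          # numeric value of each stored entry

    def _offset(self, row, col):
        # Binary-search the column within the row's slice of `indices`.
        start, end = self.indptr[row], self.indptr[row + 1]
        k = start + bisect_left(self.indices[start:end], col)
        if k == end or self.indices[k] != col:
            raise KeyError("(%d, %d) is not in the sparsity pattern" % (row, col))
        return k

    def scatter(self, row, col, value):
        """Overwrite one entry's value without touching the pattern."""
        self.data[self._offset(row, col)] = value

# The 2x2 matrix [[1, 2], [0, 3]] in CSR form:
A = FixedPatternCSR([0, 2, 3], [0, 1, 1], [1.0, 2.0, 3.0])
A.scatter(0, 1, 5.0)   # cheap per-time-step update; pattern untouched
```

Attempting to scatter into a position outside the pattern (here, entry (1, 0)) raises KeyError, which is exactly the case where the preprocessing would have to be redone.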
Re: [PyCUDA] mac64 PyCUDA
Bryan,

Note that in 3.2, all files in cuda/lib are UB, including cudart (finally!).

I build fat binaries on OS X (up-to-date 10.6.4, CUDA 3.2, etc.) simply by
setting the usual arch flags in siteconf.py:

CXXFLAGS = ["-arch", "x86_64", "-arch", "i386"]
LDFLAGS = ["-arch", "x86_64", "-arch", "i386"]

The build succeeds just fine, but most functions don't actually work, and I'm
not well versed enough in the details to figure it out. It seems that all the
memory/mempool-related tests pass, but the rest don't. Perhaps something
should be passed differently to nvcc when running 64-bit, as opposed to 32?

-MinRK

On Mon, Sep 20, 2010 at 08:34, Bryan Catanzaro wrote:
> I think it should be changed to check to see if the Python interpreter is
> currently running in 32-bit mode, and then compile to match:
>
>     if 'darwin' in sys.platform and sys.maxint == 2147483647:
>         # The Python interpreter is running in 32 bit mode on OS X
>         if "-arch" not in conf["CXXFLAGS"]:
>             conf["CXXFLAGS"].extend(['-arch', 'i386', '-m32'])
>         if "-arch" not in conf["LDFLAGS"]:
>             conf["LDFLAGS"].extend(['-arch', 'i386', '-m32'])
>
> Some people (myself included) have to run Python in 32-bit mode on 64-bit
> OS X for various compatibility reasons (currently including libcudart.dylib,
> which is only shipped as a 32-bit library). Since the Python which Apple
> ships is compiled as a fat binary with both 32- and 64-bit versions, we can't
> know a priori what the right compiler flags are.
>
> - bryan
>
> On Sep 19, 2010, at 5:55 PM, Andreas Kloeckner wrote:
>> On Thu, 16 Sep 2010 14:37:31 -0400, gerald wrong <psillymathh...@gmail.com> wrote:
>>> Looking at PyCUDA setup.py, I found this:
>>>
>>>     if 'darwin' in sys.platform:
>>>         # prevent from building ppc since cuda on OS X is not compiled for ppc
>>>         # also, default to 32-bit build, since there doesn't appear to be a
>>>         # 64-bit CUDA on Mac yet.
>>>         if "-arch" not in conf["CXXFLAGS"]:
>>>             conf["CXXFLAGS"].extend(['-arch', 'i386', '-m32'])
>>>         if "-arch" not in conf["LDFLAGS"]:
>>>             conf["LDFLAGS"].extend(['-arch', 'i386', '-m32'])
>>>
>>> Since 64-bit CUDA support on Mac OS X has been a reality since at least
>>> CUDA 3.1, this should be changed. I know there are other problems
>>> regarding 32/64 Python stopping 64-bit Mac users at the moment (including
>>> myself), but since changes are being made to the PyCUDA code to make it
>>> compatible with CUDA 3.2 anyway...
>>
>> It looks like this should just be killed wholesale--right? Or is there
>> anything more appropriate that it should be changed to?
>>
>> Andreas
[PyCUDA] bug in __iadd__ and __isub__ in gpuarray
Hello all,

Thanks for this great package, which eases the use of CUDA so much!

I think I found a small bug in the current git version of the code. If you
run the following code in ipython:

>>> from pycuda import autoinit
>>> from pycuda import driver, compiler, gpuarray, tools
>>> a = gpuarray.ones(16, dtype=float32)
>>> a += 1

you get:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/home/data/projets/pycuda_mult/ in ()

/usr/lib/python2.6/site-packages/pycuda-0.94rc-py2.6-linux-x86_64.egg/pycuda/gpuarray.py in __iadd__(self, other)
    258
    259     def __iadd__(self, other):
--> 260         return self._axpbyz(1, other, 1, self)
    261
    262     def __isub__(self, other):

/usr/lib/python2.6/site-packages/pycuda-0.94rc-py2.6-linux-x86_64.egg/pycuda/gpuarray.py in _axpbyz(self, selffac, other, otherfac, out, add_timer, stream)
    140         """Compute ``out = selffac * self + otherfac*other``,
    141         where `other` is a vector.."""
--> 142         assert self.shape == other.shape
    143
    144         func = elementwise.get_axpbyz_kernel(self.dtype, other.dtype, out.dtype)

AttributeError: 'int' object has no attribute 'shape'

The following patch should solve this issue. It also adds a ones function, as
exists in numpy, and defines __sub__ using __add__ to avoid code duplication.
Tell me what you think about those changes. Also, I do not know whether
patches should be submitted to the mailing list, but I could not find another
way.

Cheers,
Nicolas

patch follows:

diff --git a/pycuda/gpuarray.py b/pycuda/gpuarray.py
index a0a1da1..b685e52 100644
--- a/pycuda/gpuarray.py
+++ b/pycuda/gpuarray.py
@@ -235,17 +235,7 @@ class GPUArray(object):
 
     def __sub__(self, other):
         """Substract an array from an array or a scalar from an array."""
-
-        if isinstance(other, GPUArray):
-            result = self._new_like_me(_get_common_dtype(self, other))
-            return self._axpbyz(1, other, -1, result)
-        else:
-            if other == 0:
-                return self
-            else:
-                # create a new array for the result
-                result = self._new_like_me()
-                return self._axpbz(1, -other, result)
+        return self.__add__(-other)
 
     def __rsub__(self,other):
         """Substracts an array by a scalar or an array::
@@ -257,10 +247,18 @@ class GPUArray(object):
         return self._axpbz(-1, other, result)
 
     def __iadd__(self, other):
-        return self._axpbyz(1, other, 1, self)
+        if isinstance(other, GPUArray):
+            # add another vector
+            return self._axpbyz(1, other, 1, self)
+        else:
+            # add a scalar
+            if other == 0:
+                return self
+            else:
+                return self._axpbz(1, other, self)
 
     def __isub__(self, other):
-        return self._axpbyz(1, other, -1, self)
+        return self.__iadd__(-other)
 
     def __neg__(self):
         result = self._new_like_me()
@@ -631,6 +629,13 @@ def zeros(shape, dtype, allocator=drv.mem_alloc):
     result.fill(0)
     return result
 
+def ones(shape, dtype, allocator=drv.mem_alloc):
+    """Returns an array of the given shape and dtype filled with 1's."""
+
+    result = GPUArray(shape, dtype, allocator)
+    result.fill(1)
+    return result
+
 def empty_like(other_ary):
     result = GPUArray(
             other_ary.shape, other_ary.dtype, other_ary.allocator)
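[Editorial sketch] The essence of the `__iadd__` fix above is dispatching on the operand's type: vector-plus-vector goes through `_axpbyz`, vector-plus-scalar through `_axpbz`. The stripped-down CPU stand-in below (plain Python lists, no GPU) shows that dispatch; the `_axpbyz`/`_axpbz` names mirror PyCUDA's, but this is not the real GPUArray implementation.

```python
class ToyArray:
    """CPU stand-in mimicking GPUArray's in-place-add dispatch."""
    def __init__(self, values):
        self.values = list(values)
        self.shape = (len(values),)

    def _axpbyz(self, selffac, other, otherfac, out):
        # out = selffac*self + otherfac*other  (vector + vector)
        assert self.shape == other.shape
        out.values = [selffac * a + otherfac * b
                      for a, b in zip(self.values, other.values)]
        return out

    def _axpbz(self, a, b, out):
        # out = a*self + b  (vector + scalar)
        out.values = [a * v + b for v in self.values]
        return out

    def __iadd__(self, other):
        if isinstance(other, ToyArray):
            return self._axpbyz(1, other, 1, self)   # vector case
        if other == 0:
            return self                              # scalar 0: no-op
        return self._axpbz(1, other, self)           # scalar case

x = ToyArray([1.0] * 4)
x += 1                                   # scalar path: failed before the patch
x += ToyArray([1.0, 2.0, 3.0, 4.0])      # vector path: worked all along
# x.values is now [3.0, 4.0, 5.0, 6.0]
```

The unpatched `__iadd__` sent every operand through `_axpbyz`, so a bare integer hit `assert self.shape == other.shape` and raised the `'int' object has no attribute 'shape'` error reported above.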
Re: [PyCUDA] mac64 PyCUDA
I think it should be changed to check to see if the Python interpreter is
currently running in 32-bit mode, and then compile to match:

    if 'darwin' in sys.platform and sys.maxint == 2147483647:
        # The Python interpreter is running in 32 bit mode on OS X
        if "-arch" not in conf["CXXFLAGS"]:
            conf["CXXFLAGS"].extend(['-arch', 'i386', '-m32'])
        if "-arch" not in conf["LDFLAGS"]:
            conf["LDFLAGS"].extend(['-arch', 'i386', '-m32'])

Some people (myself included) have to run Python in 32-bit mode on 64-bit OS
X for various compatibility reasons (currently including libcudart.dylib,
which is only shipped as a 32-bit library). Since the Python which Apple
ships is compiled as a fat binary with both 32- and 64-bit versions, we can't
know a priori what the right compiler flags are.

- bryan

On Sep 19, 2010, at 5:55 PM, Andreas Kloeckner wrote:
> On Thu, 16 Sep 2010 14:37:31 -0400, gerald wrong wrote:
>> Looking at PyCUDA setup.py, I found this:
>>
>>     if 'darwin' in sys.platform:
>>         # prevent from building ppc since cuda on OS X is not compiled for ppc
>>         # also, default to 32-bit build, since there doesn't appear to be a
>>         # 64-bit CUDA on Mac yet.
>>         if "-arch" not in conf["CXXFLAGS"]:
>>             conf["CXXFLAGS"].extend(['-arch', 'i386', '-m32'])
>>         if "-arch" not in conf["LDFLAGS"]:
>>             conf["LDFLAGS"].extend(['-arch', 'i386', '-m32'])
>>
>> Since 64-bit CUDA support on Mac OS X has been a reality since at least
>> CUDA 3.1, this should be changed. I know there are other problems regarding
>> 32/64 Python stopping 64-bit Mac users at the moment (including myself),
>> but since changes are being made to the PyCUDA code to make it compatible
>> with CUDA 3.2 anyway...
>
> It looks like this should just be killed wholesale--right? Or is there
> anything more appropriate that it should be changed to?
>
> Andreas
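[Editorial sketch] The bitness check quoted above can be isolated into a small flag-selection helper. Two hedges: `sys.maxint` exists only on Python 2, so `sys.maxsize` (available since 2.6 and on 3.x) is used here instead, and the 64-bit branch (`x86_64`/`-m64`) is an assumption on my part -- Bryan's snippet only handles the 32-bit case.

```python
import sys

def arch_flags(maxsize=sys.maxsize):
    """Pick '-arch'/'-m' flags matching the interpreter's bitness.

    Illustrative only; the real check lives in PyCUDA's setup.py, and
    the 64-bit branch below is a guess, not Bryan's code.
    """
    if maxsize == 2**31 - 1:
        # 32-bit interpreter (sys.maxint == 2147483647 in Bryan's check)
        return ["-arch", "i386", "-m32"]
    # Assumed 64-bit case, mirroring the flags used elsewhere in the thread.
    return ["-arch", "x86_64", "-m64"]

# On Darwin one would extend conf["CXXFLAGS"] and conf["LDFLAGS"] with
# arch_flags() whenever "-arch" is not already present in them.
```

Passing `maxsize` as a parameter keeps the selection logic testable independently of the machine running it.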
Re: [PyCUDA] mac64 PyCUDA
Of course, the most recent Mac Pros now ship with ATI.
http://www.apple.com/macpro/features/graphics.html

On Sun, Sep 19, 2010 at 9:24 PM, gerald wrong wrote:
> I think PyCUDA should attempt to build correctly as 64-bit so that once the
> python64 bugs are shaken out, PyCUDA will be able to build on Mac OS X
> without special flags etc.
>
> My understanding from multiple build attempts is that there is a bug in the
> various Mac 64-bit Python implementations causing them to run in 32-bit mode
> even when forced to 64; this should eventually get fixed (if it has not
> already). I admit that after many days of frustration, I have not tried to
> build on mac64 in about 2 months; perhaps I will give it another go this
> week and report here to the list.
>
> The ability to develop on Macs is very useful, since newer Macs include
> CUDA-capable hardware... and there are lots of scientists with Macs.
> Judging by the old queries on the mailing list, there are many people who
> have spent time fiddling with PyCUDA on mac64... so there must be
> significant interest in this feature.
>
> On Sun, Sep 19, 2010 at 8:55 PM, Andreas Kloeckner wrote:
>> On Thu, 16 Sep 2010 14:37:31 -0400, gerald wrong wrote:
>>> Looking at PyCUDA setup.py, I found this:
>>>
>>>     if 'darwin' in sys.platform:
>>>         # prevent from building ppc since cuda on OS X is not compiled for ppc
>>>         # also, default to 32-bit build, since there doesn't appear to be a
>>>         # 64-bit CUDA on Mac yet.
>>>         if "-arch" not in conf["CXXFLAGS"]:
>>>             conf["CXXFLAGS"].extend(['-arch', 'i386', '-m32'])
>>>         if "-arch" not in conf["LDFLAGS"]:
>>>             conf["LDFLAGS"].extend(['-arch', 'i386', '-m32'])
>>>
>>> Since 64-bit CUDA support on Mac OS X has been a reality since at least
>>> CUDA 3.1, this should be changed. I know there are other problems
>>> regarding 32/64 Python stopping 64-bit Mac users at the moment (including
>>> myself), but since changes are being made to the PyCUDA code to make it
>>> compatible with CUDA 3.2 anyway...
>>
>> It looks like this should just be killed wholesale--right? Or is there
>> anything more appropriate that it should be changed to?
>>
>> Andreas