Sven Schreiber wrote:
> Andrew Straw wrote:
>> Sven Schreiber wrote:
>>>>
>>> Yes, absolutely! So really my question was meant as:
>>>
>>> Ubuntu 9.10 and Nvidia SDK howto?
>>>
>> The examples are working for me on Karmic with the attached siteconf.py,
>> but I haven't gone any further. I'm using amd64 arch, the 195.30 beta
>> drivers, and a GeForce GTX 260.
>>
>
> Thanks, I will keep the possibility in mind to upgrade to the beta
> drivers. However, searching for "nvidia sdk gcc 4.4" I found some
> instructions how to get the cuda sdk up and running on ubuntu 9.10. I'll
> try these soon and probably report back here to leave some hints for
> future readers with the same problem.
>

Ok, so I have pyopencl-0.91.4 as well as pycuda-0.93 now up and running
here. The combination is (short version):

* Driver 195.30 beta
* gcc symlink pointing to gcc-4.4 for compiling the driver kernel module,
  but pointing to gcc-4.3 for the rest
* cudatoolkit_2.3_linux_32_ubuntu9.04.run
* gpucomputingsdk_2.3b_linux.run
* (for the Cuda stuff, following the advice in
  http://moelhave.dk/2009/12/nvidia-cuda-on-ubuntu-karmic-koala/ ;
  and for Nvidia's OpenCL examples I also changed the CXX, CC, and LINK
  lines in OpenCL/common/common_opencl.mk)

I had problems with the 190.29 drivers, and while pycuda worked with the
190.53 drivers, (py)opencl didn't -- I guess the latter is expected. So
for me indeed only the 195.30 beta drivers seem to work with both.

BTW, a remark about the benchmark-all.py example file. I think the speed
comparison there is a little biased in favor of pyopencl. It compares
(almost) pure Python with pyopencl, but IMHO the more meaningful
comparison would be between Numpy vectorized code and pyopencl. AFAICS the
numpy equivalent of the pure Python code would be:

    for j in range(1000):  # number of iterations, just for comparability
        n_result = (a+b)**2 * (a/2.0)

At least the results seem to agree when checked afterwards.
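For anyone who wants to repeat the CPU half of this comparison without a GPU, here is a minimal sketch. The array length, the float32 dtype, the small default iteration count, and the helper names (pure_python, vectorized, compare) are my assumptions for illustration and are not copied from benchmark-all.py; only the expression (a+b)**2 * (a/2.0) comes from the remark above.

```python
import time

import numpy as np


def pure_python(a, b, iterations):
    """Element-by-element loop, standing in for the (almost) pure Python variant."""
    result = np.empty_like(a)
    for i in range(len(a)):
        for _ in range(iterations):  # repeated only to make timings comparable
            result[i] = (a[i] + b[i]) ** 2 * (a[i] / 2.0)
    return result


def vectorized(a, b, iterations):
    """Same expression evaluated on whole arrays by Numpy (iterations >= 1)."""
    for _ in range(iterations):
        result = (a + b) ** 2 * (a / 2.0)
    return result


def compare(n=1000, iterations=10, seed=0):
    """Time both variants on the same random input and check that they agree."""
    rng = np.random.default_rng(seed)
    a = rng.random(n).astype(np.float32)  # assumed dtype and size
    b = rng.random(n).astype(np.float32)

    start = time.perf_counter()
    r_py = pure_python(a, b, iterations)
    t_py = time.perf_counter() - start

    start = time.perf_counter()
    r_np = vectorized(a, b, iterations)
    t_np = time.perf_counter() - start

    return t_py, t_np, bool(np.allclose(r_py, r_np, rtol=1e-5))


if __name__ == "__main__":
    t_py, t_np, agree = compare()
    print("pure Python: %.4fs, Numpy: %.4fs, results agree: %s" % (t_py, t_np, agree))
```

Raising iterations to 1000 should bring the workload closer to the loop count quoted above, though the absolute numbers will of course depend on the machine.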

On my test system I get the following timings:

* pure Python: 20.85s
* vectorized Numpy on CPU: 0.044s
* pyopencl on GPU: 0.034s

Of course I'm *not* saying that the pyopencl approach isn't fast and
useful. (My test graphics card is very low end and is on the slow PCI
bus.) But the first one or two orders of magnitude can be achieved already
without any GPU magic.

thank you for these very cool tools,
sven

_______________________________________________
PyOpenCL mailing list
[email protected]
http://host304.hostmonster.com/mailman/listinfo/pyopencl_tiker.net
