I've been away for a while and so many things have happened here :-)

>> test_cumath.py passes all tests but is 5 times slower in the GTX280 and
>> 40(!) times slower in the GTX480

>This might just be due to the G80/G92 compilers being faster than the
>280/480 ones. In general, the tests are not meant for benchmarking.

Understood. But I mentioned the strong slowdowns (both on the GTX280 and the GTX480) because they seem to indicate that something is wrong.

>I'm getting 30;30 on a C2050 with the 3.2rc drivers/toolkit. [...]
>In the meantime, what driver and compiler versions are you running?
>Have you tried upgrading to the latest and greatest?

They are all x86_64 Ubuntu 10.04 boxes with gcc/g++ 4.4.3, libc6, Nvidia dev-drivers 256.40, CUDA 3.1, python 2.6.5 and the stable version of pycuda 0.94.1, downloaded 3 days ago from pypi.python.org. Two probably irrelevant facts: all the GPUs are driving VDUs, and I have never noticed any errors or unexpected slowdowns from native CUDA codes.

>Try to change it to arch = "sm_10"
> and so on, and check whether you get incorrect 14 in such a case.

Sorry, Rybak, I couldn't get it to work. Thanks anyway.

Are there any other tests that I can do to help debug this? The sample I've posted can also trivially demonstrate the difficulties I'm having with the sum, min and max functions with arrays of 4 or more elements.

Gpuarray seems an amazing tool. It would be great to have it run on the Fermis.
