On Tue, 2010-09-28 at 23:51 -0400, Andreas Kloeckner wrote:
> On Tue, 28 Sep 2010 23:56:47 +0200, Tomasz Rybak <[email protected]> wrote:
> > I have idea for (maybe) checking whether problem is with PyCUDA,
> > CUDA toolkit, or driver.
> > Can you force PyCUDA to generate not sm_20 code, but 1x?
> > I have found that it is determined in line 190 of file
> > pycuda/compiler.py:
> > arch = "sm_%d%d" % Context.get_device().compute_capability()
> > Try to change it to
> > arch = "sm_10"
> > and so on, and check whether you get incorrect 14 in such
> > a case.
> >
> > If there is simpler way of changing architecture to which
> > PyCUDA generates code, feel free to use it and share this
> > information.
>
> arch can be overridden from the SourceModule arguments:
> http://documen.tician.de/pycuda/driver.html#module-pycuda.compiler
>
Yes, but the code from this thread was calling GPUArray.dot, which in
turn calls ReductionKernel, and in neither of those have I seen a way
to pass an arch='sm_10' argument. I have checked, and

    dot_ab_gpu = gpuarray.dot(a_gpu, b_gpu, arch='sm_11').get()

gives an error.

-- 
Tomasz Rybak <[email protected]> GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak
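
For reference, the arch string in the compiler.py line quoted above is just the device's compute capability tuple formatted as "sm_XY". Below is a minimal sketch of that formatting together with the kind of override discussed in this thread. The helper name pick_arch and its override parameter are hypothetical and are not part of PyCUDA; the sketch only illustrates the string that would end up as the arch value, and it runs without a GPU.

```python
def pick_arch(compute_capability, override=None):
    """Return an nvcc-style arch string such as "sm_20".

    compute_capability: a (major, minor) tuple, the shape the quoted
        compiler.py line formats with "sm_%d%d".
    override: hypothetical parameter; if given, it wins over the
        detected value (e.g. "sm_10" to force 1.x code generation).
    """
    if override is not None:
        return override
    major, minor = compute_capability
    return "sm_%d%d" % (major, minor)


if __name__ == "__main__":
    # A compute capability 2.0 device, no override.
    print(pick_arch((2, 0)))                    # sm_20
    # Same device, but forcing 1.0 code as suggested in the thread.
    print(pick_arch((2, 0), override="sm_10"))  # sm_10
```

As noted in the quoted reply, SourceModule accepts such a string through its arch argument; whether GPUArray.dot or ReductionKernel can forward it is exactly the open question here.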
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
