Dear Bob,

bob zigon <[email protected]> writes:
> If a kernel is called from within a Python loop, how frequently is nvcc
> called?
> If the kernel is essentially static, I would hope that nvcc is called once
> regardless of the number of times the loop iterates.
>
> On the other hand, if the kernel is templated, and the template is a
> function of the loop counter, it seems to me that nvcc would need to be
> called on every iteration.

Creating a SourceModule is fairly expensive, and you should definitely
avoid doing it in the inner loop of your application. Just hold on to the
module handle and reuse it.

PyCUDA tries to be smart about not recompiling when it doesn't have to, but
even in the no-recompile case it still has to look up the kernel in its
on-disk cache and check that no include files have changed. Hence my advice
above.

If you look at PyCUDA's GPUArray, it caches *readily instantiated*
SourceModules by way of pycuda.tools.context_dependent_memoize. That way, it
only incurs the instantiation penalty for each genuinely new kernel.

HTH,
Andreas

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
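
To make the caching advice above concrete, here is a minimal sketch of the "one instantiation per genuinely new kernel" pattern. It is not PyCUDA's own code: the names KERNEL_TEMPLATE, make_module_cache and fake_compile are hypothetical, and functools.lru_cache stands in for pycuda.tools.context_dependent_memoize (which, as far as I know, also keys the cache on the active CUDA context; lru_cache does not). The compile step is passed in as a function so the sketch runs without a GPU. With PyCUDA you would presumably pass something like `lambda src: pycuda.compiler.SourceModule(src)` instead.

```python
from functools import lru_cache

# Hypothetical templated kernel: the template parameter is baked into the
# source text, so each distinct value yields a genuinely new kernel.
KERNEL_TEMPLATE = """
__global__ void scale(float *x)
{
    x[threadIdx.x] *= %(factor)d;
}
"""


def make_module_cache(compile_fn):
    """Return get_module(factor), which calls compile_fn at most once
    per distinct template parameter and reuses the result afterwards."""

    @lru_cache(maxsize=None)
    def get_module(factor):
        source = KERNEL_TEMPLATE % {"factor": factor}
        # Assumption: with PyCUDA, compile_fn would build a SourceModule here.
        return compile_fn(source)

    return get_module


def fake_compile(source):
    """Stand-in for the expensive step; only wraps the source text."""
    return ("module", source)


if __name__ == "__main__":
    compiled = []  # records every source that actually reached the compiler

    def counting_compile(source):
        compiled.append(source)
        return fake_compile(source)

    get_module = make_module_cache(counting_compile)

    # Static kernel: same parameter on every iteration, one compile in total.
    for _ in range(1000):
        get_module(2)

    # Templated kernel: one compile per distinct parameter value,
    # not one per loop iteration.
    for i in range(1000):
        get_module(i % 4)

    print(len(compiled), "compilations for 2000 loop iterations")
```

The design point is that the loop body only ever asks the cache for a module; whether the kernel is static or templated, the expensive path runs once per distinct source, and every later iteration gets the held handle back without touching the compiler or the on-disk cache.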
