samie abdul <[email protected]> writes:
> Hi,
>
> is it possible to "precompile" the invoked kernels beforehand? My code makes
> use of several CUDA kernels, which are basically called within a "fit"
> function. Profiling the code with cProfile yields:
>
> 42272 function calls (42228 primitive calls) in 1.662 seconds
> ...
>
>     11    0.000    0.000    0.344    0.031 compiler.py:185(compile)
>     11    0.002    0.000    0.346    0.031 compiler.py:245(__init__)
>      4    0.000    0.000    0.317    0.079 compiler.py:33(preprocess_source)
>     11    0.000    0.000    0.342    0.031 compiler.py:66(compile_plain)
> ...
>
> Thus, about 0.344 of the 1.662 seconds are spent on compiling the
> code. When executing the function "fit" twice, the code is not
> compiled again (hence, saving these 0.344 seconds for the second call
> of "fit"). I would like to somehow precompile all involved kernels as
> soon as the object the "fit" function belongs to is initialized...
>
> Can one invoke the overall compilation process beforehand?

Sure! That's what the SourceModule constructor does. Just keep the
instance around.

Andreas
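
A minimal sketch of that advice, under stated assumptions: the class name Fitter, the kernel "scale", and the compile_source parameter are hypothetical names made up for illustration and are not part of PyCUDA. The only PyCUDA calls used are SourceModule(source), get_function(name), drv.InOut, and the usual block=/grid= launch arguments. The point is that the SourceModule is built once in __init__ and the resulting kernel handle is stored on the object, so fit() only launches.

```python
KERNEL_SOURCE = """
__global__ void scale(float *x, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] *= factor;
}
"""


def compile_with_pycuda(source):
    """Compile CUDA source via PyCUDA's SourceModule constructor.

    The imports are deferred so that this file can be loaded on a
    machine without a GPU (for example, to test with a stub compiler).
    """
    import pycuda.autoinit  # noqa: F401  (sets up a CUDA context on import)
    from pycuda.compiler import SourceModule

    return SourceModule(source)


class Fitter:
    """Hypothetical object whose kernels are compiled at construction time."""

    def __init__(self, compile_source=compile_with_pycuda):
        # The compilation cost is paid here, once, not on the first fit().
        self._module = compile_source(KERNEL_SOURCE)
        # Keep the module and the kernel handle around for later launches.
        self._scale = self._module.get_function("scale")

    def fit(self, data, factor=2.0):
        # Deferred imports: only needed when a kernel is actually launched.
        import numpy as np
        import pycuda.driver as drv

        x = np.array(data, dtype=np.float32)  # copy, so the input is not modified
        n = x.size  # assumed to be non-zero; an empty grid cannot be launched
        threads = 256
        blocks = (n + threads - 1) // threads
        # No compilation happens here; the stored handle is launched directly.
        self._scale(
            drv.InOut(x), np.float32(factor), np.int32(n),
            block=(threads, 1, 1), grid=(blocks, 1),
        )
        return x
```

With a CUDA device available, something like `model = Fitter()` followed by `model.fit([1.0, 2.0, 3.0])` would compile during construction and only launch the stored kernel inside fit(). Passing the compile step in as a parameter is purely a convenience of this sketch, so the construction logic can be exercised without a GPU.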
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
