samie abdul <[email protected]> writes:

> Hi,
>
> is it possible to "precompile" the invoked kernels beforehand? My code makes 
> use of several CUDA kernels, which are basically called within a "fit" 
> function. Profiling the code with cProfile yields:
>
> 42272 function calls (42228 primitive calls) in 1.662 seconds
> ...
>
> 11    0.000    0.000    0.344    0.031 compiler.py:185(compile)
> 11    0.002    0.000    0.346    0.031 compiler.py:245(__init__)
>  4    0.000    0.000    0.317    0.079 compiler.py:33(preprocess_source)
> 11    0.000    0.000    0.342    0.031 compiler.py:66(compile_plain)
> ...
>
> Thus, about 0.344 of the 1.662 seconds are spent on compiling the
> code. When executing the function "fit" twice, the code is not
> compiled again (hence, saving these 0.344 seconds for the second call
> of "fit"). I would like to somehow precompile all involved kernels as
> soon as the object the "fit" function belongs to is initialized...
>
>
> Can one invoke the overall compilation process beforehand?

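Sure! Compilation happens in the SourceModule constructor. Create your
SourceModule instances when your object is initialized and keep them
around; later calls to "fit" then reuse the already compiled kernels.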
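
A minimal sketch of what that could look like, assuming a class that
owns the "fit" function. The names Fitter, KERNEL_SOURCE and the "scale"
kernel are made up for illustration, and the compile_module argument is
a hypothetical hook that only exists so the sketch can be exercised
without a GPU; by default it is PyCUDA's SourceModule.

```python
# Hypothetical kernel, standing in for whatever "fit" really launches.
KERNEL_SOURCE = """
__global__ void scale(float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] *= a;
}
"""


class Fitter:
    """Hypothetical object that compiles its kernels once, in __init__."""

    def __init__(self, compile_module=None):
        if compile_module is None:
            # Imported lazily so the class can be defined without a GPU.
            import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
            from pycuda.compiler import SourceModule
            compile_module = SourceModule

        # The compilation cost is paid here, not on the first fit() call.
        module = compile_module(KERNEL_SOURCE)

        # Keep the kernel handle around for later launches.
        self._scale = module.get_function("scale")
        # Argument layout: pointer, float, int (struct-style format codes).
        self._scale.prepare("Pfi")

    def fit(self, x_gpu, factor):
        # x_gpu is assumed to be a float32 pycuda.gpuarray.GPUArray.
        n = x_gpu.size
        block = (256, 1, 1)
        grid = ((n + 255) // 256, 1)
        # No compilation happens here; only the kernel launch.
        self._scale.prepared_call(grid, block, x_gpu.gpudata, factor, n)
```

With this layout, constructing the object is the slow step and every
call to fit() afterwards only launches the stored kernel.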

Andreas


_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda