Hi Marmaduke,

marmaduke woodman <[email protected]> writes:
> Does anyone have any experience or tips on distributing an application
> using PyCUDA for users/computers that have a suitable GPU & driver but
> are otherwise unprepared for PyCUDA, i.e. without the full C++ compiler
> + CUDA SDK toolchain?
>
> Otherwise, I suspect the context cache is a place to start: I would
> compile all possible kernels, persist the cache, and at runtime load
> the cache so that no compilation is necessary. Any information
> thereupon would be welcome.
I personally don't have such packaging experience, but I'd be very
interested in hearing about yours. Here are some thoughts on this:

While making the code cache take care of this seems reasonable at first,
I'd probably suggest adding a secondary cache-like layer to SourceModule,
with somewhat more positive control over what gets stored. Specifically,
it seems reasonable to add an extra parameter for a kernel identifier,
along with (likely) a global variable that sets where kernels are stored.
SourceModule could then run in one of two modes:

- "Generation", where the set directory is populated with CUBINs for all
  known/supported values of the compute capability/shader model.
- "Retrieval", where it is forced to use CUBINs from that directory.

I imagine this would be more robust than shipping the cache directly, as
many things enter into cache key generation (Python version, compiler
version, headers) that could easily lead to unintended cache misses.

HTH,
Andreas
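P.S. A rough sketch of what that two-mode layer might look like. All of
the names here (`KernelStore`, the mode strings, the injected
`compile_fn`/`load_fn` callables) are my invention for illustration; in
real use one would presumably pass `pycuda.compiler.compile` (which
returns CUBIN bytes) and `pycuda.driver.module_from_buffer` for the two
callables.

```python
import os


class KernelStore:
    """Hypothetical two-mode store for precompiled CUBINs.

    compile_fn(source, arch) -> bytes    e.g. pycuda.compiler.compile
    load_fn(cubin_bytes) -> module       e.g. pycuda.driver.module_from_buffer
    """

    def __init__(self, directory, mode, compile_fn, load_fn):
        assert mode in ("generation", "retrieval")
        self.directory = directory
        self.mode = mode
        self.compile_fn = compile_fn
        self.load_fn = load_fn
        if not os.path.isdir(directory):
            os.makedirs(directory)

    def _path(self, kernel_id, arch):
        # The explicit kernel identifier plus the target architecture
        # names the file -- no hash of source/compiler/headers, so no
        # unintended cache misses on the end-user machine.
        return os.path.join(self.directory, "%s.%s.cubin" % (kernel_id, arch))

    def get_module(self, kernel_id, source, arch):
        path = self._path(kernel_id, arch)
        if self.mode == "generation":
            # Developer machine: compile and persist for shipping.
            cubin = self.compile_fn(source, arch)
            with open(path, "wb") as f:
                f.write(cubin)
        else:
            # End-user machine: load the shipped CUBIN; no CUDA
            # toolchain required.
            with open(path, "rb") as f:
                cubin = f.read()
        return self.load_fn(cubin)
```

At packaging time you would run "generation" once per supported arch
(sm_20, sm_30, ...) and ship the resulting directory; the deployed
application constructs the store in "retrieval" mode only.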
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
