Hi Marmaduke,

marmaduke woodman <[email protected]> writes:
> Does anyone have any experience or tips on distributing an application 
> using PyCUDA for users/computers that have a suitable GPU & driver but 
> otherwise unprepared for PyCUDA, i.e. not the full C++ compiler + CUDA 
> SDK toolchain?
>
> Otherwise, I suspect the context cache is a place to start: I would 
> compile all possible kernels, persist the cache and at runtime load the 
> cache so that no compilation is necessary? Any information there upon 
> would be welcome.

I personally don't have such packaging experience, but I'd be very
interested in hearing about yours.

Here are some thoughts on this: While making the code cache take care of
this seems reasonable at first, I'd probably suggest adding a secondary
cache-like layer to SourceModule, one with somewhat more explicit
control. Specifically, it seems reasonable to add an extra parameter for
a kernel identifier, along with (likely) a global variable that sets
where kernels are stored. SourceModule could then run in one of two
modes: first, "generation", where the chosen directory is populated with
CUBINs for all known/supported values of the compute capability/shader
model; second, "retrieval", where it is forced to use the CUBINs from
that directory. I imagine this would be more robust, since many factors
enter into cache key generation (Python version, compiler version,
headers) that could easily lead to unintended cache misses.
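To make this concrete, here is a minimal sketch of what such a layer
might look like. Everything in it is hypothetical illustration, not
existing PyCUDA API: the class name KernelStore, the mode strings, and
the injected compile_fn are my own names. In real use, compile_fn would
wrap something like pycuda.compiler.compile(source, arch=arch), and the
retrieved bytes would be loaded with pycuda.driver.module_from_buffer.

```python
import os


class KernelStore:
    """Hypothetical two-mode CUBIN store keyed by an explicit kernel id.

    mode="generate": compile the source for every listed architecture
                     and persist the resulting CUBINs under cache_dir.
    mode="retrieve": never compile; load a pre-built CUBIN or fail loudly.
    """

    def __init__(self, cache_dir, mode, compile_fn=None,
                 archs=("sm_20", "sm_30", "sm_35")):
        self.cache_dir = cache_dir
        self.mode = mode
        # e.g. lambda src, arch: pycuda.compiler.compile(src, arch=arch)
        self.compile_fn = compile_fn
        self.archs = archs
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, kernel_id, arch):
        # Keyed only by (kernel_id, arch) -- deliberately NOT by compiler
        # version, Python version, or header contents, so retrieval is
        # deterministic on the end user's machine.
        return os.path.join(self.cache_dir, "%s.%s.cubin" % (kernel_id, arch))

    def generate(self, kernel_id, source):
        """Populate the store with CUBINs for all supported architectures."""
        assert self.mode == "generate"
        for arch in self.archs:
            cubin = self.compile_fn(source, arch)
            with open(self._path(kernel_id, arch), "wb") as f:
                f.write(cubin)

    def retrieve(self, kernel_id, arch):
        """Return the pre-built CUBIN bytes; never invokes the compiler."""
        assert self.mode == "retrieve"
        path = self._path(kernel_id, arch)
        if not os.path.exists(path):
            raise RuntimeError(
                "no prebuilt CUBIN for %r (%s)" % (kernel_id, arch))
        with open(path, "rb") as f:
            return f.read()  # feed to pycuda.driver.module_from_buffer
```

The developer would run the "generate" mode once per release, ship the
cache_dir with the application, and end users would only ever hit the
"retrieve" path, so no nvcc or CUDA SDK is needed on their machines.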

HTH,
Andreas


_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
