Yuan Chen <[email protected]> writes: > Hi, > > I just start to use pycuda to do some gpu computing. > > However, I found that transfering numpy arrays to gpu costs a lot of time > and so does compiling the source. > > I am using the SourceModule now and as far as I know, for example, I have a > file called try.py and a kernel function called searching(float *arr), the > question is > > 1) Everytime I run the try.py, the searching function is compiled once, > and cached later until the codes end. So I am wondering if I can > perminantly save that function and load the saved function so that I don't > have to compile it when I run the script.
PyCUDA caches the binaries for your source code as much as possible, so once you have compiled the same code a second time, SourceModule construction should be quite fast. Are you finding otherwise?

> 2) Is there a way to make transferring data faster? I read the documents;
> is managed memory going to help with this?

Read about page-locked host memory. Those transfers are a fair bit faster than non-page-locked ones, since the hardware can do them on its own.

Andreas
_______________________________________________
PyCUDA mailing list
[email protected]
https://lists.tiker.net/listinfo/pycuda
