Yuan Chen <[email protected]> writes:

> Hi,
>
> I have just started using PyCUDA to do some GPU computing.
>
> However, I found that transferring numpy arrays to the GPU costs a lot of
> time, and so does compiling the source.
>
> I am using SourceModule now. As far as I know, if, for example, I have a
> file called try.py and a kernel function called searching(float *arr), my
> questions are
>
> 1) Every time I run try.py, the searching function is compiled once and
> cached until the script ends. So I am wondering if I can permanently save
> that compiled function and load it later, so that I don't have to compile
> it each time I run the script.

PyCUDA caches the binaries for your source code as much as possible. So
once you compile the same code a second time, SourceModule construction
should be quite fast. Are you finding otherwise?
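Beyond the automatic cache, a compiled module can also be persisted by hand: pycuda.compiler.compile() returns the raw cubin bytes, which can be written to disk and reloaded on later runs with driver.module_from_buffer(), skipping nvcc entirely. A minimal sketch (the kernel body and cache file name are made up for illustration; the try/except lets it degrade when pycuda or a GPU is unavailable):

```python
# Sketch: compile the kernel to a cubin once, cache it on disk, and reload
# it on later runs with module_from_buffer(), skipping nvcc entirely.
# Kernel body and file name are illustrative assumptions, not from the post.
import os

KERNEL_SRC = """
__global__ void searching(float *arr)
{
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    arr[i] *= 2.0f;  /* placeholder body */
}
"""

CUBIN_PATH = "searching.cubin"

try:
    import pycuda.autoinit  # noqa: F401  (creates a context on the first GPU)
    import pycuda.driver as cuda
    from pycuda.compiler import compile as cuda_compile

    if os.path.exists(CUBIN_PATH):
        with open(CUBIN_PATH, "rb") as f:
            cubin = f.read()              # reuse the binary from a previous run
    else:
        cubin = cuda_compile(KERNEL_SRC)  # invokes nvcc once
        with open(CUBIN_PATH, "wb") as f:
            f.write(cubin)

    mod = cuda.module_from_buffer(cubin)
    searching = mod.get_function("searching")
    status = "loaded"
except Exception:
    status = "unavailable"  # no pycuda or no GPU on this machine
```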

> 2) Is there a way to make transferring data faster? I read the documents;
> is managed memory going to help with this?

Read about page-locked host memory. Those transfers are a fair bit
faster than non-page-locked ones, since the hardware can do them on its
own.
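A minimal sketch of allocating a page-locked host buffer with pycuda.driver.pagelocked_empty() and copying it to the device (the array size is arbitrary, and the try/except lets this degrade to a no-op without pycuda or a GPU):

```python
# Sketch: host-to-device copies from a page-locked (pinned) buffer vs. an
# ordinary pageable numpy array. Pinned memory can be DMA'd directly by the
# hardware, so the pinned copy is typically noticeably faster.
import numpy as np

try:
    import pycuda.autoinit  # noqa: F401
    import pycuda.driver as cuda

    n = 1 << 22
    pageable = np.random.rand(n).astype(np.float32)      # ordinary host memory
    pinned = cuda.pagelocked_empty(n, dtype=np.float32)  # page-locked buffer
    pinned[:] = pageable

    dev_buf = cuda.mem_alloc(pinned.nbytes)
    cuda.memcpy_htod(dev_buf, pinned)    # DMA transfer from pinned memory
    cuda.memcpy_htod(dev_buf, pageable)  # staged copy from pageable memory
    status = "ok"
except Exception:
    status = "unavailable"  # no pycuda or no GPU on this machine
```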

Andreas

_______________________________________________
PyCUDA mailing list
[email protected]
https://lists.tiker.net/listinfo/pycuda