Hi,

>> Perhaps one can do cudaGetDevice() and cudaDeviceReset() in between the two
>> calls to PetscInitialize in the application code?
>> Provided no device data was allocated in between the two calls, this might
>> eliminate the error.

> Sure, but how will we actually share the device between libraries? What
> if the other library was not PETSc, but something else, and they also
> called cudaSetDevice, but with a different default mapping strategy?
>
> We need an interface that handles this case.

Yes, I encountered similar user requests in ViennaCL.

The tricky part is the question of *when* to call cudaSetDevice(). We require users to call PetscInitialize() before anything else, so calling cudaSetDevice() from there leaves no room for customization; it needs to be called from some other place. I like the lazy instantiation model, where the GPU backends (OpenCL, CUDA) are initialized only when the first object (e.g. a Vec) is created. This should provide enough room for all customizations of the CUDA initialization between PetscInitialize() and VecCreate(). I think this can be implemented fairly quickly - I'll do it later today unless there are any objections.
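
To make the lazy instantiation idea concrete, here is a minimal sketch in plain C. It is not PETSc code: every sketch_* name is hypothetical, and sketch_bind_device() is only a stand-in for cudaSetDevice() so that the example runs without a GPU. It assumes that a single "requested device" setting is the only customization needed.

```c
#include <assert.h>

/* Hypothetical stand-in for cudaSetDevice(); records the call instead of
   touching a real GPU so the sketch runs anywhere. */
static int sketch_bound_device = -1;   /* -1 means no device bound yet */
static int sketch_bind_calls   = 0;

static void sketch_bind_device(int device)
{
  sketch_bound_device = device;
  sketch_bind_calls++;
}

/* Device that will be bound on first use (assumed default mapping: 0). */
static int sketch_requested_device = 0;

/* Stands in for PetscInitialize(): deliberately does NOT bind a device. */
void sketch_initialize(void)
{
}

/* Customization hook: only has an effect before the first object exists. */
void sketch_request_device(int device)
{
  assert(device >= 0);
  sketch_requested_device = device;
}

/* Binds the device the first time it is needed, and never again. */
static void sketch_ensure_device(void)
{
  if (sketch_bound_device < 0) sketch_bind_device(sketch_requested_device);
}

/* Stands in for VecCreate(): the first call triggers the backend setup.
   Returns the device the object lives on. */
int sketch_vec_create(void)
{
  sketch_ensure_device();
  return sketch_bound_device;
}
```

In this sketch, an application (or another library) calls sketch_request_device() anywhere between sketch_initialize() and the first sketch_vec_create(). Requests made after the first object has been created are ignored, which is the price of binding only once.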

Best regards,
Karli
