Hi,
>> Perhaps one can do cudaGetDevice() and cudaDeviceReset() in between the
>> two calls to PetscInitialize in the application code? Provided no device
>> data was allocated in between the two calls, this might eliminate the
>> error.
> Sure, but how will we actually share the device between libraries? What
> if the other library was not PETSc, but something else, and they also
> called cudaSetDevice, but with a different default mapping strategy?
> We need an interface that handles this case.
Yes, I encountered similar user requests in ViennaCL.
What is a bit tricky is the question of *when* to call cudaSetDevice()
then. We require users to call PetscInitialize() before anything else,
so cudaSetDevice() needs to be called from some other place. I like the
lazy instantiation model, where the GPU backends (OpenCL, CUDA) are
initialized only when the first object (e.g. a Vec) is created. This
should provide enough room for all customizations of the CUDA
initialization between PetscInitialize() and VecCreate(). I think this
can be implemented fairly quickly - I'll do it unless there are any
objections later today.
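To make the lazy model concrete, here is a rough sketch in plain C. It is
not PETSc code: every name in it (LibInitialize, LibSetDevice,
LibVecCreate, backend_set_device) is made up for illustration, and
backend_set_device() is only a stand-in for the real cudaSetDevice() so
that the snippet compiles without the CUDA runtime.

```c
#include <assert.h>

/* Hypothetical stand-in for cudaSetDevice(): it only records the choice. */
static int selected_device  = -1;
static int set_device_calls = 0;

static void backend_set_device(int dev)
{
  selected_device = dev;
  set_device_calls++;
}

/* Hypothetical library state (not actual PETSc internals). */
static int lib_initialized  = 0;
static int gpu_initialized  = 0;
static int requested_device = 0; /* assumed default mapping: device 0 */

/* Plays the role of PetscInitialize(): deliberately touches no GPU state. */
void LibInitialize(void) { lib_initialized = 1; }

/* Customization hook, usable until the first GPU object is created.
   Returns 0 on success and -1 if the backend is already initialized. */
int LibSetDevice(int dev)
{
  if (gpu_initialized) return -1;
  requested_device = dev;
  return 0;
}

/* The backend is set up on first use only; later calls are no-ops. */
static void LibGPUInitializeLazy(void)
{
  if (gpu_initialized) return;
  backend_set_device(requested_device);
  gpu_initialized = 1;
}

typedef struct { int n; } LibVec;

/* Plays the role of VecCreate(): the first call triggers GPU setup. */
LibVec LibVecCreate(int n)
{
  LibVec v;
  assert(lib_initialized); /* the library must be initialized first */
  LibGPUInitializeLazy();
  v.n = n;
  return v;
}
```

With this layout, calling LibInitialize(), then LibSetDevice(2), then
LibVecCreate(10) selects device 2 exactly once. Creating further vectors
does not touch the device selection again, and a late LibSetDevice() call
returns -1 instead of silently changing the device.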
Best regards,
Karli