Hi Lisandro,
On 10/21/2014 08:25 PM, Lisandro Dalcin wrote:
On 20 October 2014 08:06, Karl Rupp <[email protected]> wrote:
I pushed a function for obtaining the CUDA pointer from a CUSP vector here:
https://bitbucket.org/petsc/petsc/commits/d831094ec27070ea54a249045841367f8aab0976
Karl, I think this is halfway to meeting our needs, and the missing bits
are related to out-of-sync CPU/GPU buffers. To improve the
implementation and make it useful for petsc4py (and other plain-C
consumers), I would suggest the following:
1) Implement VecCUSPGetCUDAArray() by calling
VecCUSPGetArrayReadWrite(), which automatically handles calling
VecCUSPCopyToGPU().
2) I think we still need a VecCUSPRestoreCUDAArray(); you can
implement it by basically calling
VecCUSPRestoreArrayReadWrite(vec,NULL) to update the valid_GPU_array
flag and the internal object state.
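
To make the flag handling in 1) and 2) concrete, here is a toy Python
model. It is not PETSc code: ToyVec, Valid, get_device_array() and
restore_device_array() are hypothetical names, and a plain list stands
in for GPU memory. It only sketches the get/restore pairing, assuming
"get" syncs a stale device buffer and "restore" marks the device copy
as the only current one.

```python
from enum import Enum


class Valid(Enum):
    """Which copy holds current data (toy stand-in for valid_GPU_array)."""
    CPU = 1
    GPU = 2
    BOTH = 3


class ToyVec:
    """Hypothetical model of a vector with a host and a device buffer."""

    def __init__(self, values):
        self.host = list(values)
        self.device = [0.0] * len(self.host)  # stands in for GPU memory
        self.valid = Valid.CPU
        self.state = 0       # bumped whenever the data may have changed
        self.transfers = 0   # counts simulated host-to-device copies

    def get_device_array(self):
        """Step 1: sync the device buffer if it is stale, then return it."""
        if self.valid is Valid.CPU:
            self.device[:] = self.host
            self.transfers += 1
            self.valid = Valid.BOTH
        return self.device

    def restore_device_array(self):
        """Step 2: the caller may have written, so only the device is current."""
        self.valid = Valid.GPU
        self.state += 1


vec = ToyVec([1.0, 2.0, 3.0, 4.0])
buf = vec.get_device_array()   # triggers the one simulated copy
buf[0] = 10.0                  # a "kernel" writes through the handle
vec.restore_device_array()     # host copy is now marked stale
```

Without the restore call, the model would still claim that both copies
are valid after the write, which is the out-of-sync problem described
above.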
Okay, so you also want the guarantee that the buffer contains
the latest data, possibly sync'd from main memory. We might then have to
provide an additional interface for directly obtaining the CUDA handle
without triggering a potentially costly host-device transfer.
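
As a rough illustration of the two kinds of access, here is a toy
Python model. It is not PETSc code: ToyVec and both method names are
hypothetical, and a plain list stands in for GPU memory. The raw
accessor returns the handle as is, so its contents may be stale. That
is only safe when the caller does not read the old values, for instance
when it overwrites the whole buffer.

```python
class ToyVec:
    """Hypothetical model of a vector with a host and a device buffer."""

    def __init__(self, values):
        self.host = list(values)
        self.device = [0.0] * len(self.host)  # stands in for GPU memory
        self.device_is_current = False
        self.transfers = 0   # counts simulated host-to-device copies

    def get_device_array(self):
        """Synchronizing access: copy host data over if the device is stale."""
        if not self.device_is_current:
            self.device[:] = self.host
            self.transfers += 1
            self.device_is_current = True
        return self.device

    def get_device_array_raw(self):
        """Non-synchronizing access: no transfer, contents may be stale."""
        return self.device


vec = ToyVec([1.0, 2.0])
raw_snapshot = list(vec.get_device_array_raw())  # copy of the stale contents
transfers_after_raw = vec.transfers              # still zero, nothing was copied
synced = vec.get_device_array()                  # performs the one transfer
```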
Either way, I can add the requested functionality. However, please allow
two days; I'm on a long-distance flight tomorrow.
Best regards,
Karli