MasterJH5574 opened a new pull request, #16689: URL: https://github.com/apache/tvm/pull/16689
This PR introduces a new function `GetCurrentStream` to the device API, which returns the current stream of the given device. It also updates the CUDA `CreateStream` to create a non-blocking stream, so that execution on this stream can overlap with execution on other streams. Finally, it changes the CUDA device API's `GPUCopy` to always use `cudaMemcpyAsync`.
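
For readers less familiar with the CUDA runtime calls involved, here is a minimal sketch of what a non-blocking stream plus `cudaMemcpyAsync` looks like in isolation. It is not the code from this PR: the helper name `CopyOnNonBlockingStream` and its parameter `n` are hypothetical, and the sketch assumes a CUDA-capable device and a build that links against the CUDA runtime. It only uses `cudaStreamCreateWithFlags` with `cudaStreamNonBlocking` and `cudaMemcpyAsync`.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper for illustration only (not part of TVM or this PR).
// Copies `n` floats host -> device -> device -> host on a non-blocking stream
// and returns true if every call succeeded and the data round-tripped intact.
bool CopyOnNonBlockingStream(size_t n) {
  std::vector<float> host_in(n, 1.5f), host_out(n, 0.0f);
  const size_t bytes = n * sizeof(float);
  float* d_src = nullptr;
  float* d_dst = nullptr;
  cudaStream_t stream = nullptr;

  bool ok =
      cudaMalloc(&d_src, bytes) == cudaSuccess &&
      cudaMalloc(&d_dst, bytes) == cudaSuccess &&
      // cudaStreamNonBlocking: work on this stream does not implicitly
      // synchronize with the legacy default stream.
      cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking) == cudaSuccess &&
      // All three copies are enqueued on the same stream, so they run in order.
      // The host buffers are pageable here, so the host-side copies may not be
      // fully asynchronous with respect to the calling thread.
      cudaMemcpyAsync(d_src, host_in.data(), bytes,
                      cudaMemcpyHostToDevice, stream) == cudaSuccess &&
      cudaMemcpyAsync(d_dst, d_src, bytes,
                      cudaMemcpyDeviceToDevice, stream) == cudaSuccess &&
      cudaMemcpyAsync(host_out.data(), d_dst, bytes,
                      cudaMemcpyDeviceToHost, stream) == cudaSuccess &&
      // Wait for the enqueued copies before reading host_out.
      cudaStreamSynchronize(stream) == cudaSuccess;

  // Release resources regardless of success; cudaFree(nullptr) is a no-op.
  if (stream != nullptr) cudaStreamDestroy(stream);
  cudaFree(d_dst);
  cudaFree(d_src);

  return ok && host_out == host_in;
}
```

Because the copies are issued on an explicit stream, the caller has to synchronize that stream (or use events) before reading the results. Synchronizing the default stream would not be enough for a stream created with `cudaStreamNonBlocking`.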
