Re: [PR] [DeviceAPI] Support "GetCurrentStream" [tvm]

via GitHub Wed, 13 Mar 2024 14:15:38 -0700


Lunderberg commented on PR #16689:
URL: https://github.com/apache/tvm/pull/16689#issuecomment-1995847276


   > This PR also changes the `GPUCopy` of CUDA device API to always use 
`cudaMemcpyAsync`.
   
   I think this portion of the commit needs to be reverted.  Prior to this 
commit, the 
[`NDArray::CopyTo`](https://github.com/apache/tvm/blob/main/include/tvm/runtime/ndarray.h#L116)
 function could be called to transfer an array to/from the GPU and return the 
transferred array.   After this commit, there is no synchronization point after 
the `cudaMemcpyAsync`, before returning control to the caller of 
`NDArray::CopyTo`.
   
   * The caller may read from the `NDArray` result immediately after 
`NDArray::CopyTo` returns.  After this commit, this is a read from 
uninitialized memory.
   * The caller may free the backing allocation of the `NDArray` argument 
immediately after `NDArray::CopyTo` returns.  After this commit, this causes 
CUDA to read from a dangling pointer.
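
   To make the two hazards concrete, here is a minimal host-only sketch of 
the blocking semantics that callers relied on.  It is not TVM or CUDA code: 
`FakeStream`, `FakeMemcpyAsync`, and `CopyToBlocking` are hypothetical names 
invented for illustration, and the "stream" is a single-threaded queue that 
only runs its work when `Synchronize()` is called, standing in for 
`cudaMemcpyAsync` followed by a stream synchronization.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical single-threaded model of a device stream: work is queued by
// Enqueue() and only executes when Synchronize() is called.
struct FakeStream {
  std::vector<std::function<void()>> pending;

  void Enqueue(std::function<void()> work) {
    pending.push_back(std::move(work));
  }

  // Stand-in for a stream synchronization: run everything queued so far.
  void Synchronize() {
    for (auto& work : pending) work();
    pending.clear();
  }
};

// Stand-in for an asynchronous copy: returns before any bytes have moved.
// Both `dst` and `src` must stay valid until the stream is synchronized.
void FakeMemcpyAsync(void* dst, const void* src, std::size_t nbytes,
                     FakeStream& stream) {
  stream.Enqueue([dst, src, nbytes] { std::memcpy(dst, src, nbytes); });
}

// Copy with blocking semantics: on return, the result is fully written and
// the queued copy no longer refers to `src`.
std::vector<float> CopyToBlocking(const std::vector<float>& src,
                                  FakeStream& stream) {
  std::vector<float> dst(src.size(), 0.0f);
  FakeMemcpyAsync(dst.data(), src.data(), src.size() * sizeof(float), stream);
  stream.Synchronize();  // the synchronization point before returning
  assert(stream.pending.empty());
  return dst;
}
```

   With the `Synchronize()` call in place, a caller can read the returned 
vector and release the source as soon as `CopyToBlocking` returns.  In this 
model, removing that call would leave the queued copy holding pointers to 
both buffers after the function returns, which mirrors the two bullets above.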
   
   This function is used in many locations which relied on the previous 
semantics.
   
   * The 
[`"vm.builtin.to_device"`](https://github.com/apache/tvm/blob/main/src/runtime/relax_vm/builtin.cc#L418)
 PackedFunc, which is the lowered form of `R.to_device`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
