jmorrill opened a new pull request #4985: Implemented kDLCPUPinned 
(cudaMallocHost)
URL: https://github.com/apache/incubator-tvm/pull/4985
 
 
   Memory allocated with cudaMallocHost is page-locked (pinned), which generally 
makes transfers to and from a CUDA device faster than with pageable host 
memory. The TVM runtime did not support it.
   
   Each DeviceAPI treats a DLDeviceType as its own device, so kDLCPUPinned felt 
a little out of place: it is host memory, like kDLCPU, but it is allocated and 
owned through the CUDA API, like kDLGPU.
   
   The least complicated path seemed to be registering "device_api.cpu_pinned" 
as an alias for "device_api.gpu" and implementing the kDLCPUPinned logic in 
CUDADeviceAPI.
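
   As a rough illustration of the alias idea only (this is not the code in the 
PR), here is a minimal, self-contained sketch. Every name in it (DeviceType, 
MemoryKind, Allocation, MockCUDADeviceAPI, Registry, MakeRegistry) is 
hypothetical, and std::malloc/std::free stand in for the CUDA runtime calls so 
that the sketch builds without a GPU. It assumes the registry is a simple 
name-to-object map, which may not match how the TVM runtime looks up device 
APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the DLPack device types named in the PR.
enum DeviceType { kCPU, kGPU, kCPUPinned };

// Which kind of memory the mock "allocated"; real code would call CUDA.
enum class MemoryKind { kDevice, kPinnedHost };

struct Allocation {
  void* data;
  MemoryKind kind;
};

// Minimal device API interface (illustrative, not TVM's actual DeviceAPI).
class DeviceAPI {
 public:
  virtual ~DeviceAPI() = default;
  virtual Allocation Alloc(DeviceType type, std::size_t nbytes) = 0;
  virtual void Free(DeviceType type, Allocation alloc) = 0;
};

// One API object serves both device types and branches on the type passed in.
class MockCUDADeviceAPI : public DeviceAPI {
 public:
  Allocation Alloc(DeviceType type, std::size_t nbytes) override {
    assert(type == kGPU || type == kCPUPinned);
    // Real code: cudaMallocHost for pinned host memory, cudaMalloc otherwise.
    MemoryKind kind =
        (type == kCPUPinned) ? MemoryKind::kPinnedHost : MemoryKind::kDevice;
    return Allocation{std::malloc(nbytes), kind};
  }

  void Free(DeviceType type, Allocation alloc) override {
    assert(type == kGPU || type == kCPUPinned);
    // Real code: cudaFreeHost for pinned host memory, cudaFree otherwise.
    std::free(alloc.data);
  }
};

// Name-to-API registry; an alias is the same object stored under a second name.
class Registry {
 public:
  void Register(const std::string& name, std::shared_ptr<DeviceAPI> api) {
    table_[name] = std::move(api);
  }

  // Returns nullptr when no API is registered under the given name.
  std::shared_ptr<DeviceAPI> Get(const std::string& name) const {
    auto it = table_.find(name);
    return it == table_.end() ? nullptr : it->second;
  }

 private:
  std::map<std::string, std::shared_ptr<DeviceAPI>> table_;
};

// Registers the mock CUDA API under both names used in the PR description.
Registry MakeRegistry() {
  Registry registry;
  auto cuda = std::make_shared<MockCUDADeviceAPI>();
  registry.Register("device_api.gpu", cuda);
  registry.Register("device_api.cpu_pinned", cuda);
  return registry;
}
```

   In this sketch a lookup of either name returns the same object, and the 
device type passed to Alloc and Free selects the pinned-host or device path. 
The trade-off is that the CUDA API object must handle a device type that is 
nominally host memory.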
   
   A few small checks also had to be modified. I am not sure whether I missed 
any.
   
   Open to suggestions if my implementation is way off.
