[GitHub] ptrendx opened a new pull request #13764: Less cudaGet/SetDevice calls in Gluon execution

GitBox Wed, 02 Jan 2019 14:58:07 -0800

ptrendx opened a new pull request #13764: Less cudaGet/SetDevice calls in Gluon 
execution
URL: https://github.com/apache/incubator-mxnet/pull/13764
 
 
   ## Description ##
   This PR reduces the number of cudaGetDevice/cudaSetDevice calls during Gluon 
execution. 
   Previously, on every call to allocate or free a buffer in StorageManager, 
DeviceStore would call cudaGetDevice once and cudaSetDevice twice (to get the 
current device, set the new device, and finally restore the original device), 
even if no actual allocation took place (because the caching allocator served 
the request).
   
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
    - This PR changes the DeviceStore so that cudaSetDevice calls are made only 
when necessary (when the new device differs from the original device)
    - This PR changes the Storage so that only the actual GPU and pinned CPU 
allocations are guarded by the DeviceStore. Memory returned from the allocator 
cache does not need to be guarded.
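
   The two changes above can be illustrated with a minimal, self-contained 
sketch. It is not the code from this PR: `fake_cuda`, `DeviceGuard` and 
`PooledAllocator` are hypothetical names, the `fake_cuda` functions are 
counting stand-ins for cudaGetDevice/cudaSetDevice so the example builds 
without a GPU, and `std::malloc` stands in for the real device allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Counting stand-ins for the CUDA runtime (assumption: real code would call
// cudaGetDevice / cudaSetDevice and check their return codes).
namespace fake_cuda {
int current_device = 0;
int get_calls = 0;
int set_calls = 0;
void GetDevice(int* dev) { ++get_calls; *dev = current_device; }
void SetDevice(int dev) { ++set_calls; current_device = dev; }
}  // namespace fake_cuda

// Hypothetical RAII guard: switches device only when the requested device
// differs from the current one, and restores only if it actually switched.
class DeviceGuard {
 public:
  explicit DeviceGuard(int requested) : restore_(-1), switched_(false) {
    fake_cuda::GetDevice(&restore_);
    if (requested != restore_) {
      fake_cuda::SetDevice(requested);
      switched_ = true;
    }
  }
  ~DeviceGuard() {
    if (switched_) fake_cuda::SetDevice(restore_);
  }
  DeviceGuard(const DeviceGuard&) = delete;
  DeviceGuard& operator=(const DeviceGuard&) = delete;

 private:
  int restore_;
  bool switched_;
};

// Hypothetical caching allocator: a cache hit returns immediately, so only a
// real allocation (cache miss) pays for the device query and switch.
class PooledAllocator {
 public:
  explicit PooledAllocator(int device) : device_(device) {}
  ~PooledAllocator() {
    for (auto& entry : pool_)
      for (void* p : entry.second) std::free(p);
  }
  PooledAllocator(const PooledAllocator&) = delete;
  PooledAllocator& operator=(const PooledAllocator&) = delete;

  void* Alloc(std::size_t size) {
    auto it = pool_.find(size);
    if (it != pool_.end() && !it->second.empty()) {
      void* p = it->second.back();
      it->second.pop_back();
      return p;  // cache hit: no guard needed
    }
    DeviceGuard guard(device_);  // cache miss: guard the real allocation
    return std::malloc(size);    // stand-in for the device allocation
  }

  // Returns the block to the cache; no device call is involved.
  void Free(void* p, std::size_t size) { pool_[size].push_back(p); }

 private:
  int device_;
  std::map<std::size_t, std::vector<void*>> pool_;
};
```

   In this sketch, the first `Alloc` on a device other than the current one 
costs one get and two set calls, a repeated `Alloc` of the same size after a 
`Free` costs none, and a guard for the already-current device costs only the 
get call.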
   
   ## Comments ##
   - There is one more place that introduces potentially needless calls to 
cudaSetDevice when using Gluon: 
https://github.com/apache/incubator-mxnet/blob/master/src/engine/threaded_engine_perdevice.cc#L99
 is called when a temporary ThreadedOpr object is destroyed after either normal 
or bulked execution of ops. It seems, though, that this should be handled by 
caching ThreadedOpr objects and engine variables (since destroying them 
after every op seems to be quite costly). @eric-haibin-lin FYI


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
