threeleafzerg commented on issue #10696: [MXNET-366]Extend MXNet Distributed 
Training by AllReduce
URL: https://github.com/apache/incubator-mxnet/pull/10696#issuecomment-389170346
 
 
   @eric-haibin-lin  
   Hi Haibin, I have finished modifying the code according to your 
comments. If you have any questions, please let me know. Thanks!
   Note: Following our internal code review at Intel, I renamed the following:
   dist_sync_mpi -> dist_sync_allreduce
   mpi_collectives -> collectives
   MPI_Wrapper -> COLL_Wrapper
   The reason is that the collectives can be implemented with libraries other 
than MPI (e.g. the NCCL library). 
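
   To illustrate the motivation for the backend-neutral names, here is a minimal Python sketch, not the PR's actual C++ code. It assumes a hypothetical `CollWrapper` interface (loosely echoing the `COLL_Wrapper` name) and a toy in-process backend called `InProcessColl`; both names, and the `sync_step` helper, are invented for illustration. The point is only that the caller depends on the collective operation, not on MPI specifically.

```python
from abc import ABC, abstractmethod
from typing import List


class CollWrapper(ABC):
    """Hypothetical backend-neutral interface (illustrative only)."""

    @abstractmethod
    def allreduce(self, worker_grads: List[List[float]]) -> List[List[float]]:
        """Return, for every worker, the element-wise sum over all workers."""


class InProcessColl(CollWrapper):
    """Toy backend that simulates the workers in one process (no MPI or NCCL)."""

    def allreduce(self, worker_grads: List[List[float]]) -> List[List[float]]:
        # Sum the i-th element of every worker's gradient.
        total = [sum(vals) for vals in zip(*worker_grads)]
        # Every worker receives its own copy of the reduced result.
        return [list(total) for _ in worker_grads]


def sync_step(coll: CollWrapper, worker_grads: List[List[float]]) -> List[List[float]]:
    # The caller sees only the interface, so the backend can be swapped.
    return coll.allreduce(worker_grads)


grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
reduced = sync_step(InProcessColl(), grads)
```

   An MPI-backed or NCCL-backed class could implement the same `allreduce` method without any change to `sync_step`, which is the kind of substitution the renaming is meant to allow.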
   The corresponding design doc has been updated accordingly: 
https://docs.google.com/document/d/1e4anwDiS18cWP49FAghU6tqqdtnRKUcbNJJxvhIfvIA/edit#heading=h.t762l56r1094
   Gluon is also supported.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
