threeleafzerg commented on issue #10696: [MXNET-366] Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#issuecomment-389170346 @eric-haibin-lin Hi Haibin, I have finished the code modifications according to your comments. If you have any questions, please let me know. Thanks!
Note: Following our internal (Intel) code review, I changed the following names:
dist_sync_mpi -> dist_sync_allreduce
mpi_collectives -> collectives
MPI_Wrapper -> COLL_Wrapper
The reason is that the collectives can be implemented with libraries other than MPI (e.g. the nccl library). The corresponding design doc has already been updated: https://docs.google.com/document/d/1e4anwDiS18cWP49FAghU6tqqdtnRKUcbNJJxvhIfvIA/edit#heading=h.t762l56r1094
Gluon is also supported.
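For context, here is a minimal sketch of how a user might select the allreduce-based kvstore from a Gluon training script, assuming the PR registers it under the `dist_sync_allreduce` name mentioned above and that it plugs into the standard `mx.kv.create` / `gluon.Trainer` path; the surrounding model and training-loop code is ordinary MXNet Gluon and purely illustrative, not taken from the PR.

```python
# Illustrative sketch only: assumes the PR registers the new kvstore
# type under the name "dist_sync_allreduce" (from the renaming above),
# analogous to the existing "dist_sync" type.
import mxnet as mx
from mxnet import autograd, gluon

# Create the kvstore with the new allreduce type (assumption).
kv = mx.kv.create("dist_sync_allreduce")

net = gluon.nn.Dense(10)
net.initialize(mx.init.Xavier())

# Passing the kvstore to the Trainer routes gradient aggregation
# through the collectives (COLL_Wrapper) path instead of the
# parameter-server path.
trainer = gluon.Trainer(net.collect_params(), "sgd",
                        {"learning_rate": 0.01}, kvstore=kv)

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
data = mx.nd.random.uniform(shape=(32, 20))
label = mx.nd.random.uniform(0, 10, shape=(32,)).astype("int64")

with autograd.record():
    out = net(data)
    loss = loss_fn(out, label)
loss.backward()
trainer.step(batch_size=32)
```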
