Actually, the current design structure is very similar to kvstore_nccl, as shown in the attached picture.
I have updated the proposal in a Google doc as well; it's easier to add comments and make modifications there. https://docs.google.com/document/d/1e4anwDiS18cWP49FAghU6tqqdtnRKUcbNJJxvhIfvIA/edit#heading=h.t762l56r1094

Thanks,

--Patric

From: Ye, Zhouhai
Sent: Tuesday, March 27, 2018 4:30 PM
To: 'Nan Zhu' <zhunanmcg...@gmail.com>; 'dev@mxnet.incubator.apache.org' <dev@mxnet.incubator.apache.org>
Cc: 'Li, Mu' <m...@amazon.com>; Lv, Tao A <tao.a...@intel.com>; Ma, Guokai <guokai...@intel.com>; 'Rahul Huilgol' <rahulhuil...@gmail.com>; Ye, Jason Y <jason.y...@intel.com>; Zhang, Rong A <rong.a.zh...@intel.com>; Zhao, Patric <patric.z...@intel.com>
Subject: RE: Extend MXNET distributed training with MPI AllReduce

For our current POC:

b. Add mpi.kvstore in Python. It depends on the new mxnet submodule mpi_collectives, a C++ library that in turn depends on mxnet. (This adds a new type of kvstore at the Python layer.)

mpi_collectives doesn't need to be a separate C++ library; its source code can be compiled into libmxnet.so.

From: Ye, Zhouhai
Sent: Tuesday, March 27, 2018 11:21 AM
To: Nan Zhu <zhunanmcg...@gmail.com>; dev@mxnet.incubator.apache.org
Cc: Li, Mu <m...@amazon.com>; Lv, Tao A <tao.a...@intel.com>; Ma, Guokai <guokai...@intel.com>; Rahul Huilgol <rahulhuil...@gmail.com>; Ye, Jason Y <jason.y...@intel.com>; Zhang, Rong A <rong.a.zh...@intel.com>; Zhao, Patric <patric.z...@intel.com>
Subject: RE: Extend MXNET distributed training with MPI AllReduce

You can check the mpi.kvstore API spec in our design doc. For example, we add pushpull and broadcast interfaces and disable the original push and pull in the new kvstore.
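As a rough illustration of those semantics, here is a toy, single-process mock of the interface described above. `MockMPIKVStore` and everything inside it are hypothetical stand-ins, not the actual mpi.kvstore API; a real implementation would invoke MPI collectives (MPI_Bcast, MPI_Allreduce) across processes rather than loop over in-memory lists.

```python
# Toy, single-process illustration of the pushpull/broadcast semantics
# described above. All names here are hypothetical; the real mpi.kvstore
# would call MPI collectives across worker processes.

class MockMPIKVStore:
    def __init__(self, num_workers):
        self.num_workers = num_workers

    def broadcast(self, root_value):
        # Rank 0's initial weights are replicated to every worker,
        # so all workers start from identical parameters.
        return [list(root_value) for _ in range(self.num_workers)]

    def pushpull(self, worker_grads):
        # Fuses the old push (send gradient) and pull (receive result):
        # gradients are summed element-wise across workers, and every
        # worker gets the same reduced tensor back.
        reduced = [sum(vals) for vals in zip(*worker_grads)]
        return [list(reduced) for _ in range(self.num_workers)]

kv = MockMPIKVStore(num_workers=4)
weights = kv.broadcast([1.0, 2.0])   # every worker starts with [1.0, 2.0]
grads = [[1.0, 2.0]] * 4             # each worker computes a local gradient
summed = kv.pushpull(grads)          # every worker receives [4.0, 8.0]
```

The key contrast with the original kvstore is that there is no separate server role: pushpull is a single collective step in which every worker both contributes and receives.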
From: Ye, Zhouhai
Sent: Tuesday, March 27, 2018 11:18 AM
To: 'Nan Zhu' <zhunanmcg...@gmail.com>; dev@mxnet.incubator.apache.org
Cc: Li, Mu <m...@amazon.com>; Lv, Tao A <tao.a...@intel.com>; Ma, Guokai <guokai...@intel.com>; Rahul Huilgol <rahulhuil...@gmail.com>; Ye, Jason Y <jason.y...@intel.com>; Zhang, Rong A <rong.a.zh...@intel.com>; Zhao, Patric <patric.z...@intel.com>
Subject: RE: Extend MXNET distributed training with MPI AllReduce

Hi, Nan Zhu

As described in our design doc, there are two possible code structures (implementations); our POC currently implements the second:

a. Implement mpi.kvstore at the same level as the current kvstores (C++, src/kvstore), adhering to the original kvstore factory pattern.

b. Add mpi.kvstore in Python. It depends on the new mxnet submodule mpi_collectives, a C++ library that in turn depends on mxnet. (This adds a new type of kvstore at the Python layer.)

For your second question, I think making a single communication submodule is OK (just like a.), but a unified abstraction covering both PS and AllReduce would be very hard.
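Option (b), adding the new kvstore type at the Python layer, can be sketched in miniature. Every name below (`BaseKVStore`, `MPIKVStore`, `create`) is a hypothetical stand-in chosen for illustration, not MXNet's real classes; the point is only that the dispatch happens in Python before falling back to the existing factory.

```python
# Minimal sketch of option (b): intercepting a new "mpi" kvstore type at
# the Python layer instead of extending the C++ kvstore factory.
# All class and function names are hypothetical illustrations.

class BaseKVStore:
    """Stand-in for the existing kvstore types created via the C++ factory."""
    def __init__(self, name):
        self.type = name

class MPIKVStore(BaseKVStore):
    """Python-layer kvstore that would be backed by the mpi_collectives
    library; a real version would expose pushpull/broadcast and disable
    the original push/pull."""
    def __init__(self):
        super().__init__("mpi")

def create(name="local"):
    # Python-layer dispatch: handle the new type here, fall back to the
    # original factory path for "local", "dist_sync", etc.
    if name == "mpi":
        return MPIKVStore()
    return BaseKVStore(name)

kv = create("mpi")
```

This keeps the C++ factory untouched, which is why the POC could treat mpi_collectives as a separate submodule.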
From: Nan Zhu [mailto:zhunanmcg...@gmail.com]
Sent: Tuesday, March 27, 2018 10:39 AM
To: dev@mxnet.incubator.apache.org
Cc: Li, Mu <m...@amazon.com>; Lv, Tao A <tao.a...@intel.com>; Ma, Guokai <guokai...@intel.com>; Rahul Huilgol <rahulhuil...@gmail.com>; Ye, Jason Y <jason.y...@intel.com>; Ye, Zhouhai <zhouhai...@intel.com>; Zhang, Rong A <rong.a.zh...@intel.com>; Zhao, Patric <patric.z...@intel.com>
Subject: Re: Extend MXNET distributed training with MPI AllReduce

Hi, Patric

It's pretty nice work! A question: what would the future code structure look like with this allreduce module added as a submodule? Would we have two communication submodules? Is there any plan to provide a unified abstraction for communication so that a single communication submodule is possible?

Best,

Nan

On Mon, Mar 26, 2018 at 7:20 PM, Chris Olivier <cjolivie...@gmail.com> wrote:

great! nice work!

On Mon, Mar 26, 2018 at 6:31 PM Zhao, Patric <patric.z...@intel.com> wrote:

> Hi MXNET owners/developers,
>
> As you know, AllReduce and Parameter Server are two very popular
> distributed training modes in DL.
>
> Currently, MXNET only supports the parameter server mode and lacks an
> AllReduce mode. Other frameworks, such as TensorFlow, PyTorch, and Caffe,
> can work with AllReduce.
> Based on our analysis and experiments, AllReduce mode achieves better
> scalability and efficiency.
>
> So, we propose to extend MXNET distributed training with an MPI AllReduce
> mode.
> We have implemented an AllReduce prototype in MXNET and the results are
> very positive.
> AllReduce mode reaches 94.7% scaling efficiency with 8 compute nodes for
> VGG16, while the Parameter Server mode requires 16 nodes in total (8
> compute nodes + 8 parameter servers) to reach 93.2%.
>
> The whole proposal is available in the MXNET wiki. Any feedback is highly
> appreciated.
>
> https://cwiki.apache.org/confluence/display/MXNET/Extend+MXNet+Distributed+Training+by+MPI+AllReduce
>
> Thanks in advance.
>
> BR,
>
> --Patric
>
>
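The scaling advantage quoted above comes from the communication pattern: in a bandwidth-optimal ring all-reduce (the algorithm MPI implementations commonly use for large messages), each node transfers roughly 2(p-1)/p times the gradient size, nearly independent of the node count p, and no dedicated server nodes are needed. Below is a toy, single-process simulation of that pattern; it is not the proposal's actual code (which relies on the MPI library's allreduce), just an illustration of the two phases, assuming the gradient length is divisible by p.

```python
# Toy in-process simulation of ring all-reduce: p nodes each own a
# gradient vector split into p chunks; after 2*(p-1) neighbor-to-neighbor
# steps, every node holds the element-wise sum of all gradients.

def ring_allreduce(grads):
    p = len(grads)               # number of nodes in the ring
    n = len(grads[0])            # gradient length; assume n % p == 0
    chunk = n // p
    data = [list(g) for g in grads]

    def idx(c):                  # element indices of chunk c
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1: reduce-scatter. After p-1 steps, node i holds the fully
    # summed chunk (i + 1) % p; each step sends only n/p elements.
    for step in range(p - 1):
        for i in range(p):
            c = (i - step) % p   # chunk node i forwards this step
            dst = (i + 1) % p    # right neighbor in the ring
            for j in idx(c):
                data[dst][j] += data[i][j]

    # Phase 2: allgather. Completed chunks circulate around the ring
    # until every node has every summed chunk.
    for step in range(p - 1):
        for i in range(p):
            c = (i + 1 - step) % p
            dst = (i + 1) % p
            for j in idx(c):
                data[dst][j] = data[i][j]

    return data

grads = [[1, 2, 3, 4], [5, 6, 7, 8]]   # 2 nodes, 4 elements each
result = ring_allreduce(grads)         # every node ends with [6, 8, 10, 12]
```

Because per-node traffic stays near 2x the gradient size regardless of p, adding compute nodes does not create the central bottleneck that forces the parameter-server mode to add server nodes alongside them.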