ctcyang edited a comment on issue #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
URL: https://github.com/apache/incubator-mxnet/pull/10696#issuecomment-401432358

Thanks for your help @threeleafzerg

I was able to build it with USE_ALLREDUCE_DIST_KVSTORE = 1. On AWS EC2 instances using the Deep Learning AMI, you need to do these additional steps:

```
wget https://github.com/google/protobuf/releases/download/v3.5.1/protobuf-cpp-3.5.1.tar.gz && tar --no-same-owner -zxf protobuf-cpp-3.5.1.tar.gz
cd protobuf-3.5.1 && export CFLAGS=-fPIC && export CXXFLAGS=-fPIC && ./configure -prefix=/usr && sudo make -j16 && sudo make -j16 install
conda remove protobuf
conda remove libprotobuf
rm -rf ~/anaconda3/bin/proto* && rm -rf ~/anaconda3/lib/libproto*
sudo apt remove libprotobuf-dev
sudo apt remove libprotobuf-lite9v5
sudo apt remove libprotobuf9v5
sudo apt remove libprotoc9v5
sudo ldconfig
```

When finished, run `ldconfig -p` and verify that `libprotoc.so.9` and `libprotobuf.so.9` no longer appear in the output.

These steps are needed because of ABI incompatibility between the three protobuf versions involved: the version preinstalled with the Deep Learning AMI (Ubuntu apt), the version preinstalled with Anaconda, and the version that gets auto-installed by the Makefile. You need to uninstall the two versions that come with apt and Anaconda.
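The final `ldconfig -p` check can also be scripted rather than read by eye. The following is only a sketch: the function names `stale_protobuf_libs` and `check_protobuf_clean` are made up for illustration, and the pattern assumes the usual `ldconfig -p` line format (`libname.so.N (...) => /path`).

```shell
#!/bin/sh
# Hypothetical helper: print any lines from an `ldconfig -p` style listing
# (read on stdin) that still mention the old protobuf 9 libraries.
stale_protobuf_libs() {
    # Match libprotobuf.so.9 / libprotoc.so.9, but not e.g. libprotobuf.so.90.
    # `|| true` keeps the function from failing when grep finds nothing.
    grep -E 'libproto(buf|c)\.so\.9([^0-9]|$)' || true
}

# Hypothetical helper: succeed only when the listing on stdin is clean.
check_protobuf_clean() {
    stale=$(stale_protobuf_libs)
    if [ -n "$stale" ]; then
        echo "Stale protobuf libraries still registered:" >&2
        echo "$stale" >&2
        return 1
    fi
    echo "No libprotoc.so.9 or libprotobuf.so.9 entries found."
}

# Typical use on the instance, after `sudo ldconfig`:
#   ldconfig -p | check_protobuf_clean
```

If the check fails, the lines it prints show which paths still provide the old libraries, which should indicate whether the apt or the Anaconda copy was left behind.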
