Hi Rajan,

This PR from the Intel folks adds support for MPI-based distributed training. They also needed proto3 and have updated the current ps-lite proto file to work with protobuf 3.5. You might want to take a look and align your efforts with that approach.
https://github.com/apache/incubator-mxnet/pull/10696

The ps-lite change:
https://github.com/threeleafzerg/ps-lite/compare/a6dda54604a07d1fb21b016ed1e3f4246b08222a...a470d2270d4af4badf4c94eab9559811697332e3#diff-ba121c714260f51ca98d51a080880b6d

Regards,
Rahul

On Wed, 23 May 2018 at 11:06 Singh, Rajan <[email protected]> wrote:

> Hi,
>
> Currently, MXNet has Protobuf (version 2.5) as one of its dependencies. The
> dependency comes from PS-lite
> <https://github.com/dmlc/ps-lite/blob/a6dda54604a07d1fb21b016ed1e3f4246b08222a/make/deps.mk#L11>,
> which is used for distributed training.
>
> Recently, we added ONNX support to the MXNet (1.2.0) contrib package
> (ONNX import support). This module has a runtime dependency on
> Protobuf (version 3), which ONNX needs.
>
> So, a user who tries to do “import onnx” will get this message:
>
> “To use this module developers need to install ONNX, which requires the
> protobuf compiler to be installed separately. Please follow the
> instructions to install ONNX and its dependencies
> <https://github.com/onnx/onnx#installation>. MXNet currently supports ONNX
> v1.1.1. Once installed, you can go through the tutorials on how to use this
> module.”
>
> The user will end up installing protobuf version 3.5.2. Since Protobuf
> backward compatibility is flaky, anything that depends on a version < 2.6
> will probably break. In this case, distributed training might break for
> the user.
>
> IMO, resolving this dependency conflict in MXNet would require updating
> the PS-lite dependency to Protobuf version 3. Is there a POA to update this
> dependency for PS-lite?
>
> FYI: We are also working on adding an export module, which will export
> MXNet models to ONNX format and will also have Protobuf version 3 and
> ONNX as runtime dependencies.
>
> Please let me know what the best path forward would be.
>
> Thanks,
> Rajan
