I do expect the API to change in the future. Currently @szhengac, @zhongyuchen,
and I are exploring APIs for gradient compression with a few algorithms, and we
may bring the best practices back to MXNet.
--
Would it make sense to add optional support for sparse NDArrays and gradient
compression in `AbstractKVStore`? You mentioned that not all frameworks support
them. Do you expect the API to change in the future?
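One way optional support could look (purely a sketch; the `supports_sparse` and `supports_gradient_compression` names are hypothetical and not part of the proposal) is capability flags that let callers query a backend before relying on either feature:

```python
class AbstractKVStore:
    """Sketch only: hypothetical capability flags for optional features."""

    @property
    def supports_sparse(self):
        # Backends that can push/pull row_sparse NDArrays override this.
        return False

    @property
    def supports_gradient_compression(self):
        return False

    def set_gradient_compression(self, compression_params):
        # Mirrors the existing kvstore method name; optional per backend.
        raise NotImplementedError("gradient compression not supported by this backend")
```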
--
I did mean use cases 2, 3, and 4.
Initialization is done in the constructor `kv.__init__()`; for Horovod it could
simply be an `hvd.init()` call.
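As a minimal sketch of that, assuming the proposed `AbstractKVStore` base class (stubbed here so the snippet is self-contained; the `HorovodKVStore` name is hypothetical):

```python
import horovod.mxnet as hvd

class AbstractKVStore:  # stand-in stub for the proposed interface
    pass

class HorovodKVStore(AbstractKVStore):
    def __init__(self):
        hvd.init()                      # all connection setup for the Horovod backend
        self._rank = hvd.rank()         # cache topology info for later use
        self._num_workers = hvd.size()
```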
I have not discussed problem 1 in much detail. Horovod uses `mpirun` to set up
connections and launch processes, while BytePS/P3 and the native KVStore rely on
environment variables set by their own launchers.
--
In the Limitations section, I suppose you meant 'use cases 1, 3, 4', right?
--
## Background
Data-parallel training is the most common distributed training technique when
it comes to multiple GPUs or multiple hosts. Currently, several communication
backends provide functionality for communicating tensors across devices/hosts
for data-parallel training. For MXNet, the available options include the native
KVStore, Horovod, and BytePS/P3.
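For context, the existing KVStore API follows a push/pull pattern; a minimal sketch (using `'local'` so it runs on a single machine, where `'dist_sync'` would be the distributed variant):

```python
import mxnet as mx

kv = mx.kv.create('local')              # 'dist_sync' in a launched cluster
shape = (2, 3)
kv.init('weight', mx.nd.zeros(shape))   # register the parameter once

grad = mx.nd.ones(shape)
kv.push('weight', grad)                 # each worker pushes its local gradient
out = mx.nd.zeros(shape)
kv.pull('weight', out=out)              # pull back the aggregated/updated value
```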