yuxihu opened a new pull request #14431: Add post initialization callback for 
Gluon Parameter
URL: https://github.com/apache/incubator-mxnet/pull/14431
 
 
   In Gluon, initialization of some parameters is deferred until their shapes 
are known, which only happens during the first forward pass. This makes it 
difficult to sync parameters among workers in distributed training with MXNet + 
Horovod when the workers use different random seeds. More details can be found 
in this [Horovod issue](https://github.com/horovod/horovod/issues/895). 
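
To make the problem concrete, here is a minimal pure-Python sketch. It does not use MXNet or Horovod; `DeferredParam` and `broadcast` are hypothetical stand-ins invented for illustration. It assumes that a broadcast issued before the first forward pass simply finds nothing to send, so each worker later initializes from its own seed and the values diverge.

```python
import random


class DeferredParam:
    """Toy stand-in for a parameter whose shape is unknown until first use."""

    def __init__(self, name):
        self.name = name
        self.data = None  # not allocated until the shape is known

    def forward(self, in_units, rng):
        # The input size is only known here, so initialization happens here.
        if self.data is None:
            self.data = [rng.uniform(-1, 1) for _ in range(in_units)]
        return self.data


def broadcast(params_per_worker):
    """Copy worker 0's initialized values to the others; return synced names."""
    synced = []
    root = params_per_worker[0]
    for name, param in root.items():
        if param.data is None:
            continue  # deferred: nothing to send yet
        for other in params_per_worker[1:]:
            other[name].data = list(param.data)
        synced.append(name)
    return synced


# Two simulated workers, each with its own seed.
workers = [{"weight": DeferredParam("weight")} for _ in range(2)]
synced_before = broadcast(workers)  # runs before any forward pass

rngs = [random.Random(rank) for rank in range(2)]
for worker, rng in zip(workers, rngs):
    worker["weight"].forward(4, rng)

diverged = workers[0]["weight"].data != workers[1]["weight"].data
```

In this sketch `synced_before` is empty and `diverged` is true: the early broadcast had nothing to sync, and the workers end up with different weights.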
   
   In this PR, we add a `post_init_callback` to Gluon Parameter, which is 
called after the parameter is initialized. With this callback, parameters can 
be synced among workers as soon as they are initialized. Typical usage of this 
callback with Horovod looks like this:
   ```
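   import random
   import mxnet as mx
   import horovod.mxnet as hvd
   hvd.init()
   random.seed(hvd.local_rank())  # different seed per worker
   data = ..
   model = mx.gluon.nn.Dense(10)
   model.initialize(post_init_callback=hvd.broadcast_parameters)  # sync once initialized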
   ```
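
The callback mechanism itself can be sketched without MXNet or Horovod. The toy below is an illustration under stated assumptions, not the implementation in this PR: the `Parameter` class, `finish_deferred_init`, `make_broadcast`, and the idea that the callback receives the parameter as its only argument are all hypothetical. Workers are simulated sequentially in one process, with rank 0 acting as the root.

```python
import random


class Parameter:
    """Toy parameter with deferred init and a hypothetical post-init hook."""

    def __init__(self, name, post_init_callback=None):
        self.name = name
        self.data = None
        self.post_init_callback = post_init_callback

    def finish_deferred_init(self, size, rng):
        # Called once the shape is known, e.g. during the first forward pass.
        self.data = [rng.uniform(-1, 1) for _ in range(size)]
        # Fire the hook right after initialization (assumed signature).
        if self.post_init_callback is not None:
            self.post_init_callback(self)


ROOT_VALUES = {}  # simulated broadcast buffer shared by all workers


def make_broadcast(rank):
    """Build a callback that mimics a broadcast from rank 0."""

    def broadcast_from_root(param):
        if rank == 0:
            ROOT_VALUES[param.name] = list(param.data)  # root publishes
        else:
            param.data = list(ROOT_VALUES[param.name])  # others overwrite

    return broadcast_from_root


params = []
for rank in range(3):
    p = Parameter("weight", post_init_callback=make_broadcast(rank))
    p.finish_deferred_init(4, random.Random(rank))  # per-worker seed
    params.append(p)

all_equal = all(p.data == params[0].data for p in params)
```

Because the hook runs at the moment each parameter is materialized, every simulated worker ends up with the root's values even though each started from a different seed.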
   
