yuewu001 opened a new issue #9557: update_on_kvstore error setting with 
multiple machines
URL: https://github.com/apache/incubator-mxnet/issues/9557
 
 
   When training with multiple machines, I found that the 
[model.py:_create_kvstore](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/model.py) 
function sets update_on_kvstore to True. In the gluon interface (trainer.py), 
however, I found the following code: 
   ```python
    if 'dist' in kvstore.type:
       update_on_kvstore = False
       for i, param in enumerate(self._params):
           param_arrays = param.list_data()
           kvstore.init(i, param_arrays[0])
           kvstore.pull(i, param_arrays, priority=-i)
   ```
   In module.py, by contrast, update_on_kvstore is not set to False. 
   
   Is this a bug?
   
   Besides, the gluon interface pulls all param_arrays regardless of the value 
of update_on_kvstore, but in the Python interface (model.py) the params are 
pulled only when update_on_kvstore is True. Is there a reason for this difference?
   
   ```python
   def _initialize_kvstore(kvstore, param_arrays, arg_params, param_names,
                           update_on_kvstore):
       """Initialize kvstore"""
       for idx, param_on_devs in enumerate(param_arrays):
           name = param_names[idx]
           kvstore.init(name, arg_params[name])
   
           if update_on_kvstore:
               kvstore.pull(name, param_on_devs, priority=-idx)
   ```
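
To make the difference concrete, here is a minimal sketch that does not use MXNet at all. `RecordingKVStore` is a hypothetical stand-in that only records the calls it receives, and `init_gluon_style` / `init_model_style` are illustrative names that mirror the two snippets quoted in this issue. It shows only which calls each path makes during initialization, not what a real distributed kvstore does.

```python
class RecordingKVStore:
    """Hypothetical stand-in for a kvstore; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def init(self, key, value):
        self.calls.append(("init", key))

    def pull(self, key, out=None, priority=0):
        self.calls.append(("pull", key))


def init_gluon_style(kvstore, param_arrays):
    # Mirrors the trainer.py snippet: pull unconditionally after init.
    for i, arrays in enumerate(param_arrays):
        kvstore.init(i, arrays[0])
        kvstore.pull(i, arrays, priority=-i)


def init_model_style(kvstore, param_arrays, arg_params, param_names,
                     update_on_kvstore):
    # Mirrors _initialize_kvstore: pull only when update_on_kvstore is True.
    for idx, param_on_devs in enumerate(param_arrays):
        name = param_names[idx]
        kvstore.init(name, arg_params[name])
        if update_on_kvstore:
            kvstore.pull(name, param_on_devs, priority=-idx)


def count_pulls(kvstore):
    """Count how many pull calls the stand-in kvstore recorded."""
    return sum(1 for op, _ in kvstore.calls if op == "pull")


# Two parameters, each replicated on two (pretend) devices.
param_arrays = [[[0.0], [0.0]], [[1.0], [1.0]]]
param_names = ["w", "b"]
arg_params = {"w": [0.0], "b": [1.0]}

gluon_kv = RecordingKVStore()
init_gluon_style(gluon_kv, param_arrays)

model_kv = RecordingKVStore()
init_model_style(model_kv, param_arrays, arg_params, param_names,
                 update_on_kvstore=False)

print(count_pulls(gluon_kv), count_pulls(model_kv))  # prints "2 0"
```

With update_on_kvstore=False, the model.py-style path never pulls after init, while the gluon-style path pulls every parameter once.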
