sandeep-krishnamurthy commented on issue #12713: distributed kvstore bug in MXNet
URL: 
https://github.com/apache/incubator-mxnet/issues/12713#issuecomment-435773777
 
 
   > * Initializing `trainer = gluon.Trainer(update_on_kvstore=True)` doesn't
   >   work. Inspecting `trainer._update_on_kvstore` shows that the value is
   >   still set to `False`.
   
   This is fixed.
   
   > * When a distributed kvstore is used, `gluon.Trainer` by default doesn't
   >   work with `mx.optimizer.LRScheduler` if a worker has more than one GPU.
   >   More specifically, the trainer updates once per GPU, but the
   >   `LRScheduler` object is shared across GPUs, so it gets a wrong update
   >   count.
   
   This needs to be fixed.
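
   To illustrate the second point, here is a minimal pure-Python sketch of how
   a shared scheduler can end up with an inflated update count. It does not use
   MXNet. `StepScheduler` and `lr_after` are hypothetical names made up for
   this example, and the model assumes the counter simply advances once per GPU
   per iteration, which is a simplification of what the trainer actually does.

```python
class StepScheduler:
    """Hypothetical stand-in for a step-decay learning-rate scheduler."""

    def __init__(self, base_lr, step, factor):
        self.base_lr = base_lr
        self.step = step
        self.factor = factor

    def __call__(self, num_update):
        # Multiply the rate by `factor` once for every `step` updates seen.
        return self.base_lr * self.factor ** (num_update // self.step)


def lr_after(iterations, num_gpus, scheduler):
    """Learning rate after `iterations` training iterations, assuming the
    shared update counter advances once per GPU in every iteration."""
    num_update = 0
    for _ in range(iterations):
        for _ in range(num_gpus):
            # Assumption: each GPU's update bumps the same shared counter.
            num_update += 1
    return scheduler(num_update)


sched = StepScheduler(base_lr=0.1, step=100, factor=0.5)
lr_one_gpu = lr_after(100, 1, sched)    # counter reaches 100: one decay step
lr_four_gpus = lr_after(100, 4, sched)  # counter reaches 400: four decay steps
print(lr_one_gpu, lr_four_gpus)
```

   Under these assumptions, the same 100 iterations decay the learning rate
   once with one GPU but four times with four GPUs, so the schedule runs ahead
   of the real iteration count.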
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
