[GitHub] [incubator-mxnet] liuzh91 commented on issue #16900: introduce gradient update handler to the base estimator

GitBox Tue, 26 Nov 2019 00:23:24 -0800

liuzh91 commented on issue #16900: introduce gradient update handler to the 
base estimator
URL: https://github.com/apache/incubator-mxnet/pull/16900#issuecomment-558516425
 
 
   > Thank you for the improvement! 2 concerns.
   > Also, could you point to an example that requires a custom gradient handler? 
(gradient clipping or aggregation)
   
   Thank you for the review.
   
   For the gradient update example, one use case for gradient accumulation 
appears when training a transformer 
(https://github.com/dmlc/gluon-nlp/blob/master/scripts/machine_translation/train_transformer.py#L320).
 Because the transformer network has so many parameters, we can only compute 
gradients for a small batch of examples at a time. In this case, the gradients 
are accumulated over several small batches and applied to the weight 
parameters periodically.
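
To make the accumulation pattern concrete, here is a minimal sketch in plain Python. It is not the estimator API from this PR and not the code in the linked script; the one-parameter model and the names `grad_mse`, `train_with_accumulation`, and `accumulate` are made up for illustration only.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def train_with_accumulation(w, micro_batches, accumulate, lr):
    """Run SGD, applying one weight update per `accumulate` micro-batches.

    Trailing micro-batches that do not fill a complete window are ignored.
    """
    acc = 0.0          # running sum of micro-batch gradients
    num_updates = 0
    for i, batch in enumerate(micro_batches, start=1):
        acc += grad_mse(w, batch)          # accumulate; weights stay unchanged
        if i % accumulate == 0:
            w -= lr * acc / accumulate     # periodic update with the averaged gradient
            acc = 0.0                      # reset the accumulator for the next window
            num_updates += 1
    return w, num_updates


# Hypothetical data: four (x, y) pairs split into two micro-batches of two.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
micro = [data[0:2], data[2:4]]

w_acc, n = train_with_accumulation(0.0, micro, accumulate=2, lr=0.01)

# Reference: a single SGD step on the full batch of four examples.
w_full = 0.0 - 0.01 * grad_mse(0.0, data)
```

With equally sized micro-batches, averaging the accumulated gradients gives the same step as one update on the combined batch, so `w_acc` and `w_full` agree up to floating-point rounding. A gradient update handler would be the natural place to decide when the periodic update happens.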


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
