szha opened a new issue #12859: proper support for multi-gpu training in 
clip_global_norm
URL: https://github.com/apache/incubator-mxnet/issues/12859
 
 
   Currently, multi-GPU training is not well supported in clip_global_norm. 
The most straightforward way of using it (i.e. 
`clip_global_norm([p.grad(ctx) for p in params for ctx in contexts])`) is 
incorrect, because it treats the NDArrays that hold the same parameter's 
gradient on different contexts as different, independent parameters. IMO, 
clip_global_norm should take parameters instead of NDArrays so that the 
information about multiple gradient copies on different contexts is preserved.
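
To illustrate why the flattened list gives a different norm, here is a minimal, 
framework-free sketch in plain Python. It is not the MXNet implementation: the 
function names, the dict-of-lists layout, and the choice to sum the per-context 
copies before taking the norm are all assumptions made for illustration 
(whether copies should be summed or averaged depends on how the trainer 
aggregates gradients).

```python
import math

# Hypothetical layout: parameter name -> list of gradient copies, one per context.
# Each copy is a flat list of floats standing in for an NDArray.
grads = {
    "weight": [[3.0, 0.0], [3.0, 0.0]],  # copies on two contexts
    "bias":   [[0.0, 4.0], [0.0, 4.0]],
}

def naive_global_norm(grads_per_param):
    """Norm when every per-context copy is treated as a separate parameter."""
    return math.sqrt(sum(x * x
                         for copies in grads_per_param.values()
                         for g in copies
                         for x in g))

def aggregated_global_norm(grads_per_param):
    """Norm after summing each parameter's copies across contexts (assumption)."""
    total = 0.0
    for copies in grads_per_param.values():
        summed = [sum(xs) for xs in zip(*copies)]  # element-wise sum over contexts
        total += sum(x * x for x in summed)
    return math.sqrt(total)

def clip_by_parameter(grads_per_param, max_norm):
    """Rescale every copy by one shared factor derived from the aggregated norm."""
    norm = aggregated_global_norm(grads_per_param)
    scale = min(1.0, max_norm / (norm + 1e-8))
    clipped = {name: [[x * scale for x in g] for g in copies]
               for name, copies in grads_per_param.items()}
    return clipped, norm

print(naive_global_norm(grads))       # norm over the flattened list of copies
print(aggregated_global_norm(grads))  # norm of the per-parameter aggregated gradients
clipped, norm_before = clip_by_parameter(grads, max_norm=5.0)
print(aggregated_global_norm(clipped))
```

The two norms disagree for the same data, so a clipping threshold applied to 
the flattened list does not bound the gradient that results once the copies 
are combined. Keeping the copies grouped per parameter, as proposed, lets a 
single scale factor be computed from the combined gradient and applied to 
every copy.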
