eric-haibin-lin commented on a change in pull request #16893: Multi-tensor LAMB
URL: https://github.com/apache/incubator-mxnet/pull/16893#discussion_r352876302
 
 

 ##########
 File path: python/mxnet/optimizer/optimizer.py
 ##########
 @@ -1051,6 +1052,95 @@ def update_multi_precision(self, index, weight, grad, state):
         self._update_impl(index, weight, grad, state,
                           multi_precision=use_multi_precision)
 
+@register
+class MultiLAMB(Optimizer):
+    """multiLAMB optimizer.
+    """
+    def __init__(self, learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-6,
 
 Review comment:
   Can we set the default values to lower_bound=None, upper_bound=None, and bias_correction=True? Both TF and PyTorch default to the variant that does not bound the value of r1 and that performs bias correction. We should use the same interface as the existing LAMB optimizer:
https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/optimizer/optimizer.py#L1249-L1254
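   For context, a minimal sketch of the bias correction the comment refers to (the Adam/LAMB-style rescaling of the moment estimates; function and variable names here are illustrative, not the actual MXNet code):

```python
# Hedged sketch: bias correction divides the running first/second moment
# estimates by (1 - beta^t) so early steps are not biased toward zero.
def bias_corrected_moments(m, v, beta1, beta2, t):
    """Return bias-corrected moment estimates at step t (t >= 1)."""
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    return m_hat, v_hat

# At step 1 with beta1=0.9, a raw first moment of 0.05 is rescaled to
# 0.05 / (1 - 0.9) = 0.5, undoing the initialization-at-zero bias.
m_hat, v_hat = bias_corrected_moments(0.05, 0.001, 0.9, 0.999, 1)
```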
   
   I noticed that the GPU kernel always performs upper/lower bound clipping on r1. Can we add a branch that skips the clipping when the upper/lower bound is -1?
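
   The suggested branch could look roughly like the following sketch (a hypothetical host-side illustration, not the actual GPU kernel; `lamb_trust_ratio` and its parameters are made-up names), where clipping of the weight norm r1 only happens when a bound is actually supplied:

```python
import math

# Hedged sketch of the requested behaviour: skip the r1 bound clipping
# entirely when lower_bound/upper_bound are left unset, matching the
# unbounded TF/PyTorch default described above.
def lamb_trust_ratio(weight, update, lower_bound=None, upper_bound=None):
    """Compute the LAMB layer-wise trust ratio r1/r2 with optional clipping."""
    r1 = math.sqrt(sum(x * x for x in weight))  # norm of the weights
    r2 = math.sqrt(sum(x * x for x in update))  # norm of the proposed update
    # The branch suggested in the review: only clip when bounds are given.
    if lower_bound is not None:
        r1 = max(r1, lower_bound)
    if upper_bound is not None:
        r1 = min(r1, upper_bound)
    # When either norm is zero, fall back to a ratio of 1.
    return r1 / r2 if r1 > 0 and r2 > 0 else 1.0

w = [3.0, 4.0]   # ||w|| = 5
u = [0.0, 2.0]   # ||u|| = 2
unclipped = lamb_trust_ratio(w, u)                    # 5 / 2 = 2.5
clipped = lamb_trust_ratio(w, u, upper_bound=4.0)     # min(5, 4) / 2 = 2.0
```

In the kernel itself the same effect could be achieved with the -1 sentinel mentioned above (clip only when the bound is >= 0), since the kernel signature presumably cannot take None.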

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services