Janardhan created SYSTEMML-2018:
-----------------------------------
Summary: Fixing weight decay regularization in Adam
Key: SYSTEMML-2018
URL: https://issues.apache.org/jira/browse/SYSTEMML-2018
Project: SystemML
Issue Type: Improvement
Components: Algorithms
Reporter: Janardhan
Common implementations of adaptive gradient algorithms, such as Adam, limit
the potential benefit of weight decay regularization, because the weights do
not decay multiplicatively (as would be expected for standard weight decay)
but only by an additive constant factor.
The following paper found a way to fix weight decay regularization in Adam
with one additional term (+ λ·w, the decoupled weight decay) applied directly
in the parameter update step rather than added to the gradient:
https://arxiv.org/pdf/1711.05101.pdf
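
A minimal NumPy sketch (not SystemML DML, and not based on the existing
nn/optim/adam.dml script) contrasting the conventional L2-regularized Adam
gradient with the decoupled weight decay update from the paper; the
hyper-parameter names (lr, beta1, beta2, eps, lambda_wd) are illustrative
assumptions, not SystemML API:

{code:python}
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, lambda_wd=0.01, decoupled=True):
    if not decoupled:
        # Conventional approach: fold the decay term into the gradient,
        # where it gets rescaled by the adaptive denominator below.
        grad = grad + lambda_wd * w
    # Standard Adam moment estimates with bias correction (t starts at 1).
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    if decoupled:
        # Proposed fix: decay the weights directly in the update step,
        # outside the adaptive scaling, so the decay stays multiplicative.
        w = w - lr * lambda_wd * w
    return w, m, v
{code}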