[
https://issues.apache.org/jira/browse/SPARK-18023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon resolved SPARK-18023.
----------------------------------
Resolution: Incomplete
> Adam optimizer
> --------------
>
> Key: SPARK-18023
> URL: https://issues.apache.org/jira/browse/SPARK-18023
> Project: Spark
> Issue Type: New Feature
> Components: ML, MLlib
> Reporter: Vincent
> Priority: Minor
> Labels: bulk-closed
>
> SGD methods can converge (or diverge) incredibly slowly when the learning rate
> alpha is set inappropriately. Many alternative methods have been proposed to
> produce reliable convergence with less dependence on hyperparameter settings
> and to help escape poor local optima, e.g. Momentum, NAG (Nesterov's
> Accelerated Gradient), Adagrad, RMSProp, etc.
> Among these, Adam is one of the most popular; it is a first-order,
> gradient-based optimizer for stochastic objective functions. It has been shown
> to be well suited to problems with large data and/or many parameters and to
> problems with noisy and/or sparse gradients, and it is computationally
> efficient (a minimal sketch of the update rule follows the quoted description
> below). Refer to the paper for details: https://arxiv.org/pdf/1412.6980v8.pdf
> In fact, TensorFlow already implements most of the adaptive optimization
> methods mentioned above, and we have seen Adam outperform plain SGD in certain
> cases, such as training an FM model on a very sparse dataset.
> It would be nice for Spark to offer these adaptive optimization methods.
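
For reference, here is a minimal, self-contained sketch of the Adam update rule
from the cited paper (Kingma & Ba, 2014), applied to a toy quadratic objective.
The object name AdamSketch, the toy objective, and the step size used here are
illustrative assumptions; this is not an existing or proposed Spark/MLlib API.

// Adam update sketch, following the notation of Kingma & Ba (2014).
// Illustrative only; not an existing or proposed Spark API.
object AdamSketch {
  def main(args: Array[String]): Unit = {
    val alpha = 0.1     // step size (the paper's default is 0.001; larger here for the toy problem)
    val beta1 = 0.9     // decay rate for the first moment estimate
    val beta2 = 0.999   // decay rate for the second raw moment estimate
    val eps   = 1e-8    // small constant for numerical stability

    // Toy objective f(w) = 0.5 * ||w||^2, whose gradient is simply w.
    val w = Array(5.0, -3.0)
    val m = Array.fill(w.length)(0.0)   // first moment (mean of gradients)
    val v = Array.fill(w.length)(0.0)   // second raw moment (uncentered variance)

    for (t <- 1 to 1000) {
      val g = w.clone()                 // gradient of the toy objective at w
      for (i <- w.indices) {
        m(i) = beta1 * m(i) + (1 - beta1) * g(i)
        v(i) = beta2 * v(i) + (1 - beta2) * g(i) * g(i)
        val mHat = m(i) / (1 - math.pow(beta1, t))  // bias-corrected first moment
        val vHat = v(i) / (1 - math.pow(beta2, t))  // bias-corrected second moment
        w(i) -= alpha * mHat / (math.sqrt(vHat) + eps)
      }
    }
    println(w.mkString(", "))           // the iterate approaches the minimizer (0.0, 0.0)
  }
}

Driven by minibatch gradients instead of the toy gradient above, this
per-coordinate update is roughly what an MLlib-side implementation of this
request would compute at each step.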