Hi devs,
    Adaptive learning rate methods typically converge faster than standard
SGD, so why don't we implement some of them (e.g. AdaDelta, Adam)?
See this link for more details:
http://sebastianruder.com/optimizing-gradient-descent/index.html#adadelta
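For illustration, here is a minimal single-machine sketch of the Adam update
rule (Kingma & Ba, 2014) in Scala. The names (AdamSketch, AdamState, step) are
hypothetical and this is not a proposed MLlib API, just the math spelled out:

// Minimal sketch of the Adam update rule; not tied to any Spark API.
object AdamSketch {
  // Per-coordinate optimizer state: first/second moment estimates and timestep.
  final case class AdamState(m: Array[Double], v: Array[Double], var t: Int)

  def init(dim: Int): AdamState =
    AdamState(Array.fill(dim)(0.0), Array.fill(dim)(0.0), 0)

  // One Adam step: updates `weights` in place given the current gradient.
  def step(
      weights: Array[Double],
      grad: Array[Double],
      state: AdamState,
      alpha: Double = 0.001,  // learning rate
      beta1: Double = 0.9,    // decay rate for first moment
      beta2: Double = 0.999,  // decay rate for second moment
      eps: Double = 1e-8): Unit = {
    state.t += 1
    val t = state.t
    var i = 0
    while (i < weights.length) {
      val g = grad(i)
      state.m(i) = beta1 * state.m(i) + (1 - beta1) * g
      state.v(i) = beta2 * state.v(i) + (1 - beta2) * g * g
      // Bias-corrected moment estimates.
      val mHat = state.m(i) / (1 - math.pow(beta1, t))
      val vHat = state.v(i) / (1 - math.pow(beta2, t))
      weights(i) -= alpha * mHat / (math.sqrt(vHat) + eps)
      i += 1
    }
  }

  def main(args: Array[String]): Unit = {
    // Toy usage: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
    val w = Array(0.0)
    val st = init(1)
    for (_ <- 1 to 5000) step(w, Array(2 * (w(0) - 3)), st, alpha = 0.05)
    println(f"w = ${w(0)}%.4f") // converges toward 3.0
  }
}

One design wrinkle worth noting: Adam and AdaDelta carry per-coordinate state
(the moment estimates) across iterations, so wiring them into MLlib would
presumably need more than the current stateless Updater.compute hook; that is
an assumption worth verifying against the optimization code.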



