Hi devs, adaptive learning rate methods (Adadelta, for example) usually converge faster than standard SGD, so why don't we implement them? See this link for more details: http://sebastianruder.com/optimizing-gradient-descent/index.html#adadelta
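To make the question concrete, here is a minimal sketch of the Adadelta update rule in plain Python. It is only an illustration of the per-weight adaptive step, not a proposal for how it would look inside Spark: the names `adadelta_step`, `run_adadelta`, `loss` and `grad` are hypothetical, the toy squared-error objective is made up for the example, and the defaults `rho=0.95` and `eps=1e-6` are assumed values.

```python
import math


def loss(w, target):
    """Toy objective (assumed for illustration): squared distance to target."""
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def grad(w, target):
    """Gradient of the toy objective with respect to each weight."""
    return [2.0 * (wi - ti) for wi, ti in zip(w, target)]


def adadelta_step(w, g, state, rho=0.95, eps=1e-6):
    """Apply one Adadelta update and return the new weights.

    state holds two running averages per weight:
      eg2  - decayed average of squared gradients
      edx2 - decayed average of squared updates
    No global learning rate is needed; the step size for each weight
    is the ratio of the two root-mean-square values.
    """
    new_w = []
    for i, (wi, gi) in enumerate(zip(w, g)):
        # Accumulate the squared gradient.
        state["eg2"][i] = rho * state["eg2"][i] + (1.0 - rho) * gi * gi
        # Scale the gradient by RMS(previous updates) / RMS(gradients).
        rms_dx = math.sqrt(state["edx2"][i] + eps)
        rms_g = math.sqrt(state["eg2"][i] + eps)
        delta = -(rms_dx / rms_g) * gi
        # Accumulate the squared update for the next step.
        state["edx2"][i] = rho * state["edx2"][i] + (1.0 - rho) * delta * delta
        new_w.append(wi + delta)
    return new_w


def run_adadelta(w0, target, steps):
    """Run a fixed number of Adadelta steps on the toy objective."""
    state = {"eg2": [0.0] * len(w0), "edx2": [0.0] * len(w0)}
    w = list(w0)
    for _ in range(steps):
        w = adadelta_step(w, grad(w, target), state)
    return w


if __name__ == "__main__":
    start, goal = [5.0, -3.0], [1.0, 2.0]
    end = run_adadelta(start, goal, steps=200)
    print("loss before:", loss(start, goal))
    print("loss after: ", loss(end, goal))
```

Plain SGD would instead use `delta = -learning_rate * gi` with one hand-tuned `learning_rate` shared by all weights. Adadelta replaces that constant with a per-weight ratio that adapts as training proceeds.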
--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Why-don-t-we-imp-some-adaptive-learning-rate-methods-such-as-adadelat-adam-tp20057.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.