If possible, I would like to have both. L-BFGS typically converges in fewer iterations than SGD, but each iteration goes through the entire data set. SGD, in contrast, uses only a mini-batch of the training data to calculate the gradient and update the weights. Hence, for large data sets SGD can be more practical than L-BFGS.
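
To make the per-update cost difference concrete, here is a rough, self-contained sketch in plain Python. It is not the Spark MLlib implementation; the toy data set, the batch size of 100, and all function names (`make_data`, `gradient`, `rows_touched_per_update`) are made up for illustration. It only shows that a full-batch gradient (what an L-BFGS step needs at least once) reads every row, while a mini-batch gradient reads a small sample.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def gradient(w, rows):
    """Average logistic-loss gradient over rows, a list of (features, label)."""
    g = [0.0] * len(w)
    for x, y in rows:
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
        for j, xj in enumerate(x):
            g[j] += err * xj
    return [gj / len(rows) for gj in g]

def make_data(n, seed=0):
    # Hypothetical toy data: a bias term plus one feature; label is 1 if feature > 0.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        v = rng.uniform(-1.0, 1.0)
        data.append(([1.0, v], 1.0 if v > 0 else 0.0))
    return data

def rows_touched_per_update(n_rows, batch_size=None):
    # A full-batch gradient reads every row; a mini-batch gradient reads
    # only batch_size rows (assumed sampling scheme, for illustration).
    return n_rows if batch_size is None else min(batch_size, n_rows)

data = make_data(10_000)
w = [0.0, 0.0]
rng = random.Random(1)

full_grad = gradient(w, data)                   # reads all 10,000 rows
mini_grad = gradient(w, rng.sample(data, 100))  # reads only 100 rows

print("full batch :", rows_touched_per_update(len(data)), "rows,", full_grad)
print("mini-batch :", rows_touched_per_update(len(data), 100), "rows,", mini_grad)
```

The mini-batch gradient is a noisier estimate of the full one, which is why SGD usually needs more updates, but each update is far cheaper when the data set is large.
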
I think we can test this scenario by running these two algorithms against a large data set (~1 GB).

Thanks,
Upul

On Sun, May 31, 2015 at 8:02 PM, Nirmal Fernando <[email protected]> wrote:

> One other benefit of switching is, this API supports multi-class
> classification too. I've tested this API with Iris dataset.
>
> On Sun, May 31, 2015 at 7:33 PM, Nirmal Fernando <[email protected]> wrote:
>
>> Hi,
>>
>> Currently in ML, we use mini-batch gradient descent algorithm when
>> running logistic regression. But Spark-mllib recommends L-BFGS over
>> mini-batch gradient descent for faster convergence [1].
>>
>> I tested both the implementation with the same dataset and gained an
>> improved accuracy in L-BFGS (80% vs 67% for SGD).
>>
>> Shall we switch?
>>
>> [1]
>> https://spark.apache.org/docs/latest/mllib-linear-methods.html#logistic-regression
>>
>> --
>>
>> Thanks & regards,
>> Nirmal
>>
>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>> Mobile: +94715779733
>> Blog: http://nirmalfdo.blogspot.com/
>
> --
>
> Thanks & regards,
> Nirmal
>
> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
> Mobile: +94715779733
> Blog: http://nirmalfdo.blogspot.com/

--
Upul Bandara,
Associate Technical Lead, WSO2, Inc.,
Mob: +94 715 468 345.
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
