I found a comparison between the Mallet and Fortran versions. The results are close, but not the same.
http://t3827.ai-mallet-development.aitalk.info/help-with-l-bfgs-t3827.html

Here is LBFGS-B (the Fortran version):
Cost: 0.6902411220175793
Gradient: -5.453609E-007, -2.858372E-008, -1.369706E-007
Theta: -0.014186210102171406, -0.303521206706629, -0.018132348904129902

And Mallet LBFGS (tolerance 1e-15):
Cost: 0.6902412268833071
Gradient: 0.000117, -4.615523E-005, 0.000114
Theta: -0.013914961040040107, -0.30419883021414335, -0.016838481937958744

So this shows me that Mallet is close, but plain old gradient descent and
LBFGS-B are really close to each other. I see that Mallet also has a
"LineOptimizer" and "Evaluator" that I have yet to explore...

Sincerely,

DB Tsai
Machine Learning Engineer
Alpine Data Labs
--------------------------------------
Web: http://alpinenow.com/


On Tue, Feb 25, 2014 at 11:16 AM, DB Tsai <dbt...@alpinenow.com> wrote:
> Hi Deb,
>
> On Tue, Feb 25, 2014 at 7:07 AM, Debasish Das <debasish.da...@gmail.com>
> wrote:
>> Continuing from the last email, which was sent by mistake:
>>
>> Is the CPL license compatible with Apache?
>>
>> http://opensource.org/licenses/cpl1.0.php
>
> Based on what I read here, there is no problem including CPL code in an
> Apache project as long as the code isn't modified and we include the
> Maven binary:
> https://www.apache.org/legal/3party.html
>
>> Mallet jars are available on Maven. They have Hessian-based solvers
>> which looked interesting, along with BFGS and CG.
>
> We found that Hessian-based solvers don't scale as the number of features
> grows, and we have lots of customers trying to train on sparse input.
> That's our motivation to work on L-BFGS, which approximates the Hessian
> using just a few vectors.
>
> I just took a look at MALLET, and it does have L-BFGS and its variant
> OWL-QN, which can tackle the L1 problem. Since implementing L-BFGS is
> very subtle, I don't know the quality of the MALLET implementation.
> Personally, I implemented one based on a textbook, and it was not very
> stable. If MALLET is robust, I'll go for it, since it has more features
> and is already in Maven.
>
>> Note that right now the version is not BLAS-optimized. With the jblas or
>> netlib-java discussions that are going on, it can be improved. Also, it
>> runs on a single thread, which can be improved... so there is scope for
>> further improvements in the code.
>
> I think it will not impact performance even if it's not BLAS-optimized or
> multi-threaded, since most of the parallelization is in computing
> gradientSum and lossSum in Spark, and the optimizer just takes
> gradientSum, lossSum, and the weights to get the new weights.
>
> As a result, 99.9% of the time is spent computing gradientSum and
> lossSum; only a small amount of time is spent in the optimizer itself.
>
>> Basically, Xiangrui, is there any pushback on making optimizers part of
>> Spark MLlib? I am exploring CG and QP solvers for Spark MLlib as well,
>> and I am developing them as part of mllib.optimization. I was hoping we
>> would be able to publish MLlib as a Maven artifact later.
>>
>> Thanks.
>> Deb
>
> Thanks.
>
> Sincerely,
>
> DB Tsai
> Machine Learning Engineer
> Alpine Data Labs
> --------------------------------------
> Web: http://alpinenow.com/
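P.S. For anyone who wants to reproduce the Mallet numbers above: here is a
minimal sketch of driving Mallet's L-BFGS through cc.mallet.optimize,
assuming the LimitedMemoryBFGS optimizer, the Optimizable.ByGradientValue
callback interface, and its setTolerance knob. The toy objective and all
class names are made up for illustration, not taken from Mallet itself; note
that Mallet's optimizers maximize, so a cost function has to be negated.

import cc.mallet.optimize.{LimitedMemoryBFGS, Optimizable}

// Toy objective f(theta) = -||theta - c||^2, maximized at theta = c.
// (Mallet maximizes getValue, so a cost must be returned negated.)
class QuadraticObjective(init: Array[Double], c: Array[Double])
    extends Optimizable.ByGradientValue {
  private val params = init.clone()

  def getNumParameters: Int = params.length
  def getParameters(buffer: Array[Double]): Unit =
    Array.copy(params, 0, buffer, 0, params.length)
  def getParameter(i: Int): Double = params(i)
  def setParameters(p: Array[Double]): Unit =
    Array.copy(p, 0, params, 0, params.length)
  def setParameter(i: Int, v: Double): Unit = params(i) = v

  def getValue: Double =
    -params.zip(c).map { case (x, ci) => (x - ci) * (x - ci) }.sum
  def getValueGradient(buffer: Array[Double]): Unit =
    for (i <- params.indices) buffer(i) = -2.0 * (params(i) - c(i))
}

object MalletLbfgsDemo {
  def main(args: Array[String]): Unit = {
    val objective = new QuadraticObjective(Array(0.0, 0.0), Array(1.0, -3.0))
    val lbfgs = new LimitedMemoryBFGS(objective)
    lbfgs.setTolerance(1e-15) // the tight tolerance used in the run above
    lbfgs.optimize()          // Mallet may throw OptimizationException near
                              // convergence; callers often catch and ignore it
    println(s"theta = ${objective.getParameter(0)}, ${objective.getParameter(1)}")
  }
}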
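On "L-BFGS approximates the Hessian using just a few vectors": concretely,
the standard two-loop recursion (Nocedal & Wright, Algorithm 7.4) computes
the quasi-Newton direction from only the last m update pairs, never forming
an n-by-n matrix. A self-contained sketch; the names are mine, not from any
particular implementation:

// Given the gradient g_k and the last m curvature pairs
// s_i = x_{i+1} - x_i, y_i = g_{i+1} - g_i (newest pair last),
// return r = H_k * g_k, where H_k is the implicit inverse-Hessian
// approximation. The descent direction for minimization is -r.
def twoLoopDirection(
    grad: Array[Double],
    s: IndexedSeq[Array[Double]],
    y: IndexedSeq[Array[Double]]
): Array[Double] = {
  def dot(a: Array[Double], b: Array[Double]): Double =
    (a, b).zipped.map(_ * _).sum

  val q = grad.clone()
  val rho = (s, y).zipped.map((si, yi) => 1.0 / dot(yi, si))
  val alpha = new Array[Double](s.length)

  // First loop: newest to oldest, peeling off each correction.
  for (i <- s.indices.reverse) {
    alpha(i) = rho(i) * dot(s(i), q)
    for (j <- q.indices) q(j) -= alpha(i) * y(i)(j)
  }

  // Scale by gamma = (s_k . y_k) / (y_k . y_k), the usual initial
  // inverse-Hessian guess H_0 = gamma * I.
  val gamma =
    if (s.nonEmpty) dot(s.last, y.last) / dot(y.last, y.last) else 1.0
  val r = q.map(_ * gamma)

  // Second loop: oldest to newest, reapplying the corrections.
  for (i <- s.indices) {
    val beta = rho(i) * dot(y(i), r)
    for (j <- r.indices) r(j) += (alpha(i) - beta) * s(i)(j)
  }
  r
}

Storage and per-iteration cost are O(m * n) with m typically 5-10, which is
why it scales to high-dimensional sparse problems where a Hessian-based
solver's O(n^2) does not.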
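And on the claim that 99.9% of the time goes into gradientSum and lossSum: a
minimal sketch of that division of labor, under assumed names (Example and
computeGradientAndLoss are hypothetical; the real MLlib code is organized
differently). Spark aggregates the per-example gradients and losses across
the cluster, and whatever the optimizer does afterwards on the driver only
touches vectors of the weight dimension, which is why a single-threaded,
non-BLAS optimizer step is not the bottleneck.

import org.apache.spark.rdd.RDD

case class Example(label: Double, features: Array[Double])

def computeGradientAndLoss(
    data: RDD[Example],
    weights: Array[Double],
    // gradient(w, ex) returns (per-example gradient, per-example loss)
    gradient: (Array[Double], Example) => (Array[Double], Double)
): (Array[Double], Double) = {
  val n = weights.length
  val (gradientSum, lossSum) = data.aggregate((new Array[Double](n), 0.0))(
    // seqOp: accumulate into the per-partition buffer in place.
    seqOp = { case ((g, loss), ex) =>
      val (gi, li) = gradient(weights, ex)
      for (j <- 0 until n) g(j) += gi(j)
      (g, loss + li)
    },
    // combOp: merge the per-partition sums.
    combOp = { case ((g1, l1), (g2, l2)) =>
      for (j <- 0 until n) g1(j) += g2(j)
      (g1, l1 + l2)
    }
  )
  // The optimizer on the driver only ever sees these plus the weights.
  (gradientSum, lossSum)
}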