Hi DB,

I am considering building on your PR and adding Mallet as a dependency so
that we can run some basic comparison tests on the large-scale sparse
datasets that I have.

In the meantime, let's discuss whether there are other optimization
packages that we should try.

My wishlist has bounded L-BFGS (LBFGS-B) as well, and I will add it to the PR.

As for getting the PR merged into MLlib, we can plan that later.

Thanks.
Deb



On Tue, Feb 25, 2014 at 11:36 AM, DB Tsai <dbt...@alpinenow.com> wrote:

> I found some comparisons between Mallet and the Fortran version. The
> results are close but not the same.
>
> http://t3827.ai-mallet-development.aitalk.info/help-with-l-bfgs-t3827.html
>
> Here is LBFGS-B
> Cost: 0.6902411220175793
> Gradient: -5.453609E-007, -2.858372E-008, -1.369706E-007
> Theta: -0.014186210102171406, -0.303521206706629, -0.018132348904129902
>
> And Mallet LBFGS (Tolerance .000000000000001)
> Cost: 0.6902412268833071
> Gradient: 0.000117, -4.615523E-005, 0.000114
> Theta: -0.013914961040040107, -0.30419883021414335, -0.016838481937958744
>
> So this shows me that Mallet is close, but plain old gradient descent
> and LBFGS-B are really close to each other.
> I see that Mallet also has a "LineOptimizer" and "Evaluator" that I
> have yet to explore...
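>
> For reference, here is roughly how I am driving Mallet's L-BFGS from
> Scala: a minimal sketch with a toy quadratic standing in for the real
> loss/gradient, and assuming setTolerance is the right knob for the
> tolerance above.
>
>   import cc.mallet.optimize.{LimitedMemoryBFGS, Optimizable}
>
>   // Toy objective f(theta) = -0.5 * ||theta - target||^2; Mallet
>   // maximizes, so the cost is negated. Swap in the real loss/gradient.
>   class ToyObjective(target: Array[Double])
>       extends Optimizable.ByGradientValue {
>     private val params = new Array[Double](target.length)
>     def getNumParameters(): Int = params.length
>     def getParameters(buf: Array[Double]): Unit =
>       Array.copy(params, 0, buf, 0, params.length)
>     def getParameter(i: Int): Double = params(i)
>     def setParameters(buf: Array[Double]): Unit =
>       Array.copy(buf, 0, params, 0, params.length)
>     def setParameter(i: Int, v: Double): Unit = params(i) = v
>     def getValue(): Double =
>       -0.5 * params.zip(target).map { case (p, t) => (p - t) * (p - t) }.sum
>     def getValueGradient(buf: Array[Double]): Unit =
>       for (i <- params.indices) buf(i) = target(i) - params(i)
>   }
>
>   val objective = new ToyObjective(Array(1.0, 2.0, 3.0))
>   val optimizer = new LimitedMemoryBFGS(objective)
>   optimizer.setTolerance(0.000000000000001)
>   optimizer.optimize()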
>
> Sincerely,
>
> DB Tsai
> Machine Learning Engineer
> Alpine Data Labs
> --------------------------------------
> Web: http://alpinenow.com/
>
>
> On Tue, Feb 25, 2014 at 11:16 AM, DB Tsai <dbt...@alpinenow.com> wrote:
> > Hi Deb,
> >
> > On Tue, Feb 25, 2014 at 7:07 AM, Debasish Das <debasish.da...@gmail.com>
> > wrote:
> >> Continuing from the last email, which was sent by mistake:
> >>
> >> Is the CPL license compatible with Apache?
> >>
> >> http://opensource.org/licenses/cpl1.0.php
> >
> > Based on what I read here, there is no problem including CPL code in an
> > Apache project as long as the code isn't modified and we include the
> > Maven binary:
> > https://www.apache.org/legal/3party.html
> >
> >> Mallet jars are available on Maven. They have Hessian-based solvers,
> >> which looked interesting, along with BFGS and CG.
> >
> > We found that Hessian-based solvers don't scale as the number of
> > features grows, and we have lots of customers trying to train on sparse
> > input. That's our motivation to work on L-BFGS, which approximates the
> > Hessian using just a few vectors.
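> >
> > The trick is the standard two-loop recursion: keep only the last m
> > curvature pairs (s_i, y_i) and rebuild the search direction from them,
> > never forming the n x n Hessian. A minimal sketch:
> >
> >   // L-BFGS two-loop recursion: computes H*g from the last m pairs
> >   // s_i = x_{i+1} - x_i and y_i = grad_{i+1} - grad_i (oldest first),
> >   // without ever materializing the Hessian approximation H.
> >   def twoLoop(grad: Array[Double],
> >               s: Seq[Array[Double]],
> >               y: Seq[Array[Double]]): Array[Double] = {
> >     def dot(a: Array[Double], b: Array[Double]) =
> >       a.zip(b).map { case (u, v) => u * v }.sum
> >     val q = grad.clone()
> >     val rho = s.zip(y).map { case (si, yi) => 1.0 / dot(yi, si) }
> >     val alpha = new Array[Double](s.length)
> >     for (i <- s.indices.reverse) {        // newest to oldest
> >       alpha(i) = rho(i) * dot(s(i), q)
> >       for (j <- q.indices) q(j) -= alpha(i) * y(i)(j)
> >     }
> >     // Scale by gamma = (s_k . y_k) / (y_k . y_k) as the initial H_0.
> >     val gamma =
> >       if (s.nonEmpty) dot(s.last, y.last) / dot(y.last, y.last) else 1.0
> >     val r = q.map(_ * gamma)
> >     for (i <- s.indices) {                // oldest to newest
> >       val beta = rho(i) * dot(y(i), r)
> >       for (j <- r.indices) r(j) += (alpha(i) - beta) * s(i)(j)
> >     }
> >     r                                     // step direction is -r
> >   }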
> >
> > I just took a look at MALLET, and it does have L-BFGS and its variant
> > OWL-QN, which can tackle L1-regularized problems. Since implementing
> > L-BFGS is very subtle, I don't know the quality of the MALLET
> > implementation. I personally implemented one based on the textbook, and
> > it was not very stable. If MALLET's is robust, I'll go for it, since it
> > has more features and is already in Maven.
> >
> >> Note that right now the version is not BLAS-optimized. Given the jblas
> >> and netlib-java discussions that are going on, it can be improved.
> >> Also, it runs on a single thread, which can be improved... so there is
> >> scope for further improvements in the code.
> >
> > I think it will not impact performance even if it's not BLAS-optimized
> > or multi-threaded, since most of the parallelization is in computing
> > gradientSum and lossSum in Spark, and the optimizer just takes
> > gradientSum, lossSum, and weights to produce the newWeights.
> >
> > As a result, 99.9% of the time is spent computing gradientSum and
> > lossSum; only a small amount of time is spent in the optimizer itself.
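> >
> > Concretely, the split looks something like this (a rough sketch of the
> > pattern, with computeGradient as a stand-in for the PR's gradient code):
> >
> >   import org.apache.spark.rdd.RDD
> >
> >   // The expensive, parallel part: one pass over the data per
> >   // iteration. Each record contributes a (gradient, loss) pair, and
> >   // the driver only ever sees the reduced (gradientSum, lossSum).
> >   def computeSums(
> >       data: RDD[(Double, Array[Double])],   // (label, features)
> >       weights: Array[Double],
> >       computeGradient:
> >         (Array[Double], Double, Array[Double]) => (Array[Double], Double)
> >   ): (Array[Double], Double) =
> >     data.map { case (label, features) =>
> >       computeGradient(features, label, weights)
> >     }.reduce { case ((g1, l1), (g2, l2)) =>
> >       (g1.zip(g2).map { case (a, b) => a + b }, l1 + l2)
> >     }
> >
> > The cheap, serial part is then a single driver-side call that maps
> > (weights, gradientSum, lossSum) to newWeights.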
> >
> >>
> >> Basically, Xiangrui, is there any pushback on making optimizers part
> >> of Spark MLlib? I am exploring CG and QP solvers for Spark MLlib as
> >> well, and I am developing these as part of the MLlib optimization
> >> package. I was hoping we would be able to publish MLlib as a Maven
> >> artifact later.
> >>
> >> Thanks.
> >> Deb
> >
> > Thanks.
> >
> > Sincerely,
> >
> > DB Tsai
> > Machine Learning Engineer
> > Alpine Data Labs
> > --------------------------------------
> > Web: http://alpinenow.com/
>
