I found a comparison between the Mallet version and the Fortran version. The
results are close but not the same.

http://t3827.ai-mallet-development.aitalk.info/help-with-l-bfgs-t3827.html

Here is LBFGS-B (the Fortran version):
Cost: 0.6902411220175793
Gradient: -5.453609E-007, -2.858372E-008, -1.369706E-007
Theta: -0.014186210102171406, -0.303521206706629, -0.018132348904129902

And Mallet LBFGS (tolerance .000000000000001):
Cost: 0.6902412268833071
Gradient: 0.000117, -4.615523E-005, 0.000114
Theta: -0.013914961040040107, -0.30419883021414335, -0.016838481937958744

So this shows me that Mallet gets close, but plain old gradient descent and
LBFGS-B agree much more closely with each other.
I see that Mallet also has a "LineOptimizer" and an "Evaluator" that I have
yet to explore...
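
For anyone who wants to reproduce this, driving MALLET's L-BFGS looks roughly
like the sketch below (Scala against a toy quadratic cost; I'm assuming
LimitedMemoryBFGS.setTolerance is the knob behind the tolerance quoted above,
and as far as I can tell MALLET's optimizers maximize, so the cost is negated):

import cc.mallet.optimize.{LimitedMemoryBFGS, Optimizable}

// Toy convex cost wrapped in MALLET's Optimizable.ByGradientValue interface.
// The cost and its gradient are negated because the optimizer maximizes getValue.
class NegatedQuadratic(init: Array[Double], target: Array[Double])
    extends Optimizable.ByGradientValue {
  private val theta = init.clone()
  def getNumParameters: Int = theta.length
  def getParameters(buf: Array[Double]): Unit =
    Array.copy(theta, 0, buf, 0, theta.length)
  def getParameter(i: Int): Double = theta(i)
  def setParameters(p: Array[Double]): Unit =
    Array.copy(p, 0, theta, 0, theta.length)
  def setParameter(i: Int, v: Double): Unit = theta(i) = v
  def getValue: Double =                              // negated cost
    -theta.zip(target).map { case (t, c) => (t - c) * (t - c) }.sum
  def getValueGradient(buf: Array[Double]): Unit =    // negated gradient
    for (i <- theta.indices) buf(i) = -2.0 * (theta(i) - target(i))
}

val optimizable = new NegatedQuadratic(Array(0.0, 0.0, 0.0), Array(1.0, -2.0, 3.0))
val lbfgs = new LimitedMemoryBFGS(optimizable)
lbfgs.setTolerance(1e-15)    // the tight tolerance mentioned above
lbfgs.optimize()             // may throw once it can no longer make progress
val theta = new Array[Double](3)
optimizable.getParameters(theta)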

Sincerely,

DB Tsai
Machine Learning Engineer
Alpine Data Labs
--------------------------------------
Web: http://alpinenow.com/


On Tue, Feb 25, 2014 at 11:16 AM, DB Tsai <dbt...@alpinenow.com> wrote:
> Hi Deb,
>
> On Tue, Feb 25, 2014 at 7:07 AM, Debasish Das <debasish.da...@gmail.com> 
> wrote:
>> Continuing from the last email, which was sent by mistake:
>>
>> Is the CPL license compatible with Apache?
>>
>> http://opensource.org/licenses/cpl1.0.php
>
> Based on what I read here, there is no problem including CPL code in an Apache
> project as long as the code isn't modified and we include the Maven binary:
> https://www.apache.org/legal/3party.html
>
>> Mallet jars are available on Maven. They have Hessian-based solvers that looked
>> interesting, along with BFGS and CG.
>
> We found that Hessian-based solvers don't scale as the number of features grows,
> and we have lots of customers trying to train on sparse input. That's our
> motivation to work on L-BFGS, which approximates the Hessian using just a few
> vectors.
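>
> For the curious, the "few vectors" part is just the standard two-loop recursion
> over the last m correction pairs s_i = x_{i+1} - x_i and y_i = grad_{i+1} - grad_i.
> A rough textbook sketch (generic, not MALLET's or the Fortran implementation):
>
> // Computes H*g, the approximate inverse Hessian applied to the gradient,
> // from the last m curvature pairs, without ever forming the Hessian.
> def twoLoop(grad: Array[Double],
>             s: Seq[Array[Double]],        // most recent pair last
>             y: Seq[Array[Double]]): Array[Double] = {
>   def dot(a: Array[Double], b: Array[Double]) =
>     a.zip(b).map { case (u, v) => u * v }.sum
>   val q = grad.clone()
>   val rho = s.zip(y).map { case (si, yi) => 1.0 / dot(yi, si) }
>   val alpha = new Array[Double](s.length)
>   for (i <- s.indices.reverse) {          // first loop: newest to oldest
>     alpha(i) = rho(i) * dot(s(i), q)
>     for (j <- q.indices) q(j) -= alpha(i) * y(i)(j)
>   }
>   val gamma =                             // initial Hessian scaling
>     if (s.nonEmpty) dot(s.last, y.last) / dot(y.last, y.last) else 1.0
>   val r = q.map(_ * gamma)
>   for (i <- s.indices) {                  // second loop: oldest to newest
>     val beta = rho(i) * dot(y(i), r)
>     for (j <- r.indices) r(j) += (alpha(i) - beta) * s(i)(j)
>   }
>   r                                       // the search direction is -r
> }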
>
> I just took a look at MALLET, and it does have L-BFGS and its variant OWL-QN,
> which can tackle the L1 problem. Since implementing L-BFGS is very subtle, I
> don't know the quality of the MALLET implementation. Personally, I implemented
> one based on the textbook, and it was not very stable. If MALLET is robust,
> I'll go for it since it has more features and is already in Maven.
>
>> Note that right now the version is not BLAS-optimized. With the jblas and
>> netlib-java discussions that are going on, it can be improved. Also, it runs on
>> a single thread, which could be improved... so there is scope for further
>> improvements in the code.
>
> I think it will not impact performance even if it's not BLAS-optimized or
> multi-threaded, since most of the parallelization is in computing gradientSum
> and lossSum in Spark, and the optimizer just takes gradientSum, lossSum, and
> the weights to get the newWeights.
>
> As a result, 99.9% of the time is spent computing gradientSum and lossSum; only
> a small amount of time is spent in the optimizer.
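>
> To make that concrete, one iteration looks roughly like the sketch below, with a
> squared loss and a plain gradient step standing in for the real objective and for
> the L-BFGS update (the names are illustrative, not the actual MLlib code):
>
> import org.apache.spark.rdd.RDD
>
> // The expensive part is the distributed aggregation of gradientSum and lossSum
> // over all examples; the update on the driver only touches a few dense vectors.
> def step(data: RDD[(Double, Array[Double])],    // (label, features)
>          weights: Array[Double],
>          stepSize: Double): (Array[Double], Double) = {
>   val n = weights.length
>   val (gradientSum, lossSum) = data.aggregate((new Array[Double](n), 0.0))(
>     seqOp = { case ((g, l), (label, x)) =>
>       val margin = (0 until n).map(j => weights(j) * x(j)).sum - label
>       for (j <- 0 until n) g(j) += margin * x(j)
>       (g, l + 0.5 * margin * margin)
>     },
>     combOp = { case ((g1, l1), (g2, l2)) =>
>       for (j <- 0 until n) g1(j) += g2(j)
>       (g1, l1 + l2)
>     })
>   // The optimizer only ever sees these sums; an L-BFGS update would plug in
>   // at exactly this point instead of the plain gradient step.
>   val newWeights =
>     weights.indices.map(j => weights(j) - stepSize * gradientSum(j)).toArray
>   (newWeights, lossSum)
> }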
>
>>
>> Basically, Xiangrui, is there any pushback on making optimizers part of Spark
>> MLlib? I am exploring CG and QP solvers for Spark MLlib as well, and I am
>> developing these as part of MLlib optimization. I was hoping we would be able
>> to publish MLlib as a Maven artifact later.
>
> Thanks.
>
> Sincerely,
>
> DB Tsai
> Machine Learning Engineer
> Alpine Data Labs
> --------------------------------------
> Web: http://alpinenow.com/
