I've done some comparisons with my own implementation of TRON on Spark.
From a distributed computing perspective, it does about 2x more local work
per iteration than L-BFGS, so its parallel isoefficiency is slightly better
(there is more computation to amortize each round of communication). I think
the truncated Newton solver holds some potential because there has been some
recent work on preconditioners:
http://dx.doi.org/10.1016/j.amc.2014.03.006
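To make the comparison concrete, here is a minimal sketch of one truncated Newton step for L2-regularized logistic regression (in Python/NumPy purely for illustration, not the LIBLINEAR/TRON code; the function names and the fixed CG budget are my own, and a real trust-region solver like TRON additionally safeguards the step length):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(w, X, y, lam):
    # Gradient of L2-regularized logistic loss; labels y are in {-1, +1}.
    z = y * (X @ w)
    return X.T @ (-y * sigmoid(-z)) + lam * w

def hess_vec(w, X, lam, v):
    # Hessian-vector product: X^T D X v + lam * v, with D = diag(s * (1 - s)).
    # This is the key primitive: the full Hessian is never materialized.
    s = sigmoid(X @ w)
    d = s * (1.0 - s)
    return X.T @ (d * (X @ v)) + lam * v

def truncated_newton_step(w, X, y, lam, cg_iters=10, tol=1e-8):
    # Approximately solve H d = -g by conjugate gradient (the "truncated" part),
    # then take the step. Each CG iteration costs one Hessian-vector product,
    # which is the extra local work per outer iteration relative to L-BFGS.
    g = grad(w, X, y, lam)
    d = np.zeros_like(w)
    r = -g.copy()          # residual of H d = -g at d = 0
    p = r.copy()
    rs = r @ r
    for _ in range(cg_iters):
        Hp = hess_vec(w, X, lam, p)
        alpha = rs / (p @ Hp)
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return w + d
```

In a distributed setting the gradient and each Hessian-vector product are map-reduce operations over data partitions, which is where the preconditioning work above could cut the number of CG rounds.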


On Wed, May 14, 2014 at 9:32 AM, Debasish Das <debasish.da...@gmail.com> wrote:

> Hi Professor Lin,
>
> On our internal datasets, I am getting accuracy on par with glmnet-R for
> sparse feature selection from liblinear. The default MLlib-based gradient
> descent was way off. I did not tune the learning rate, but I ran with
> varying lambda; the feature selection was weak.
>
> I used the liblinear code. Next I will explore distributed liblinear.
>
> Adding the code on github will definitely help for collaboration.
>
> I am experimenting with whether a BFGS/OWL-QN based sparse logistic
> regression in Spark MLlib gives us accuracy on par with liblinear.
>
> If the liblinear solver outperforms them (in either accuracy or
> performance), we should bring TRON to MLlib and let other algorithms
> benefit from it as well.
>
> We are using the BFGS and OWL-QN solvers from Breeze's optimize package.
>
> Thanks.
> Deb
>  On May 12, 2014 9:07 PM, "DB Tsai" <dbt...@stanford.edu> wrote:
>
>> It seems that the code isn't managed on GitHub. It can be downloaded from
>> http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/distributed-liblinear/spark/spark-liblinear-1.94.zip
>>
>> It would be easier to track changes on GitHub.
>>
>>
>>
>> Sincerely,
>>
>> DB Tsai
>> -------------------------------------------------------
>> My Blog: https://www.dbtsai.com
>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>
>>
>> On Mon, May 12, 2014 at 7:53 AM, Xiangrui Meng <men...@gmail.com> wrote:
>>
>>> Hi Chieh-Yen,
>>>
>>> Great to see the Spark implementation of LIBLINEAR! We will definitely
>>> consider adding a wrapper in MLlib to support it. Is the source code
>>> on github?
>>>
>>> Deb, Spark LIBLINEAR uses BSD license, which is compatible with Apache.
>>>
>>> Best,
>>> Xiangrui
>>>
>>> On Sun, May 11, 2014 at 10:29 AM, Debasish Das <debasish.da...@gmail.com>
>>> wrote:
>>> > Hello Prof. Lin,
>>> >
>>> > Awesome news! I am curious whether you have any benchmarks comparing
>>> > the C++ MPI and Scala Spark liblinear implementations...
>>> >
>>> > Is Spark Liblinear Apache licensed, or are there any specific
>>> > restrictions on using it?
>>> >
>>> > Except for native BLAS libraries (which each user has to manage by
>>> > pulling in their preferred proprietary BLAS package), all Spark code
>>> > is Apache licensed.
>>> >
>>> > Thanks.
>>> > Deb
>>> >
>>> >
>>> > On Sun, May 11, 2014 at 3:01 AM, DB Tsai <dbt...@stanford.edu> wrote:
>>> >>
>>> >> Dear Prof. Lin,
>>> >>
>>> >> Interesting! We have an implementation of L-BFGS in Spark that has
>>> >> already been merged upstream.
>>> >>
>>> >> We read your paper comparing TRON and OWL-QN for logistic regression
>>> >> with L1 regularization (http://www.csie.ntu.edu.tw/~cjlin/papers/l1.pdf),
>>> >> but it seems that comparison is not in the distributed setting.
>>> >>
>>> >> It will be very interesting to see L2 logistic regression benchmark
>>> >> results in Spark comparing your TRON optimizer and the L-BFGS
>>> >> optimizer on different datasets (sparse, dense, wide, etc.).
>>> >>
>>> >> I'll try your TRON out soon.
>>> >>
>>> >>
>>> >> Sincerely,
>>> >>
>>> >> DB Tsai
>>> >> -------------------------------------------------------
>>> >> My Blog: https://www.dbtsai.com
>>> >> LinkedIn: https://www.linkedin.com/in/dbtsai
>>> >>
>>> >>
>>> >> On Sun, May 11, 2014 at 1:49 AM, Chieh-Yen
>>> >> <r01944...@csie.ntu.edu.tw> wrote:
>>> >>>
>>> >>> Dear all,
>>> >>>
>>> >>> Recently we released a distributed extension of LIBLINEAR at
>>> >>>
>>> >>> http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/distributed-liblinear/
>>> >>>
>>> >>> Currently, TRON for logistic regression and L2-loss SVM is supported.
>>> >>> We provided both MPI and Spark implementations.
>>> >>> This is very preliminary so your comments are very welcome.
>>> >>>
>>> >>> Thanks,
>>> >>> Chieh-Yen
>>> >>
>>> >>
>>> >
>>>
>>
>>
