If it is possible, I would like to have both.

L-BFGS typically converges in fewer iterations than SGD, but it goes through
the entire data set before moving from one iteration to the next, whereas SGD
uses only a mini-batch of the training data to calculate each gradient update.
Hence, for large data sets SGD is more practical than L-BFGS.
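
To make the tradeoff concrete, here is a minimal NumPy/SciPy sketch (not the Spark MLlib implementation, and the synthetic data set is purely illustrative): L-BFGS evaluates the logistic loss and gradient over the full data set on every iteration, while mini-batch SGD touches only a small sample per update.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Hypothetical synthetic binary-classification data, standing in for a real set.
n, d = 1000, 5
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = (X @ true_w + 0.1 * rng.normal(size=n) > 0).astype(float)

def loss_and_grad(w, Xb, yb):
    """Logistic loss and its gradient over the given batch (Xb, yb)."""
    p = 1.0 / (1.0 + np.exp(-Xb @ w))
    loss = -np.mean(yb * np.log(p + 1e-12) + (1 - yb) * np.log(1 - p + 1e-12))
    grad = Xb.T @ (p - yb) / len(yb)
    return loss, grad

# L-BFGS: every gradient evaluation scans the FULL data set.
res = minimize(loss_and_grad, np.zeros(d), args=(X, y),
               jac=True, method="L-BFGS-B")

# Mini-batch SGD: each update uses only a small random batch.
w, lr, batch = np.zeros(d), 0.5, 50
for step in range(500):
    idx = rng.choice(n, batch, replace=False)
    _, g = loss_and_grad(w, X[idx], y[idx])
    w -= lr * g

lbfgs_loss, _ = loss_and_grad(res.x, X, y)
sgd_loss, _ = loss_and_grad(w, X, y)
print(f"L-BFGS full-data loss: {lbfgs_loss:.4f} after {res.nit} iterations")
print(f"SGD    full-data loss: {sgd_loss:.4f} after 500 mini-batch updates")
```

The point is the cost per step, not the final numbers: each L-BFGS iteration here costs O(n), while each SGD step costs O(batch), which is what makes SGD attractive when n is very large.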

I think we can test this scenario by running the two algorithms against a
large data set (~1 GB).

Thanks,
Upul

On Sun, May 31, 2015 at 8:02 PM, Nirmal Fernando <[email protected]> wrote:

> One other benefit of switching is that this API supports multi-class
> classification too. I've tested this API with the Iris dataset.
>
> On Sun, May 31, 2015 at 7:33 PM, Nirmal Fernando <[email protected]> wrote:
>
>> Hi,
>>
>> Currently in ML, we use the mini-batch gradient descent algorithm when
>> running logistic regression, but Spark MLlib recommends L-BFGS over
>> mini-batch gradient descent for faster convergence [1].
>>
>> I tested both implementations with the same dataset and observed improved
>> accuracy with L-BFGS (80% vs. 67% for SGD).
>>
>> Shall we switch?
>>
>> [1]
>> https://spark.apache.org/docs/latest/mllib-linear-methods.html#logistic-regression
>>
>>
>> --
>>
>> Thanks & regards,
>> Nirmal
>>
>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>> Mobile: +94715779733
>> Blog: http://nirmalfdo.blogspot.com/
>>
>>
>>
>
>
> --
>
> Thanks & regards,
> Nirmal
>
> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
> Mobile: +94715779733
> Blog: http://nirmalfdo.blogspot.com/
>
>
>


-- 
Upul Bandara,
Associate Technical Lead, WSO2, Inc.,
Mob: +94 715 468 345.
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
