That was my idea at first, but I was afraid of breaking things. As of today, lasso_path returns a list of ElasticNet objects; that would become a list of LinearModel objects, and it seemed like that would break backward compatibility.
If that's not an issue, I definitely think that wrapping everything in a LinearModel (or LassoLars) would make more sense.
--
Cp

On Fri, Feb 15, 2013 at 12:29 AM, Olivier Grisel <[email protected]> wrote:
> 2013/2/14 Charles-Pierre Astolfi <[email protected]>:
>> Hi everyone,
>>
>> A few months ago, I posted on this list about how
>> lars_path(method='lasso') is much faster than lasso_path when one
>> has to compute the lasso for many different regularization
>> parameters (alphas).
>>
>> The idea is that lars_path(method='lasso') returns all the kinks
>> (hitting points) of the lasso path, so we only have to do a linear
>> interpolation to find the coefs for any alpha, whereas lasso_path
>> actually runs coordinate descent for each such alpha.
>>
>> This gist shows a first, hackish implementation of the idea (lines 33-64):
>> https://gist.github.com/cpa/4956775
>>
>> The speedups speak for themselves (2000-fold when computing more
>> than 1800 different alphas!)
>>
>> Here are some plots:
>> http://imgur.com/B9EIVwq
>>
>> The first row is about the correctness of the new method compared
>> to the old one. Let x1, x2 be the old coefs and the new coefs
>> (computed by the method I propose). The first plot is a histogram
>> of log10(| |x1| - |x2| |) and the second is a boxplot of the same
>> quantity. For the sake of legibility, I've set log10(0) == -10.
>>
>> The second row is about speed: the first plot shows the runtimes of
>> lasso_path and the new code, and the second shows the achieved
>> speedup.
>>
>> I would like some help in order to create a meaningful pull request!
>>
>> * Now that I've computed the lasso coefs for each alpha, I need to
>> put them in an ElasticNet object (or a Lasso object, but lasso_path
>> uses ElasticNet, so I think it's better to keep that in order to
>> avoid breaking things). Is setting the values of coef_, intercept_,
>> alpha and l1_ratio enough?
>
> I don't really understand the goal of putting it back into a
> Lasso / ElasticNet class.
>
> This is a Lasso LARS solution (+ interpolated points); why do you
> want to make it look like a coordinate descent solution?
>
> Why not keep the LassoLars class, or even just wrap the results as
> LinearModel instances?
>
> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/least_angle.py#L586
> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/base.py#L119
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
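
Concretely, the trick looks something like this. This is a minimal sketch, not the gist code itself: it assumes scikit-learn's lars_path and numpy.interp, make_regression is used only to fabricate example data, and coefs_at is an invented helper name.

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import lars_path

    X, y = make_regression(n_samples=100, n_features=20, noise=1.0,
                           random_state=0)

    # Kinks of the lasso path: alphas is decreasing,
    # coefs has shape (n_features, n_kinks).
    alphas, _, coefs = lars_path(X, y, method='lasso')

    def coefs_at(alpha):
        # The lasso solution path is piecewise linear in alpha, so
        # linear interpolation between kinks recovers the solution.
        # np.interp needs increasing x, hence the reversals.
        return np.array([np.interp(alpha, alphas[::-1], c[::-1])
                         for c in coefs])

    # Coefficients for many alphas, without re-running a solver per alpha.
    grid = np.linspace(alphas[-1], alphas[0], 1800)
    path = np.array([coefs_at(a) for a in grid])

Since the path is piecewise linear between the kinks, the interpolated coefficients should agree with a per-alpha solve up to solver tolerance, which is what the histogram/boxplot comparison above is checking.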
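On the original question (is setting coef_, intercept_, alpha and l1_ratio enough?), a hypothetical sketch of what that wrapping could look like; wrap_as_enet is an invented helper, and ElasticNet is used purely as a container here, which is precisely the choice Olivier questions.

    import numpy as np
    from sklearn.linear_model import ElasticNet

    def wrap_as_enet(alpha, coef, X_mean, y_mean):
        # l1_ratio=1.0 makes the ElasticNet penalty a pure Lasso penalty.
        model = ElasticNet(alpha=alpha, l1_ratio=1.0)
        model.coef_ = coef
        # Intercept for a model fit with centering: y_mean - X_mean . coef.
        model.intercept_ = y_mean - np.dot(X_mean, coef)
        return model

Setting coef_ and intercept_ is enough to make predict() work; whether it is enough for full API compatibility is exactly the open question in the thread.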
