Hi everyone,

A few months ago, I posted on this list about how
lars_path(method='lasso') was much faster than lasso_path, when one
has to compute lasso for many different regularization parameters
(alphas).

Basically the idea is that lars_path(method='lasso') returns all the
kinks (hitting points) of the lasso path, and we simply have to do a
linear interpolation between them to find the coefs for any alpha,
whereas lasso_path actually runs a coordinate descent for each such alpha.
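
To make the interpolation step concrete, here is a minimal sketch of it.
It is not the code from the gist: the helper name interpolate_lasso_path
and the toy data are made up for illustration, and it assumes no intercept
and no normalization of X. It only relies on lars_path returning the kink
alphas (in decreasing order) and a coef array of shape (n_features, n_kinks).

```python
import numpy as np
from sklearn.linear_model import lars_path


def interpolate_lasso_path(kink_alphas, kink_coefs, query_alphas):
    """Linearly interpolate lasso coefs between the kinks of the path.

    kink_alphas: shape (n_kinks,), as returned by lars_path
    kink_coefs:  shape (n_features, n_kinks), as returned by lars_path
    Returns an array of shape (n_features, n_queries).
    (Hypothetical helper, not part of scikit-learn.)
    """
    kink_alphas = np.asarray(kink_alphas, dtype=float)
    kink_coefs = np.asarray(kink_coefs, dtype=float)
    query = np.atleast_1d(np.asarray(query_alphas, dtype=float))
    # lars_path returns alphas in decreasing order; np.interp needs
    # increasing x-coordinates, so sort the kinks first.
    order = np.argsort(kink_alphas)
    xp = kink_alphas[order]
    # np.interp clamps outside [xp[0], xp[-1]]. Above the largest kink the
    # coefs are all zero, so clamping gives the right answer there.
    return np.array([np.interp(query, xp, row[order]) for row in kink_coefs])


# Toy data, made up for illustration (no intercept, no normalization).
rng = np.random.RandomState(0)
X = rng.randn(60, 5)
y = X @ np.array([1.5, 0.0, -2.0, 0.0, 0.5]) + 0.1 * rng.randn(60)

# One LARS run gives every kink of the lasso path...
kink_alphas, _, kink_coefs = lars_path(X, y, method="lasso")

# ...and any number of alphas can then be served by interpolation alone.
query_alphas = np.logspace(-3, 0, 200)
coefs = interpolate_lasso_path(kink_alphas, kink_coefs, query_alphas)
print(coefs.shape)  # (n_features, n_alphas)
```

The cost of the LARS run is paid once, and each extra alpha only costs one
np.interp call per feature, which is where the speedup would come from.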

This gist shows a first hackish implementation of this idea (lines 33-64)
https://gist.github.com/cpa/4956775

Speedups speak for themselves (2000-fold when computing more than 1800
different alphas!)

Here are some plots:
http://imgur.com/B9EIVwq

First row is about correctness of the new method compared to the old one:
Let x1, x2 be the old coefs and the new coefs (by the method I propose).
The first plot is a histogram of log10(| |x1| - |x2| |) and the second
plot is a boxplot of log10(| |x1| - |x2| |). For the sake of
legibility, I've set log10(0) == -10.
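
For reference, the quantity plotted in the first row could be computed
along these lines. This is only a sketch: the function name log10_abs_diff
and the sample values are made up, and the -10 floor is the convention
described above.

```python
import numpy as np


def log10_abs_diff(x1, x2, floor=-10.0):
    """Return log10(| |x1| - |x2| |), with log10(0) replaced by `floor`."""
    diff = np.abs(np.abs(np.asarray(x1, dtype=float))
                  - np.abs(np.asarray(x2, dtype=float)))
    out = np.full(diff.shape, floor, dtype=float)
    nonzero = diff > 0
    # Only take the log where the difference is non-zero, to avoid -inf.
    out[nonzero] = np.log10(diff[nonzero])
    return out


# Made-up coefs: identical, differing by about 0.1, differing by about 0.001.
errors = log10_abs_diff([1.0, -2.0, 0.5], [1.0, 2.1, 0.501])
```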

The second row is about speed: first plot shows the runtime of
lasso_path and the new code and the second plot shows the achieved
speedup.


I would like some help in order to create a meaningful Pull Request!

* Now that I've computed the lasso coefs for each alpha, I need to put
them in an ElasticNet object (or a Lasso object, but lasso_path uses
ElasticNet, so I think it's better to keep that in order to avoid
breaking things). Is setting the values of coef_, intercept_, alpha and
l1_ratio enough?

* Coefs may vary a bit from the results that I get from lasso_path,
especially for small alphas (which is to be expected), but I may have
forgotten an edge case.

* I suck at using x.std() as I'm never quite sure when to do that and
what's the convention in sklearn. Can you give me some clarifications
on that?

* Should I write tests? If so, what should I test for and how? So far,
I've only done tests on datasets from the load_* functions.

* How about sparsity? It is not something I'm very knowledgeable about…


Cheers,
-- 
Charles-Pierre

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general