Hi everyone,

A few months ago, I posted on this list about how lars_path(method='lasso') was much faster than lasso_path, when one has to compute lasso for many different regularization parameters (alphas).
Basically the idea is that lars_path(method='lasso') returns all the kinks (hitting points) of the lasso path, and we simply have to do a linear interpolation to find the coefs for any alpha, whereas lasso_path actually runs a coordinate descent for each such alpha.

This gist shows a first hackish implementation of this idea (lines 33-64):
https://gist.github.com/cpa/4956775

Speedups speak for themselves (2000-fold when computing more than 1800 different alphas!). Here are some plots:
http://imgur.com/B9EIVwq

The first row is about the correctness of the new method compared to the old one. Let x1 and x2 be the old coefs and the new coefs (from the method I propose). The first plot is a histogram of log10(| |x1| - |x2| |) and the second plot is a boxplot of log10(| |x1| - |x2| |). For the sake of legibility, I've set log10(0) == -10.

The second row is about speed: the first plot shows the runtime of lasso_path and of the new code, and the second plot shows the achieved speedup.

I would like some help in order to create a meaningful Pull Request!

* Now that I've computed the lasso coefs for each alpha, I need to put them in an ElasticNet object (or a Lasso object, but lasso_path uses ElasticNet, so I think it's better to keep that in order to avoid breaking things). Is setting the values of coef_, intercept_, alpha and l1_ratio enough?
* Coefs may vary a bit from the results that I get from lasso_path, especially for small alphas (which is to be expected), but I may have forgotten an edge case.
* I suck at using x.std(): I'm never quite sure when to do it and what the convention in sklearn is. Can you give me some clarification on that?
* Should I write tests? If so, what should I test for and how? So far, I've only run tests on datasets from the load_* functions.
* How about sparsity?
It is not something I'm very knowledgeable about…

Cheers,
--
Charles-Pierre

_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
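
For readers who do not want to open the gist, here is a minimal sketch of the interpolation step described in this message. It is not the code from the gist: the helper name interpolate_lasso_path and the toy numbers are made up for illustration. It assumes the kinks come as a 1-D array of alphas plus a (n_features, n_kinks) array of coefs, which is the layout lars_path returns.

```python
import numpy as np


def interpolate_lasso_path(kink_alphas, kink_coefs, query_alphas):
    """Linearly interpolate lasso coefs between the kinks of a LARS path.

    kink_alphas  : shape (n_kinks,), alphas at the kinks (any order).
    kink_coefs   : shape (n_features, n_kinks), one column per kink.
    query_alphas : shape (n_queries,), alphas at which coefs are wanted.
    Returns an array of shape (n_features, n_queries).
    """
    kink_alphas = np.asarray(kink_alphas, dtype=float)
    kink_coefs = np.asarray(kink_coefs, dtype=float)
    query_alphas = np.asarray(query_alphas, dtype=float)

    # np.interp needs increasing x-coordinates, so sort the kinks by alpha.
    order = np.argsort(kink_alphas)
    xp = kink_alphas[order]
    fp = kink_coefs[:, order]

    # The lasso path is piecewise linear in alpha, so each coefficient is
    # interpolated independently. Outside the kink range np.interp clamps
    # to the value at the nearest kink.
    return np.vstack([np.interp(query_alphas, xp, row) for row in fp])


# Toy path with made-up numbers (not from a real fit): 2 features, 3 kinks.
kink_alphas = np.array([1.0, 0.5, 0.0])
kink_coefs = np.array([[0.0, 1.0, 2.0],
                       [0.0, 0.0, 3.0]])
coefs = interpolate_lasso_path(kink_alphas, kink_coefs, [0.75, 0.25, 2.0])
print(coefs)
```

With scikit-learn, the kinks would presumably come from alphas, _, coefs = lars_path(X, y, method='lasso'), and lasso_path(X, y, alphas=query_alphas) could serve as the reference in a test. The current documentation gives the same objective for both functions, so the alphas should be on the same scale, but that is worth checking against the installed version.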
