On Fri, Feb 15, 2013 at 03:45:03PM +0100, Olivier Grisel wrote: > Alright for minimizing the RMSE on the training set but on the test > set the error might be minimized in between kinks as on the on real > distribution, the kinks location might be slightly off the positions > found on the training set or am I making a mistake?
Darn, I made the mistake myself. I remembered that I had to think about this carefully when I wrote LassoLarsCV, but I spoke too quickly this morning (that's the problem with doing too many things in parallel). Charles-Pierre, please accept my apologies. The problem is not that the kink positions and shifted, the problem is that quadratic function. The residuals are given by: res = y - np.dot(X, coef) where X and y can be train, test, or whatever you want. The RMSE is np.sqrt(np.sum(res ** 2)), I like to think about the square of the RMSE, because it is easier, and has the same minima: np.sum(res ** 2). Anyhow, this is not a piecewise affine function, it is a piecewise quadratic function, and can have minima between knots. My claim was bullshit. Sorry Charles-Pierre. However, there was a motivation behind writing the _lars_path_residuals function, and to make up for my false claims, let me walk you through how it can be used to efficiently solve your problem: The good news is that the differential of the squared residual norms is piecewise affine: D = np.dot(X.T, res) So to find where the minimum in path actually lies, we can find the 3 knots in the path bordering the minimum by a simple argsort. Or, more cleanly, check where all derivative changes sign: not np.all((D_k >= 0) * (D_(k - 1) <= 0)) Where the 'k' index denotes the knot in the path. Finding the exact location of the minimum is then a question of solving D == 0, using the fact that coef (or more simple res) is affine in this domain. This is a simple linear algebra problem. This, I believe, is the right way to do it. While it may seem more complex/costly to do than a simple grid, I have found that it is much more robust, because the knots of the path concentrate in the interesting regions, and to cover these regions as well with a grid requires many points). I have not implement this specific approach, sorry, just a variant, so I cannot provide the code. However, the _lars_path_residuals is the cornerstone of it. You can also have a look at the LassoCV, which contains useful code (and that can be improved). Actually, if you want to contribute a function that given an X matrices, and residuals and alphas corresonding to knots in the path, implements the above strategy, that would be a very useful contribution. Thanks for the discussion! Gaƫl ------------------------------------------------------------------------------ Free Next-Gen Firewall Hardware Offer Buy your Sophos next-gen firewall before the end March 2013 and get the hardware for free! Learn more. http://p.sf.net/sfu/sophos-d2d-feb _______________________________________________ Scikit-learn-general mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
