Hi Fernando,
On Sun, Jun 29, 2014 at 1:53 PM, Fernando Paolo <fpa...@ucsd.edu> wrote:
> Hello,
>
> I must be missing something obvious because I can't find the "actual"
> coefficients of the polynomial fitted using LassoCV. That is, for a 3rd
> degree polynomial
>
> p = a0 + a1 * x + a2 * x^2 + a3 * x^3
>
> I want the a0, a1, a2 and a3 coefficients (as those returned by
> numpy.polyfit()). Here is an example code of what I'm after
>
> import numpy as np
> import matplotlib.pyplot as plt
> from pandas import *
> from math import *
> from patsy import dmatrix
> from sklearn.linear_model import LassoCV
>
> sin_data = DataFrame({'x' : np.linspace(0, 1, 101)})
> sin_data['y'] = np.sin(2 * pi * sin_data['x']) + np.random.normal(0, 0.1,
> 101)
> x = sin_data['x']
> y = sin_data['y']
> Xpoly = dmatrix('C(x, Poly)')
>
The development version of scikit-learn contains a transformer that
builds the polynomial feature columns (x, x^2, x^3, ...) directly:
http://scikit-learn.org/dev/modules/generated/sklearn.preprocessing.PolynomialFeatures.html
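A minimal sketch of how that transformer could replace the patsy call,
assuming a scikit-learn version that ships PolynomialFeatures. The
5-point grid and the include_bias=False choice are mine, for
illustration only:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.linspace(0, 1, 5)
# PolynomialFeatures expects a 2-D array of shape (n_samples, n_features).
X = x.reshape(-1, 1)

# degree=3 with include_bias=False yields the raw power columns
# [x, x^2, x^3]; the constant column is dropped so that the estimator's
# own intercept_ plays the role of a0.
Xpoly = PolynomialFeatures(degree=3, include_bias=False).fit_transform(X)
```

With a single input feature the columns come out in increasing degree,
so the fitted coef_ lines up with a1, a2, a3.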
> n = 3
> lasso_model = LassoCV(cv=15, copy_X=True, normalize=True)
> lasso_fit = lasso_model.fit(Xpoly[:,1:n+1], y)
>
In scikit-learn, "fit" always returns the model itself so here
"lasso_model" and "lasso_fit" refer to the same thing.
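A quick way to check this yourself, sketched on made-up data (the
random X and y and the cv=5 setting are assumptions, not your data):

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.RandomState(0)
X = rng.rand(60, 3)
y = X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, 60)

lasso_model = LassoCV(cv=5)
lasso_fit = lasso_model.fit(X, y)

# fit() returns the estimator itself, so both names point to one object.
same_object = lasso_fit is lasso_model
```

So lasso_fit.coef_ and lasso_model.coef_ are the same array.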
> lasso_predict = lasso_model.predict(Xpoly[:,1:n+1])
>
> a = np.r_[lasso_fit.intercept_, lasso_fit.coef_]
>
> b = np.polyfit(x, y, n)[::-1]
>
> p_lasso = a[0] + a[1] * x + a[2] * x**2 + a[3] * x**3
> p_polyfit = b[0] + b[1] * x + b[2] * x**2 + b[3] * x**3
>
> print 'coef. lasso:', a
> print 'coef. polyfit:', b
>
>
> The returned coefficients 'a' and 'b' are completely different, and while
> 'p_polyfit' is indeed the fitted polynomial of degree 3, 'p_lasso' makes no
> sense (plot to see). Unless 'b' is something else... If so, what actually
> are the coefficients returned by fit()? And how can I get the coefficients
> that reconstruct the fitted polynomial?
>
>
Why are you expecting a and b to be the same? np.polyfit returns an
ordinary least-squares fit, whereas the lasso adds an L1 penalty that
shrinks the coefficients, so the two models differ.
To reproduce the polyfit coefficients, use LinearRegression, or Ridge
with light regularization, instead.
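Here is a rough sketch of that comparison on data like yours. It
assumes the design matrix holds the raw powers x, x^2, x^3 (built here
with np.vander); the seed and the fixed alpha=0.01 for the lasso are
arbitrary choices of mine. As far as I know, patsy's C(x, Poly) builds
orthogonal polynomial contrasts rather than raw powers, which would be
another reason a[1] * x + a[2] * x**2 + ... does not reconstruct the
fit, but please check that against the patsy docs.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.RandomState(0)
x = np.linspace(0, 1, 101)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 101)

n = 3
# Raw power columns [x, x^2, x^3]; the constant column is dropped because
# the estimators fit their own intercept.
X = np.vander(x, n + 1, increasing=True)[:, 1:]

# Unpenalized least squares on raw powers: same model as np.polyfit.
ols = LinearRegression().fit(X, y)
a = np.r_[ols.intercept_, ols.coef_]
b = np.polyfit(x, y, n)[::-1]  # reversed to a0, a1, a2, a3 order

# The L1 penalty shrinks the coefficients, so these differ from b.
lasso = Lasso(alpha=0.01).fit(X, y)
a_lasso = np.r_[lasso.intercept_, lasso.coef_]

print('coef. OLS:    ', a)
print('coef. polyfit:', b)
print('coef. lasso:  ', a_lasso)
```

Here a and b should agree up to numerical precision, while a_lasso is
pulled toward zero by the penalty.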
HTH,
Mathieu
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general