[R] lm.ridge fit with some dummy variables (w/o intercept)
I have columns which sum to one. They are membership dummies, and fractions are allowed. I made an example:

x <- c( 9.899898,  6.9555431, -1.251, 0.5200,     0.480,     0.000,     -2.2384737,
       16.791361,  6.8924369, -3.286, 0.78846154, 0.2115385, 0.000,     -0.4720061,
        6.115735, -5.8381799, -1.176, 1.,         0.000,     0.000,     -0.6312019,
       10.325595,  5.4950276, -2.634, 1.,         0.000,     0.000,      1.4729420,
        3.800141,  4.1287662, -2.243, 0.8300,     0.170,     0.000,      0.9314859,
        2.159567, -2.3952889, -4.645, 0.5300,     0.000,     0.470,      0.7252069,
       21.536111,  3.3844964, -4.352, 1.,         0.000,     0.000,     -0.9931833,
        7.526573, -1.1675684, -5.023, 1.,         0.000,     0.000,      0.2397390,
       28.684897, -0.4594389, -3.233, 0.8900,     0.070,     0.040,      0.6017004,
        0.894931, -0.9059129, -5.023, 0.04347826, 0.000,     0.9565217,  0.6081505)
x = matrix(x,ncol=10)
x = t(x)
colnames(x) = c('a','b','c','d','e','f','y')

> x
              a          b      c          d         e         f          y
 [1,]  9.899898  6.9555431 -1.251 0.5200     0.480     0.000     -2.2384737
 [2,] 16.791361  6.8924369 -3.286 0.78846154 0.2115385 0.000     -0.4720061
 [3,]  6.115735 -5.8381799 -1.176 1.         0.000     0.000     -0.6312019
 [4,] 10.325595  5.4950276 -2.634 1.         0.000     0.000      1.4729420
 [5,]  3.800141  4.1287662 -2.243 0.8300     0.170     0.000      0.9314859
 [6,]  2.159567 -2.3952889 -4.645 0.5300     0.000     0.470      0.7252069
 [7,] 21.536111  3.3844964 -4.352 1.         0.000     0.000     -0.9931833
 [8,]  7.526573 -1.1675684 -5.023 1.         0.000     0.000      0.2397390
 [9,] 28.684897 -0.4594389 -3.233 0.8900     0.070     0.040      0.6017004
[10,]  0.894931 -0.9059129 -5.023 0.04347826 0.000     0.9565217  0.6081505

> apply(x[,4:6],1,sum)
 [1] 1 1 1 1 1 1 1 1 1 1

I am trying to use lm.ridge and have run into problems extracting the parameter estimates. E.g., for the lambda = 0 case (cut and pasted below), how do I back out the coefficient estimates so that they match the lm fit? In general, for any given lambda, how do I back out the coefficient estimates on the original scale?

> lm.fit = lm(y~.-a-1,data=data.frame(x),weights=a)
> ridge.fit = lm.ridge(y~.-a-1,data=data.frame(x),weights=a,lambda=0)
> ridge.fit
         b          c          d          e          f
 0.1125886  0.1748883  0.9122774 -5.9140208  1.8784332
> lm.fit

Call:
lm(formula = y ~ . - a - 1, data = data.frame(x), weights = a)

Coefficients:
       b         c         d         e         f
 0.04232   0.32343   1.36039  -4.67399   3.29727

Thanks so much in advance!

Young

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
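A hedged sketch of one way to reconcile the two fits in the post above. It assumes (this is not confirmed anywhere in the thread) that the gap arises because lm.ridge does not apply the weights when fitting. Under that assumption the weights can be folded into the data by multiplying every column by sqrt(a) before calling lm.ridge, which for a model without intercept is algebraically the same as weighted least squares. The names dat, dat_w, wls_fit and ridge_w are illustrative only.

```r
library(MASS)

# Data from the post above: 10 rows, columns a, b, c, d, e, f, y.
x <- matrix(c(
   9.899898,  6.9555431, -1.251, 0.5200,     0.480,     0.000,     -2.2384737,
  16.791361,  6.8924369, -3.286, 0.78846154, 0.2115385, 0.000,     -0.4720061,
   6.115735, -5.8381799, -1.176, 1.,         0.000,     0.000,     -0.6312019,
  10.325595,  5.4950276, -2.634, 1.,         0.000,     0.000,      1.4729420,
   3.800141,  4.1287662, -2.243, 0.8300,     0.170,     0.000,      0.9314859,
   2.159567, -2.3952889, -4.645, 0.5300,     0.000,     0.470,      0.7252069,
  21.536111,  3.3844964, -4.352, 1.,         0.000,     0.000,     -0.9931833,
   7.526573, -1.1675684, -5.023, 1.,         0.000,     0.000,      0.2397390,
  28.684897, -0.4594389, -3.233, 0.8900,     0.070,     0.040,      0.6017004,
   0.894931, -0.9059129, -5.023, 0.04347826, 0.000,     0.9565217,  0.6081505),
  ncol = 7, byrow = TRUE,
  dimnames = list(NULL, c("a", "b", "c", "d", "e", "f", "y")))
dat <- data.frame(x)

# Weighted least squares without intercept, as in the post.
wls_fit <- lm(y ~ . - a - 1, data = dat, weights = a)

# Assumption: fold the weights into the data instead of passing `weights`.
# Multiplying each row of the predictors and of y by sqrt(a) turns the
# weighted problem into an ordinary least-squares problem.
cols  <- c("b", "c", "d", "e", "f", "y")
dat_w <- as.data.frame(sqrt(dat$a) * as.matrix(dat[, cols]))

ridge_w <- lm.ridge(y ~ . - 1, data = dat_w, lambda = 0)

# coef() on a ridgelm object returns coefficients on the original scale
# (the internal ones divided by the column scales).
rbind(lm = coef(wls_fit), ridge = coef(ridge_w))
```

For lambda > 0 the penalty in this sketch acts on the rescaled, weighted columns, so whether that matches the weighted ridge criterion you have in mind is a separate question.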
Re: [R] lm.ridge
On Sat, 8 Oct 2005, Erin Hodgess wrote:

> Dear R People:
>
> I have a question about the lm.ridge function, please.
>
> In the example, there is one set of output values in the "select"
> function but another in the comment section.
>
> Am I missing something please?

The values in the examples were computed in S-PLUS. Apparently the dataset in R is not the same as that in S-PLUS.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
[R] lm.ridge
Dear R People:

I have a question about the lm.ridge function, please.

In the example, there is one set of output values in the "select" function but another in the comment section.

Am I missing something please?

R Version 2.1.1 Windows

Thanks,
Sincerely,
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]
Re: [R] lm.ridge
G'day Daniel,

> "DR" == daniel <[EMAIL PROTECTED]> writes:

    DR> First: I think coefficients from lm(Employed~.,data=longley)
    DR> should be equal coefficients from
    DR> lm.ridge(Employed~.,data=longley, lambda=0) why it does not
    DR> happen?

Which version of R and which version of MASS are you using?

> lm(Employed~.,data=longley)

Call:
lm(formula = Employed ~ ., data = longley)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
  -3.482e+03     1.506e-02    -3.582e-02    -2.020e-02    -1.033e-02
  Population          Year
  -5.110e-02     1.829e+00

> lm.ridge(Employed~.,data=longley, lambda=0)
               GNP.deflator           GNP    Unemployed  Armed.Forces
-3.482259e+03  1.506187e-02 -3.581918e-02 -2.020230e-02 -1.033227e-02
   Population          Year
-5.110411e-02  1.829151e+00

These coefficients look pretty identical to me, except that they are printed to different numbers of significant digits. In fact, the following shows that they are identical (up to numerical precision):

> fm1 <- lm(Employed~.,data=longley)
> fm2 <- lm.ridge(Employed~.,data=longley, lambda=0)
> coef2 <- print(fm2)
               GNP.deflator           GNP    Unemployed  Armed.Forces
-3.482259e+03  1.506187e-02 -3.581918e-02 -2.020230e-02 -1.033227e-02
   Population          Year
-5.110411e-02  1.829151e+00
> max(abs(coef(fm1)-coef2))
[1] 7.275958e-12

    DR> Second: if I have for example Ridge<-lm.ridge(Employed~.,
    DR> data=longley, lambda = seq(0,0.1,0.001)), I suppose intercept
    DR> coefficient is defined implicit,

Yes.

    DR> why it does not appear in Ridge$coef?

If you look at the code of lm.ridge, you will see that, if an intercept is included in the model, all non-constant regressors are centered (i.e. made orthogonal to the intercept term) and scaled to have the same variance. Furthermore, the intercept term is typically *not* penalised. The components in Ridge$coef are the coefficients on this transformed scale. There is no need to include the intercept here, since on that scale it is the same for all values of lambda.

If you print the model, then the ridge coefficients on the original scale are calculated, see:

> getAnywhere("print.ridgelm")
A single object matching 'print.ridgelm' was found
It was found in the following places
  registered S3 method for print from namespace MASS
  namespace:MASS
with value

function (x, ...)
{
    scaledcoef <- t(as.matrix(x$coef/x$scales))
    if (x$Inter) {
        inter <- x$ym - scaledcoef %*% x$xm
        scaledcoef <- cbind(Intercept = inter, scaledcoef)
    }
    print(drop(scaledcoef), ...)
}

    DR> Third: I suppose that if I define
    DR> 1) y<-longley$Employed
    DR> 2) X<-as.matrix(cbind(1,longley[,1:6]))
    DR> 3) I = identity matrix the
    DR> following should be true: Coef=(X'X+kI)^(-1) X'y

No, as noted above, the intercept term is usually not penalised.

    DR> and if a take k=Ridge$kHKB, Coef should be approx equal to
    DR> Ridge$coef[near value of kHKB]

No, as noted above, the estimates in the "coef" component of an object returned by lm.ridge are the coefficients on a different scale.

    DR> and it does not seem to happen, why?

Because the intercept is not penalised by lm.ridge and the non-constant columns of the design matrix are rescaled; hence the returned coefficients are on another scale.

    DR> Any help, suggestion or orientation?

HTH.

Cheers,

        Berwin
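The recipe described in the reply above (centre and scale the predictors, leave the intercept unpenalised, then divide by the scales to get back to the original units) can be sketched by hand as follows. This is an illustration, not the MASS source: lm.ridge itself works through a singular value decomposition, whereas the sketch solves the penalised normal equations directly. The value lam = 0.004 and all variable names are chosen only for the example.

```r
library(MASS)

lam <- 0.004                                   # illustrative penalty value
fit <- lm.ridge(Employed ~ ., data = longley, lambda = lam)

X <- as.matrix(longley[, 1:6])                 # the six predictors
y <- longley$Employed

# Centre the predictors and the response; the intercept is not penalised.
xm <- colMeans(X)
Xc <- sweep(X, 2, xm)
yc <- y - mean(y)

# Scale each centred column by its root mean square (divisor n, not n - 1).
sc <- sqrt(colMeans(Xc^2))
Xs <- sweep(Xc, 2, sc, "/")

# Ridge solution on the transformed scale: (Xs'Xs + lam * I)^(-1) Xs'yc.
b_std <- drop(solve(crossprod(Xs) + lam * diag(ncol(Xs)), crossprod(Xs, yc)))

# Back to the original scale, then recover the intercept.
b_orig <- b_std / sc
b0     <- mean(y) - sum(b_orig * xm)

cbind(by_hand = b_std, lm.ridge = fit$coef)              # transformed scale
cbind(by_hand = c(b0, b_orig), lm.ridge = coef(fit))     # original scale
```

The first comparison corresponds to the "coef" component of the fitted object, the second to what printing the object (or calling coef() on it) reports.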
[R] lm.ridge
Hello,

I posted this mail a few days ago but did it wrong; I hope it is right now.

I have the following questions about lm.ridge, from the MASS package, illustrated with the longley example:

First: I think the coefficients from lm(Employed~.,data=longley) should be equal to the coefficients from lm.ridge(Employed~.,data=longley, lambda=0). Why does this not happen?

Second: if I have for example Ridge<-lm.ridge(Employed~., data=longley, lambda = seq(0,0.1,0.001)), I suppose the intercept coefficient is defined implicitly. Why does it not appear in Ridge$coef?

Third: I suppose that if I define
1) y<-longley$Employed
2) X<-as.matrix(cbind(1,longley[,1:6]))
3) I = identity matrix
the following should be true:

Coef=(X'X+kI)^(-1) X'y

and if I take k=Ridge$kHKB, Coef should be approximately equal to Ridge$coef[near value of kHKB], and this does not seem to happen. Why?

Values:

> Ridge$kHKB
[1] 0.004275357

Using the calculation above (third question, third point):

Coef=
                     [,1]
1            -0.095492310
GNP.deflator -0.052759002
GNP           0.070993540
Unemployed   -0.004244391
Armed.Forces -0.005725582
Population   -0.413341544
Year          0.048420107

And if I take from Ridge$coef:

Ridge$coef[0.004]
GNP.deflator -0.03098507
GNP          -1.32553151
Unemployed   -1.53237769
Armed.Forces -0.63334911
Population   -0.88690241
Year          6.82105049

Any help, suggestion or orientation?

Thanks in advance

Daniel Rozengardt
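The notation Ridge$coef[0.004] in the post above presumably stands for "the column of Ridge$coef whose lambda is closest to kHKB"; indexing by the number 0.004 does not select by lambda value. A small sketch of one way to pick that column, using only components that lm.ridge returns (lambda, kHKB, coef, scales); the index name i is illustrative.

```r
library(MASS)

Ridge <- lm.ridge(Employed ~ ., data = longley, lambda = seq(0, 0.1, 0.001))

# Position on the lambda grid closest to the HKB estimate.
i <- which.min(abs(Ridge$lambda - Ridge$kHKB))

Ridge$lambda[i]      # grid value of lambda closest to kHKB
Ridge$coef[, i]      # coefficients for centred and scaled predictors
coef(Ridge)[i, ]     # original scale; the first entry is the intercept
```

Note that Ridge$coef has one column per lambda, while coef(Ridge) has one row per lambda.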
[R] lm.ridge again
Hello,

I posted this mail a few days ago without getting any answer.

I have the following questions about lm.ridge, from the MASS package, illustrated with the longley example:

First: I think the coefficients from lm(Employed~.,data=longley) should be equal to the coefficients from lm.ridge(Employed~.,data=longley, lambda=0). Why does this not happen?

Second: if I have for example Ridge<-lm.ridge(Employed~., data=longley, lambda = seq(0,0.1,0.001)), I suppose the intercept coefficient is defined implicitly. Why does it not appear in Ridge$coef?

Third: I suppose that if I define
1) y<-longley$Employed
2) X<-as.matrix(cbind(1,longley[,1:6]))
3) I as the identity
the following should be true:

Coef=(X'X+kI)^(-1) X'y

and if I take k=Ridge$kHKB, the coefficients should be approximately equal to Ridge$coef[near value of kHKB], and this does not seem to happen. Why?

Values:

> Ridge$kHKB
[1] 0.004275357

Using the calculation above (third question, third point):

Coef=
                     [,1]
1            -0.095492310
GNP.deflator -0.052759002
GNP           0.070993540
Unemployed   -0.004244391
Armed.Forces -0.005725582
Population   -0.413341544
Year          0.048420107

And if I take from Ridge$coef:

Ridge$coef[0.004]
GNP.deflator -0.03098507
GNP          -1.32553151
Unemployed   -1.53237769
Armed.Forces -0.63334911
Population   -0.88690241
Year          6.82105049

Any help, suggestion or orientation?

Thanks in advance

Daniel Rozengardt