A much shorter (but complete) description of this is on the summary.lm help page. It includes the definitions R (and most statistics references) uses.
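A minimal sketch of those definitions, assuming the formulas on the summary.lm help page (R^2 = 1 - RSS/Sum((y - y*)^2), with y* the mean of y if there is an intercept and zero otherwise), applied to coefficients supplied from outside. The helper name r2_from_coef and its arguments are hypothetical, not part of R; the check uses the mtcars example from the quoted message.

```r
# Hypothetical helper (not part of R): R-squared and adjusted R-squared,
# following the definitions on ?summary.lm, for user-supplied coefficients b.
# X must already contain a column of ones if intercept = TRUE.
r2_from_coef <- function(y, X, b, intercept = FALSE) {
  X <- as.matrix(X)
  e <- y - drop(X %*% b)                   # residuals implied by b
  n <- length(y)
  p <- ncol(X)                             # number of coefficients, intercept included
  ystar  <- if (intercept) mean(y) else 0  # y* in ?summary.lm
  df.int <- if (intercept) 1 else 0
  r2  <- 1 - sum(e^2) / sum((y - ystar)^2)
  adj <- 1 - (1 - r2) * (n - df.int) / (n - p)
  c(r.squared = r2, adj.r.squared = adj)
}

# Check against summary.lm for the no-intercept model on mtcars
fit0 <- lm(mpg ~ hp + wt + qsec - 1, data = mtcars)
X0   <- as.matrix(mtcars[, c("hp", "wt", "qsec")])
res0 <- r2_from_coef(mtcars$mpg, X0, coef(fit0))
s0   <- summary(fit0)
print(all.equal(unname(res0), c(s0$r.squared, s0$adj.r.squared)))
```

If b is not the least-squares solution, the residual sum of squares can exceed the total sum of squares, so the value returned need not lie between 0 and 1.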
On Wed, 11 Jan 2006, Millo Giovanni wrote:

> Alexandra,
>
> some additional remarks taken from my past struggles with R2 :^) Without
> intercept the definition is indeed problematic, as Bernhard notes.
>
> First, to estimate a model omitting the intercept you simply have to
> specify "-1" in the model formula (example on an in-built dataset, for
> data description see help(mtcars)):
>
>> data(mtcars)
>> attach(mtcars)
>> mod<-lm(mpg~hp+wt+qsec) # with intercept
>> summary(mod)
>
> and
>
>> mod0<-lm(mpg~hp+wt+qsec-1) # without
>> summary(mod0)
>
> The reported R2s are different not only in value (which is obvious) but
> also in the definition.
> In fact, there are 2 definitions of R2. With reference to the usual
> analysis of variance in OLS regression (see e.g. Ch.3 in Greene 2003,
> Econometric Analysis, and 3.5.2. in particular), let, in our example,
>
>> SST<-sum(mpg^2) # total sum of squares
>> SSR<-sum(fitted(mod)^2) # regression sum of squares
>> SSE<-sum(resid(mod)^2) # error sum of squares
>
> where (a) SST=SSR+SSE, as you may readily check,
> then the *uncentered* R2 is defined as
>
>> uR2<-SSR/SST
>
> while the *centered* R2 as
>
>> cSST<-sum((mpg-mean(mpg))^2)
>> cSSR<-sum((fitted(mod)-mean(mpg))^2) # as 1) mean(y)=mean(y_hat)
>> cSSE<-sum(resid(mod)^2) # as 2) mean(e)=0
>> cR2<-cSSR/cSST
>
> and (b) cSST=cSSR+cSSE.
>
> The problem is that the meaning of R2 derives from decompositions (a)
> and (b), but while (a) always holds for OLS models, (b) only holds for
> models with an intercept (as do (1-2) above, on which it is based). Thus
> *centered R2 is meaningless in models without intercept*. People are
> used to cR2, though, so R reports cR2 for models with intercept, uR2 for
> those without (EViews, e.g., reports cR2 for both).
> Adjusted R2s are the same, adjusted by a factor penalizing for df. See
> Greene, who gives
> adjR2 = 1 - [(n-1)/(n-K)]*(1-R2) for n obs. and K regressors.
>
> Finally, it is of course feasible to calculate the model coefficients on
> your own, but it would be inefficient (R has an optimized routine for
> OLS, so you'd better use coef(lm(y~X))). Anyway, if you like,
>
>> y<-mpg # just for notational simplicity..
>> X<-cbind(hp,wt,qsec) # add rep(1,length(hp)) to this data matrix
>>                      # if you want an intercept
>
>> b<-solve(crossprod(X),crossprod(X,y)) # the coefficients for mod0
>> y_hat<-X%*%b # fitted values for y
>> e<-y-y_hat # model residuals
>
> from which you can obtain anything you need.
>
> Cheers
> Giovanni
>
> Giovanni Millo
> Ufficio Studi
> Assicurazioni Generali SpA
> Via Machiavelli 4, 34131 Trieste (I)
> tel. +39 040 671184
> fax +39 040 671160
>
> *****************
> Original message:
>
> Date: Wed, 11 Jan 2006 09:16:46 -0000
> From: "Pfaff, Bernhard Dr." <[EMAIL PROTECTED]>
> Subject: Re: [R] Obtaining the adjusted r-square given the regression
>          coefficients
> To: "'Alexandra R. M. de Almeida'" <[EMAIL PROTECTED]>,
>     r-help@stat.math.ethz.ch
>
> Hello Alexandra,
>
> R2 is only defined for regressions with intercept. See a decent
> econometrics textbook for its derivation.
>
> HTH,
> Bernhard
>
> -----Original Message-----
> From: Alexandra R. M. de Almeida [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, 11 January 2006 03:48
> To: r-help@stat.math.ethz.ch
> Subject: [R] Obtaining the adjusted r-square given the regression
>          coefficients
>
> Dear list
>
> I want to obtain the adjusted r-square given a set of coefficients
> (without the intercept), and I don't know if there is a function that
> does it. Does one exist?
> I know that if you make a linear regression, you enter the dataset and
> have in "summary" the adjusted r-square.
> But this is calculated using the coefficients that R obtained, and I
> want to use other coefficients that I calculated separately and
> differently (also without the intercept term).
> I have made a function based on the equations of the book "Linear
> Regression Analysis" (Wiley Series in Probability and Mathematical
> Statistics), but it doesn't return values between 0 and 1. What is wrong?
> The function is given by:
>
> adjustedR2<-function(Y,X,saM)
> {
>   if(is.matrix(Y)==F) (Y<-as.matrix(Y))
>   if(is.matrix(X)==F) (X<-as.matrix(X))
>   if(is.matrix(saM)==F) (saM<-as.matrix(saM))
>   RX<-rent.matrix(X,1)$Rentabilidade.tipo
>   RY<-rent.matrix(Y,1)$Rentabilidade.tipo
>   r2m<-matrix(0,nrow=ncol(Y),ncol=1)
>   RSS<-matrix(0,ncol=ncol(Y),nrow=1)
>   SYY<-matrix(0,ncol=ncol(Y),nrow=1)
>   for (i in 1:ncol(RY))
>   {
>     RSS[,i]<-(t(RY[,i])%*%RY[,i])-(saM[i,]%*%(t(RX)%*%RX)%*%t(saM)[,i])
>     SYY[,i]<-sum((RY[,i]-mean(RY[,i]))^2)
>     r2m[i,]<-1-(RSS[,i]/SYY[,i])*((nrow(RY))/(nrow(RY)-ncol(saM)-1))
>   }
>   dimnames(r2m)<-list(colnames(Y),c("Adjusted R-square"))
>   return(r2m)
> }
>
> Thanks!
> Alexandra
>
> Alexandra R. Mendes de Almeida
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595