Hi, everybody,
3 questions about R-square:
---------(1)----------- Does R2 always increase as variables are added?
---------(2)----------- Can R2 be greater than 1?
---------(3)----------- How is R2 in summary(lm(y~x-1))$r.squared calculated?
It is different from (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
I will illustrate these questions with the following code:
---------(1)----------- R2 doesn't always increase as variables are added
> x=matrix(rnorm(20),ncol=2)
> y=rnorm(10)
>
> lm=lm(y~1)
> y.hat=rep(1*lm$coefficients,length(y))
> (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
[1] 2.646815e-33
>
> lm=lm(y~x-1)
> y.hat=x%*%lm$coefficients
> (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
[1] 0.4443356
>
> ################ This is the biggest model, but its R2 is not the biggest, why?
> lm=lm(y~x)
> y.hat=cbind(rep(1,length(y)),x)%*%lm$coefficients
> (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
[1] 0.2704789
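A minimal sketch for comparison, assuming a fixed seed and the hypothetical object names fit0, fit_noint and fit_full (none of them are in the session above). It computes R2 as 1 - RSS/TSS for the same three models. Since y~1 and y~x-1 are both nested in y~x, the residual sum of squares of y~x cannot be larger than that of either smaller model, so under this residual-based definition the biggest model should not come out behind.

```r
# Sketch only: the seed and the names fit0, fit_noint, fit_full are assumptions.
set.seed(1)
x <- matrix(rnorm(20), ncol = 2)
y <- rnorm(10)

fit0      <- lm(y ~ 1)       # intercept only
fit_noint <- lm(y ~ x - 1)   # two predictors, no intercept
fit_full  <- lm(y ~ x)       # intercept plus two predictors

rss <- function(fit) sum(residuals(fit)^2)   # residual sum of squares
tss <- sum((y - mean(y))^2)                  # centered total sum of squares

# Residual-based R2, always measured against the same centered TSS
r2 <- c(null  = 1 - rss(fit0) / tss,
        noint = 1 - rss(fit_noint) / tss,
        full  = 1 - rss(fit_full) / tss)
print(r2)
```

With this definition the value for y~x-1 can even be negative, because a model without an intercept may fit worse than the plain mean of y.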
---------(2)----------- R2 can be greater than 1
> x=rnorm(10)
> y=runif(10)
> lm=lm(y~x-1)
> y.hat=x*lm$coefficients
> (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
[1] 3.513865
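A small sketch of why the ratio above is not bounded by 1, assuming a fixed seed and the hypothetical names tss, ess, rss and cross. Without an intercept the residuals need not sum to zero, so the centered identity TSS = ESS + RSS picks up a cross term, and ESS alone can exceed TSS. The uncentered identity sum(y^2) = sum(y.hat^2) + RSS still holds, because the fitted values are orthogonal to the residuals.

```r
# Sketch only: the seed and the names tss, ess, rss, cross are assumptions.
set.seed(2)
x <- rnorm(10)
y <- runif(10)

fit   <- lm(y ~ x - 1)      # regression through the origin
y.hat <- fitted(fit)
res   <- residuals(fit)

tss   <- sum((y - mean(y))^2)                 # centered total sum of squares
ess   <- sum((y.hat - mean(y))^2)             # numerator used in the post
rss   <- sum(res^2)                           # residual sum of squares
cross <- 2 * sum((y.hat - mean(y)) * res)     # vanishes only if sum(res) is 0

# Centered decomposition needs the cross term when there is no intercept
print(all.equal(tss, ess + rss + cross))

# Uncentered decomposition holds exactly for a least-squares fit
print(all.equal(sum(y^2), sum(y.hat^2) + rss))
```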
---------(3)----------- How is R2 in summary(lm(y~x-1))$r.squared calculated?
It is different from (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> x=matrix(rnorm(20),ncol=2)
> xx=cbind(rep(1,10),x)
> y=x%*%c(1,2)+rnorm(10)
> ### r2 calculated by lm(y~x)
> lm=lm(y~x)
> summary(lm)$r.squared
[1] 0.9231062
> ### r2 calculated by lm(y~xx-1)
> lm=lm(y~xx-1)
> summary(lm)$r.squared
[1] 0.9365253
> ### r2 calculated by me
> y.hat=xx%*%lm$coefficients
> (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
[1] 0.9231062
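A hedged sketch of one way to reproduce both numbers, assuming a fixed seed and the hypothetical names fit_a, fit_b, r2_centered and r2_uncentered. As far as I understand summary.lm, it uses the centered sums of squares when the formula has an intercept term and the uncentered ones, sum(fitted^2)/sum(y^2), when the formula has none. A column of ones hidden inside xx does not count as a formula intercept, which would explain why the two summaries differ although the fitted values are the same.

```r
# Sketch only: the seed and the names fit_a, fit_b, r2_* are assumptions.
set.seed(3)
x  <- matrix(rnorm(20), ncol = 2)
xx <- cbind(1, x)                          # design matrix with explicit ones
y  <- drop(x %*% c(1, 2) + rnorm(10))      # drop() keeps y a plain vector

fit_a <- lm(y ~ x)        # intercept comes from the formula
fit_b <- lm(y ~ xx - 1)   # same column space, but no formula intercept

f <- fitted(fit_b)        # identical to fitted(fit_a) up to rounding

r2_centered   <- sum((f - mean(y))^2) / sum((y - mean(y))^2)
r2_uncentered <- sum(f^2) / sum(y^2)

print(c(summary(fit_a)$r.squared, r2_centered))     # pair should agree
print(c(summary(fit_b)$r.squared, r2_uncentered))   # pair should agree
```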
Thanks a lot for any clue :)
--
Junjie Li, [EMAIL PROTECTED]
Undergraduate in DEP of Tsinghua University
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.