I am puzzled by how R^2 is computed when the intercept is omitted. The following example, taken from help("lm"), illustrates the issue:

## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
## Page 9: Plant Weight Data.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)
lm.D90 <- lm(weight ~ group - 1) # omitting intercept

The R^2 calculations for both models are consistent with the description in help("summary.lm"): "y* is the mean of y[i] if there is an intercept and zero otherwise." This causes a dramatic difference in the resulting R^2 values:

r2.D9 <- summary(lm.D9)$r.squared
r2.D90 <- summary(lm.D90)$r.squared
all.equal(r2.D9, 0.0730775989903856) # TRUE
all.equal(r2.D90, 0.981783272435264) # TRUE

This is counter-intuitive, to say the least, since the two models have identical predictions, and both could be described more accurately as having two intercepts rather than none.

I see three possibilities:

1. This is the intended result, in which case no fix is required, but I’d be curious to understand the argument better.
2. This is an unfortunate outcome but not worth fixing, as the user can easily compute the correct R^2. In this case, I'd suggest that this unintuitive behavior be explicitly called out in the documentation.
3. This is a bug worth fixing.

I look forward to hearing the community’s opinion on this.

Thanks in advance!
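
P.S. For reference, here is a minimal sketch of how both values can be reproduced by hand. It assumes that R^2 is 1 - RSS/TSS, with TSS taken around y* as quoted from help("summary.lm"). The names rss, tss.centered, tss.uncentered, r2.centered and r2.uncentered are my own, introduced only for illustration.

```r
## Same data and models as in the help("lm") example.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)
lm.D90 <- lm(weight ~ group - 1) # omitting intercept

## Both parameterizations give the same fitted values, hence the same RSS.
all.equal(fitted(lm.D9), fitted(lm.D90))
rss <- sum(residuals(lm.D90)^2)

## Total sum of squares around y* = mean(y) versus y* = 0.
tss.centered <- sum((weight - mean(weight))^2)
tss.uncentered <- sum(weight^2)

r2.centered <- 1 - rss / tss.centered     # y* = mean(y)
r2.uncentered <- 1 - rss / tss.uncentered # y* = 0

## Compare with what summary() reports for each model.
all.equal(r2.centered, summary(lm.D9)$r.squared)
all.equal(r2.uncentered, summary(lm.D90)$r.squared)
```

If this sketch is right, the whole difference between the two reported values comes from the choice of y* in the denominator and not from the fit itself. Here r2.centered is the quantity I would have expected summary(lm.D90) to report.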