Re: [Rd] anova.lm fails with test=Cp

2011-05-10 Thread John Maindonald
For unknown sigma^2, a version that is a modification of AIC
may be preferred, i.e.

n log(RSS/n) + 2p - n

I notice that this is what is given in Maindonald and Braun (2010)
Data Analysis  Graphics Using R, 3rd edition.

Cf: Venables and Ripley, MASS, 4th edn, p.174.  VR do however
stop short of actually saying that Cp should be modified in the same
way as AIC when sigma^2 has to be estimated.  

Better still, perhaps,  give the AIC statistic.  This would make the
output consistent with dropterm(), drop1() and add1().  Or if Cp
is to stay, allow AIC as a a further test.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473fax  : +61 2(6125)5549
Centre for Mathematics  Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

On 08/05/2011, at 6:15 PM, peter dalgaard wrote:

 
 On May 8, 2011, at 09:25 , John Maindonald wrote:
 
 Here is an example, modified from the help page to use test=Cp:
 
 
 fit0 - lm(sr ~ 1, data = LifeCycleSavings)
 fit1 - update(fit0, . ~ . + pop15)
 fit2 - update(fit1, . ~ . + pop75)
 anova(fit0, fit1, fit2, test=Cp)
 Error in `[.data.frame`(table, , Resid. Dev) : 
 undefined columns selected
 
 Yes, the Resid. Dev column is only there in analysis of deviance tables. 
 For the lm() case, it looks like you should have RSS. 
 
 This has probably been there forever. Just goes to show how often people 
 use these things...
 
 Also, now that I'm looking at it, are we calculating it correctly in any 
 case? We have
 
   cbind(table, Cp = table[, Resid. Dev] + 2 * scale * 
   (n - table[, Resid. Df]))
 
 whereas all the references I can find have Cp=RSS/MS-N+2P, so the above would 
 actually be scale*Cp+N. 
 
 
 -- 
 Peter Dalgaard
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com
 

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] anova.lm fails with test=Cp

2011-05-08 Thread John Maindonald
Here is an example, modified from the help page to use test=Cp:


 fit0 - lm(sr ~ 1, data = LifeCycleSavings)
 fit1 - update(fit0, . ~ . + pop15)
 fit2 - update(fit1, . ~ . + pop75)
 anova(fit0, fit1, fit2, test=Cp)
Error in `[.data.frame`(table, , Resid. Dev) : 
 undefined columns selected
 sessionInfo()
R version 2.13.0 Patched (2011-04-28 r55678)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C  
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

loaded via a namespace (and not attached):
[1] tools_2.13.0

--

The help page says for  test: 
a character string specifying the test statistic to be used. 
Can be one of F, Chisq or Cp, with partial matching allowed, 
or NULL for no test.

test=Cp is, following the help page, intended to work?  Setting
the scale parameter does not help.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473fax  : +61 2(6125)5549
Centre for Mathematics  Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] anova.lm fails with test=Cp

2011-05-08 Thread peter dalgaard

On May 8, 2011, at 09:25 , John Maindonald wrote:

 Here is an example, modified from the help page to use test=Cp:
 
 
 fit0 - lm(sr ~ 1, data = LifeCycleSavings)
 fit1 - update(fit0, . ~ . + pop15)
 fit2 - update(fit1, . ~ . + pop75)
 anova(fit0, fit1, fit2, test=Cp)
 Error in `[.data.frame`(table, , Resid. Dev) : 
 undefined columns selected

Yes, the Resid. Dev column is only there in analysis of deviance tables. For 
the lm() case, it looks like you should have RSS. 

This has probably been there forever. Just goes to show how often people use 
these things...

Also, now that I'm looking at it, are we calculating it correctly in any case? 
We have

cbind(table, Cp = table[, Resid. Dev] + 2 * scale * 
(n - table[, Resid. Df]))

whereas all the references I can find have Cp=RSS/MS-N+2P, so the above would 
actually be scale*Cp+N. 


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel