Re: [R] Doing partial-f test for stepwise regression

2007-04-01 Thread Frank E Harrell Jr
Petr Klasterecky wrote:
> And what about reading the help page ?anova ...?
> 
>  >>>
> When given a sequence of objects, 'anova' tests the models against
>   one another in the order specified.
> <<<
> 
> Generally you almost never fit a full model (including all possible 
> interactions etc) - no one can interpret such complicated models. Anova 
> gives you a comparison between a broader model (the first argument to 
> anova) and its submodel(s).

True, you might not fit a model with high-order interactions, but the 
full pre-specified model is the only one whose standard errors and test 
statistics work as advertised.
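
A minimal simulation sketch of this point, under stated assumptions: every predictor is pure noise, and "selection" is simplified to picking the single predictor with the smallest p-value (this is not a full stepwise procedure). The sample size, number of predictors, number of replications and the seed are all arbitrary choices made for illustration.

```r
# All predictors are unrelated to y, so every rejection is a false positive.
set.seed(1)      # arbitrary seed, for reproducibility only
n_sims <- 500    # simulated data sets (assumed)
n_obs  <- 50     # observations per data set (assumed)
n_pred <- 10     # candidate noise predictors (assumed)

# p-value of the "best" predictor, chosen after looking at the data
best_p <- replicate(n_sims, {
  x <- matrix(rnorm(n_obs * n_pred), n_obs, n_pred)
  y <- rnorm(n_obs)
  p <- apply(x, 2, function(xj) {
    summary(lm(y ~ xj))$coefficients[2, 4]   # p-value for the slope
  })
  min(p)
})

# p-value of a single predictor that was specified in advance
fixed_p <- replicate(n_sims, {
  x1 <- rnorm(n_obs)
  y  <- rnorm(n_obs)
  summary(lm(y ~ x1))$coefficients[2, 4]
})

selected_rate <- mean(best_p < 0.05)    # rejection rate after selection
fixed_rate    <- mean(fixed_p < 0.05)   # rejection rate, pre-specified term
print(c(selected = selected_rate, prespecified = fixed_rate))
```

Under these assumptions the pre-specified test should reject at roughly the nominal 5% rate, while the selected predictor rejects far more often, which is one way to see why statistics computed after selection do not behave as advertised.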

Frank


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Doing partial-f test for stepwise regression

2007-04-01 Thread rolf
Petr Klasterecky <[EMAIL PROTECTED]> wrote:

> And what about reading the help page ?anova ...?
> 
>  >>>
> When given a sequence of objects, 'anova' tests the models against
>   one another in the order specified.
> <<<

One perfectly reasonable response to ``what about'' is
that it is not *at all* clear as to what the statement
in the help page actually means.

> Generally you almost never fit a full model (including all possible 
> interactions etc) - no one can interpret such complicated models.

This assertion is certainly open to some dispute.

> Anova gives you a comparison between a broader model (the first
> argument to anova) and its submodel(s).

As I read the above statement, it seems you've got
it exactly backwards.  Broader model = full model,
submodel = model under the null hypothesis, is it
not so?

You should actually specify the ``reduced'' model (the model
under the null hypothesis) first, and the full model second.
Here is what happens with the full model first:

 > y <- runif(20)
 > x1 <- runif(20)
 > x2 <- runif(20)
 > x3 <- runif(20)
 > f1 <- lm(y~x1+x2+x3)
 > f2 <- lm(y~x1)
 > anova(f1,f2)
Analysis of Variance Table

Model 1: y ~ x1 + x2 + x3
Model 2: y ~ x1
  Res.Df     RSS Df Sum of Sq      F Pr(>F)
1     16 0.93225
2     18 1.07998 -2  -0.14774 1.2678 0.3083

Doing it your way (full model first) gives a negative
sum of squares and negative degrees of freedom for the
effect being tested.

Not that it really matters: the anova() function
gives you the same F statistic and p-value either way.
But the negative SS is a dead giveaway that something
is a bit skew-whiff.
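
For comparison, a short sketch with the reduced model given first. The variables are simulated as in the example above; the seed and the object names full, reduced, a_rf and a_fr are arbitrary choices for illustration.

```r
set.seed(42)   # arbitrary seed so the run is repeatable
y  <- runif(20)
x1 <- runif(20)
x2 <- runif(20)
x3 <- runif(20)

full    <- lm(y ~ x1 + x2 + x3)
reduced <- lm(y ~ x1)

a_rf <- anova(reduced, full)   # reduced first: Df and Sum of Sq are positive
a_fr <- anova(full, reduced)   # full first: both are negative
print(a_rf)

# The F statistic and p-value do not depend on the order
print(all.equal(a_rf$F[2], a_fr$F[2]))
print(all.equal(a_rf$`Pr(>F)`[2], a_fr$`Pr(>F)`[2]))
```

With the reduced model first, the Df column reads 2 and the sum of squares is the (positive) reduction in residual sum of squares obtained by adding x2 and x3.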

cheers,

Rolf Turner
[EMAIL PROTECTED]



Re: [R] Doing partial-f test for stepwise regression

2007-04-01 Thread Michael Kubovy
On Apr 1, 2007, at 1:54 AM, [EMAIL PROTECTED] wrote:

> Hello all,
> I am trying to find an optimal linear model using stepwise regression,
> which requires a partial F-test. I did some Googling and found that
> someone has asked this question before:
>
> Jim Milks <[EMAIL PROTECTED]> writes:
>> Dear all:
>>
>> I have a regression model that has collinearity problems (between
>> three regressor variables). I need an F-test that will allow me to
>> compare between full (with all variables) and partial models (minus
>> one or more variables). The general F-test formula I'm using is:
>>
>> F = {[SS(full model) - SS(reduced model)] / (#variables taken out)} /
>> MSS(full model)
>>
>> Unfortunately, the ANOVA table parses the SS and MSS between the
>> variables and does not give the statistics for the regression model as
>> a whole, otherwise I'd do this by hand.
>>
>> So, really, I have two questions: 1) Can I just add up all the SS and
>> MSS for all the variables to get the model SS and MSS and 2) Are
>> there any functions or packages I can use to calculate the F-statistic?
>>
>> Just use anova(model1, model2).
>> (One potential catch: Make sure that both models are fitted to the same
>> data set. Missing values in predictors may interfere.)
>
> However, in the answer provided by Mr. Peter Dalgaard (use
> anova(model1, model2)), I could not understand what model1 and model2 are
> supposed to refer to: which one is supposed to be the full model and
> which one the partial model? Or does it not matter?

You can tell which is which by looking at the degrees of freedom.
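
A small sketch of that check, on simulated data. The data frame d, the seed and the model names are hypothetical and only serve the illustration.

```r
set.seed(7)   # arbitrary seed
d <- data.frame(y = rnorm(30), x1 = rnorm(30), x2 = rnorm(30))

model1 <- lm(y ~ x1, data = d)
model2 <- lm(y ~ x1 + x2, data = d)

# Fewer residual degrees of freedom means more estimated coefficients,
# so the model with the smaller value is the fuller one.
print(df.residual(model1))   # 30 observations - 2 coefficients = 28
print(df.residual(model2))   # 30 observations - 3 coefficients = 27

tab <- anova(model1, model2)
print(tab$Res.Df)                   # same values, in the order given to anova()
full_idx <- which.min(tab$Res.Df)   # row number of the fuller model
print(full_idx)
```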

_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS:     P.O. Box 400400, Charlottesville, VA 22904-4400
Parcels:  Room 102, Gilmer Hall
          McCormick Road, Charlottesville, VA 22903
Office:   B011    +1-434-982-4729
Lab:      B019    +1-434-982-4751
Fax:      +1-434-982-4766
WWW:      http://www.people.virginia.edu/~mk9y/



Re: [R] Doing partial-f test for stepwise regression

2007-03-31 Thread Petr Klasterecky
And what about reading the help page ?anova ...?

 >>>
When given a sequence of objects, 'anova' tests the models against
  one another in the order specified.
<<<
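
As a sketch of what "in the order specified" looks like in practice, here is a sequence of three nested models on simulated data. The data frame d, the coefficients used to generate y and the seed are assumptions made purely for illustration.

```r
set.seed(2007)   # arbitrary seed
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
d$y <- 2 + d$x1 + 0.5 * d$x2 + rnorm(50)

m1 <- lm(y ~ x1, data = d)
m2 <- lm(y ~ x1 + x2, data = d)
m3 <- lm(y ~ x1 + x2 + x3, data = d)

# Row 2 compares m2 with m1, row 3 compares m3 with m2.
# For lm fits the F denominators use the residual mean square of the
# largest model in the sequence (m3 here).
tab <- anova(m1, m2, m3)
print(tab)
```

Each row after the first reports the change in residual degrees of freedom and residual sum of squares relative to the row above it.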

Generally you almost never fit a full model (including all possible 
interactions etc) - no one can interpret such complicated models. Anova 
gives you a comparison between a broader model (the first argument to 
anova) and its submodel(s).

Petr

-- 
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic



[R] Doing partial-f test for stepwise regression

2007-03-31 Thread zhuanyi
Hello all,
I am trying to find an optimal linear model using stepwise regression,
which requires a partial F-test. I did some Googling and found that
someone has asked this question before:

Jim Milks <[EMAIL PROTECTED]> writes: 
> Dear all: 
> 
> I have a regression model that has collinearity problems (between 
> three regressor variables). I need an F-test that will allow me to 
> compare between full (with all variables) and partial models (minus 
> one or more variables). The general F-test formula I'm using is: 
> 
> F = {[SS(full model) - SS(reduced model)] / (#variables taken out)} / 
> MSS(full model) 
> 
> Unfortunately, the ANOVA table parses the SS and MSS between the 
> variables and does not give the statistics for the regression model as 
> a whole, otherwise I'd do this by hand. 
> 
> So, really, I have two questions: 1) Can I just add up all the SS and 
> MSS for all the variables to get the model SS and MSS and 2) Are 
> there any functions or packages I can use to calculate the F-statistic? 
> 
> Just use anova(model1, model2). 
> (One potential catch: Make sure that both models are fitted to the same
> data set. Missing values in predictors may interfere.) 

However, in the answer provided by Mr. Peter Dalgaard (use
anova(model1, model2)), I could not understand what model1 and model2 are
supposed to refer to: which one is supposed to be the full model and
which one the partial model? Or does it not matter?
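
For reference, a minimal sketch of the partial F-test computed by hand and checked against anova(). It assumes the quoted formula is read in terms of residual sums of squares, that is, F = [(RSS_reduced - RSS_full) / q] / [RSS_full / df_full], where q is the number of terms dropped and df_full is the residual degrees of freedom of the full model. The simulated data, the seed and the object names are hypothetical.

```r
set.seed(123)   # arbitrary seed
d <- data.frame(x1 = rnorm(40), x2 = rnorm(40), x3 = rnorm(40))
d$y <- 1 + 0.5 * d$x1 + rnorm(40)

full    <- lm(y ~ x1 + x2 + x3, data = d)
reduced <- lm(y ~ x1, data = d)

rss_full    <- sum(resid(full)^2)       # residual SS, full model
rss_reduced <- sum(resid(reduced)^2)    # residual SS, reduced model
q           <- df.residual(reduced) - df.residual(full)   # terms dropped
mse_full    <- rss_full / df.residual(full)               # residual mean square

F_hand <- ((rss_reduced - rss_full) / q) / mse_full
p_hand <- pf(F_hand, q, df.residual(full), lower.tail = FALSE)

tab <- anova(reduced, full)   # reduced model first, full model second
print(c(by_hand = F_hand, anova = tab$F[2]))
print(c(by_hand = p_hand, anova = tab$`Pr(>F)`[2]))
```

Both models are fitted to the same data frame here, so the comparison is not affected by the missing-value catch mentioned in the quoted answer.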

Thanks in advance for help from anyone!

Regards,
Anyi Zhu
