Re: [R] How to test omitted level from a multiple level factor against overall mean in regression models?

2012-03-26 Thread Rolf Turner


The test you are requesting is ***MEANINGLESS***.  The ``effect value'' of a
single level is ill-defined (or, in the more usual parlance, not estimable).
The dummy.coef() procedure suggested by Gabor gives you point estimates
*subject to the constraints* imposed by the contrasts used.  The choice of
contrasts is arbitrary, essentially a matter of aesthetics/taste/convenience.
The values returned by dummy.coef() have, in and of themselves, no meaning
at all.

You can meaningfully estimate, and test for the significance of,
*differences* between the effect values of factor levels.  For the
individual levels, no can do.
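
To make the point about differences concrete, here is a minimal sketch using the data from the original post. TukeyHSD() is only one base-R way to test pairwise differences between level means; it is chosen here for illustration and was not proposed in this thread. The object names (fit, tk, level_diffs) are made up.

```r
# Data from the original post (three groups of sizes 3, 6 and 11).
x <- factor(rep(1:3, times = c(3, 6, 11)))
y <- c(1.1, 1.15, 1.2,
       1.1, 1.1, 1.1, 1.2, 1.2, 1.2,
       2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3, 3.1)
test <- data.frame(x, y)

# One-way ANOVA fit; pairwise differences do not depend on the contrast coding.
fit <- aov(y ~ x, data = test)

# Pairwise differences between level means, with family-wise adjusted
# confidence intervals and p-values (Tukey's honest significant differences).
tk <- TukeyHSD(fit)
print(tk)

# Estimated differences, in the row order 2-1, 3-1, 3-2.
level_diffs <- tk$x[, "diff"]
```

Each row of the printed table is a difference between two levels, which is the kind of quantity that can be estimated and tested regardless of the contrasts in use.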


E.g.  Y = mu + alpha_i + E when the observation is at level i of the factor
(and E means random error).  In this setting mu = 0, alpha_1 = 1,
alpha_2 = 2 and alpha_3 = 3 is ***EXACTLY THE SAME MODEL*** as mu = 1,
alpha_1 = 0, alpha_2 = 1 and alpha_3 = 2.

It makes no sense to ask (or to test) whether alpha_1 differs from 0.
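
The same thing can be seen numerically. The sketch below, which assumes the data from the original post, fits the model under two different contrast codings and compares the results. The names fit_sum, fit_trt, a_sum and a_trt are hypothetical.

```r
# Data from the original post.
x <- factor(rep(1:3, times = c(3, 6, 11)))
y <- c(1.1, 1.15, 1.2,
       1.1, 1.1, 1.1, 1.2, 1.2, 1.2,
       2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3, 3.1)
test <- data.frame(x, y)

# Same model, two arbitrary choices of constraint.
fit_sum <- lm(y ~ x, data = test, contrasts = list(x = "contr.sum"))
fit_trt <- lm(y ~ x, data = test, contrasts = list(x = "contr.treatment"))

# The fitted values agree ...
same_fit <- isTRUE(all.equal(fitted(fit_sum), fitted(fit_trt)))

# ... but the "effect values" reported for the individual levels do not.
a_sum <- dummy.coef(fit_sum)[["x"]]
a_trt <- dummy.coef(fit_trt)[["x"]]
print(rbind(contr.sum = a_sum, contr.treatment = a_trt))

# Differences between level effects are the same under either coding.
same_diffs <- isTRUE(all.equal(a_sum - a_sum[1], a_trt - a_trt[1]))
print(c(same_fit = same_fit, same_diffs = same_diffs))
```

Only the quantities that survive the change of coding (fitted values, differences between levels) carry meaning about the data.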

cheers,

Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





[R] How to test omitted level from a multiple level factor against overall mean in regression models?

2012-03-25 Thread Biedermann, Jürgen
Hi there,

I have a linear model with one factor having three levels.
I want to check if the different levels significantly differ from the overall 
mean (using contr.sum).
However one level (the last) is omitted in the standard procedure.

To illustrate this:

x <- as.factor(c(1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3))
y <- c(1.1,1.15,1.2,1.1,1.1,1.1,1.2,1.2,1.2,2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,3,3.1)
test <- data.frame(x,y)
reg1 <- lm(y~C(x,contr.sum),data=test)
summary(reg1)

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)       1.63333    0.06577  24.834 8.48e-15 ***
C(x, contr.sum)1 -0.48333    0.10792  -4.479  0.00033 ***
C(x, contr.sum)2 -0.48333    0.08936  -5.409 4.70e-05 ***

Is it possible to get the effect for the third level (against the overall mean)
in the table too?

I figured out:

reg2 <- lm(y~C(relevel(x,3),contr.sum),data=test)
summary(reg2)

C(relevel(x, 3), contr.sum)1  0.96667    0.07951  12.158 8.24e-10 ***
C(relevel(x, 3), contr.sum)2 -0.48333    0.10792  -4.479  0.00033 ***


The first row now tests the third level against the overall mean, but I find
this approach not so convenient.
Moreover, I wonder whether it is meaningful at all given the accumulation of
alpha error. Would a Bonferroni correction be sensible?
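
For reference, the mechanics of such a correction can be sketched with base R's p.adjust(), using the three p-values reported above. This only shows how the adjustment is computed; whether these three tests are the right ones to run is a separate question. The vector names are hypothetical.

```r
# Unadjusted p-values as reported in the two summary() tables above.
p_raw <- c(level1 = 0.00033, level2 = 4.70e-05, level3 = 8.24e-10)

# Bonferroni multiplies each p-value by the number of tests (capped at 1).
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Holm's step-down method controls the same error rate and is never
# more conservative than Bonferroni.
p_holm <- p.adjust(p_raw, method = "holm")

print(rbind(raw = p_raw, bonferroni = p_bonf, holm = p_holm))
```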

Greetings and thanks in advance
Jürgen


Re: [R] How to test omitted level from a multiple level factor against overall mean in regression models?

2012-03-25 Thread Gabor Grothendieck

Try this:

options(contrasts = c("contr.sum", "contr.poly"))
reg1 <- lm(y~x,data=test)
dummy.coef(reg1)

Full coefficients are

(Intercept):      1.63
x:                   1      2      3
                -0.483 -0.483  0.967
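
dummy.coef() reports point estimates only. Under the sum-to-zero constraint, the third value is minus the sum of the two fitted coefficients, so a standard error can be obtained from vcov(). The following is a minimal sketch under that assumption, with the data from the original post; the names k, est, se, t_value and p_value are made up. Under this coding the quantity equals the level-3 mean minus the unweighted average of the three level means, and it has that interpretation only for this particular choice of contrasts.

```r
# Data from the original post.
x <- factor(rep(1:3, times = c(3, 6, 11)))
y <- c(1.1, 1.15, 1.2,
       1.1, 1.1, 1.1, 1.2, 1.2, 1.2,
       2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3, 3.1)
test <- data.frame(x, y)

fit <- lm(y ~ x, data = test, contrasts = list(x = "contr.sum"))

# Coefficients are (Intercept), x1, x2; with contr.sum, x3 = -(x1 + x2).
k <- c(0, -1, -1)

est <- sum(k * coef(fit))                    # implied value for level 3
se  <- sqrt(drop(t(k) %*% vcov(fit) %*% k))  # its standard error
t_value <- est / se
p_value <- 2 * pt(abs(t_value), df = df.residual(fit), lower.tail = FALSE)

print(c(estimate = est, se = se, t = t_value, p = p_value))
```

These numbers should agree with the first row of the relevel() fit in the original post, since both are the same linear function of the same fitted model.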

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] How to test omitted level from a multiple level factor against overall mean in regression models?

2012-03-25 Thread Biedermann, Jürgen
Hi Gabor,

Thanks a lot for the answer.
However, I'm not so much interested in the pure effect value of the omitted
factor level as in a statistical test of whether it differs significantly
from 0.
Do you know a way to do this too?

Greetings Jürgen
