Re: [R-sig-eco] Regression with few observations per factor level

V. Coudrain Mon, 20 Oct 2014 09:04:40 -0700

Yes, but as I feared, the residuals behave badly as soon as the model gets a 
little more complex (e.g., with two covariates or an interaction). The 
scope for performing an ANCOVA is thus very limited. That is why I was thinking 
about a potential non-parametric model. But I do not want to artificially make 
my data say something they cannot.
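
A minimal sketch of why more complex models become fragile here, assuming 4 treatment levels with 4 observations each. The data are simulated stand-ins, and the names `d`, `y`, `trt` and `cov` are hypothetical. It simply counts the residual degrees of freedom left by three candidate models.

```r
# Simulated stand-in data: 4 treatment levels, 4 observations each (assumed layout).
set.seed(1)
pg <- 4
d <- data.frame(
  y   = rnorm(pg * 4),
  trt = factor(rep(c("trt1", "trt2", "trt3", "trt4"), each = pg)),
  cov = runif(pg * 4)
)

# Candidate models of increasing complexity.
fits <- list(
  anova       = lm(y ~ trt, data = d),
  ancova      = lm(y ~ trt + cov, data = d),
  interaction = lm(y ~ trt * cov, data = d)
)

# Residual degrees of freedom = observations minus estimated coefficients.
res_df <- sapply(fits, df.residual)
print(res_df)
```

With 16 observations, the interaction model estimates 8 coefficients and leaves only 8 residual degrees of freedom, so residual diagnostics have very little information to work with.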





> Message of 20/10/14 at 16:50
> From: "stephen sefick" 
> To: "Martin Weiser" 
> Cc: "V. Coudrain" , "r-sig-ecology" 
> Subject: Re: [R-sig-eco] Regression with few observations per factor level
> 
> You are more or less performing an ANOVA/ANCOVA on your data? As pointed out 
> earlier, all of the normal-theory regression assumptions apply. Assuming all 
> of those are satisfied, if you have large confidence intervals and there are 
> still significant differences between groups, I don't see why you couldn't 
> correctly infer something about the treatments. Maybe I am missing something.
> Stephen 
> On Mon, Oct 20, 2014 at 8:43 AM, Martin Weiser  wrote:
> Hi,
> 
> coefficients and their p-values are reliable if your data are OK and you
> know enough about the process that generated them to choose an
> appropriate model. With 4 points per line, it may be really difficult to
> identify a bad fit or outliers.
> 
> For example: simple linear regression assumes that the residuals are drawn
> from a normal distribution whose variance is constant along the
> regression line. With 4 points, you can hardly check this, but if you
> know enough about the process that generated the data, you are safe.
> If you do not, it is not easy to say anything about the nature of the
> process that generated the data.
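
A minimal sketch of how poorly a variance is pinned down by 4 points, assuming independent normal observations within a single group. The helper `var_ci_ratio` is a hypothetical name. It returns the ratio of the upper to the lower limit of the standard chi-square confidence interval for a variance.

```r
# Ratio of the upper to the lower 95% confidence limit for a variance
# estimated from n normal observations (chi-square interval, n - 1 df).
var_ci_ratio <- function(n, level = 0.95) {
  alpha <- 1 - level
  df <- n - 1
  # Limits are df * s^2 / qchisq(...); s^2 and df cancel in the ratio.
  qchisq(1 - alpha / 2, df) / qchisq(alpha / 2, df)
}

ratio_n4  <- var_ci_ratio(4)
ratio_n30 <- var_ci_ratio(30)
print(c(n4 = ratio_n4, n30 = ratio_n30))
```

With 4 observations the upper limit is more than 40 times the lower limit; with 30 it is less than 3 times. A regression line fitted through 4 points has one degree of freedom fewer still, so the situation there is worse.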
> 
> If you know (or can assume) that there is a simple linear relationship,
> you can say: "the slope of this relationship is such and such", but if you
> want to estimate both the nature of the relationship ("A *linearly*
> depends on B") and its magnitude ("the slope of this relationship
> is ..."), p-values will not help you much.
> 
> Of course, I may be wrong - I am not a statistician, just a user.
> 
> Best,
> Martin W.
> 
> 
> V. Coudrain wrote on Mon, 20 Oct 2014 at 13:37 +0200:
> > Thank you very much. If I get it right, the CIs get wider, my test has less 
> > power and the probability of detecting a significant relationship decreases. 
> > What about the significant coefficients, are they reliable?
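
A minimal sketch of the loss of power, using `power.anova.test` from the stats package for a balanced one-way layout. The variance values are purely illustrative assumptions, and the covariate is ignored.

```r
# Assumed scenario: 4 groups, variance of the group means = 1,
# within-group (error) variance = 1. Both values are illustrative only.
power_n4 <- power.anova.test(groups = 4, n = 4,
                             between.var = 1, within.var = 1)$power
power_n30 <- power.anova.test(groups = 4, n = 30,
                              between.var = 1, within.var = 1)$power
print(c(n4 = power_n4, n30 = power_n30))
```

Changing `between.var` and `within.var` to values plausible for the study at hand shows how large an effect 4 observations per group could realistically detect.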
> >
> >
> >
> >
> > > > Message of 20/10/14 at 11:30
> > > > From: "Roman Luštrik"
> > > > To: "V. Coudrain"
> > > > Cc: "r-sig-ecology@r-project.org"
> > > > Subject: Re: [R-sig-eco] Regression with few observations per factor level
> > >
> > > > I think you can, but the confidence intervals will be rather wide due to 
> > > > the small number of samples.
> > > > Notice how the standard errors change when the sample size (per group) 
> > > > goes from 4 to 30.
> > > > > pg <- 4 # pg = per group
> > > > > my.df <- data.frame(var = c(rnorm(pg, mean = 3), rnorm(pg, mean = 1),
> > > > +                             rnorm(pg, mean = 11), rnorm(pg, mean = 30)),
> > > > +                     trt = rep(c("trt1", "trt2", "trt3", "trt4"), each = pg),
> > > > +                     cov = runif(pg*4)) # 4 groups
> > > > > summary(lm(var ~ trt + cov, data = my.df))
> > > >
> > > > Call:
> > > > lm(formula = var ~ trt + cov, data = my.df)
> > > >
> > > > Residuals:
> > > >      Min       1Q   Median       3Q      Max
> > > > -1.63861 -0.46080  0.03332  0.66380  1.27974
> > > >
> > > > Coefficients:
> > > >             Estimate Std. Error t value Pr(>|t|)
> > > > (Intercept)   1.2345     1.0218   1.208    0.252
> > > > trttrt2      -0.7759     0.8667  -0.895    0.390
> > > > trttrt3       7.8503     0.8308   9.449  1.3e-06 ***
> > > > trttrt4      28.2685     0.9050  31.236  4.3e-12 ***
> > > > cov           1.4027     1.1639   1.205    0.253
> > > > ---
> > > > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> > > >
> > > > Residual standard error: 1.154 on 11 degrees of freedom
> > > > Multiple R-squared:  0.9932, Adjusted R-squared:  0.9908
> > > > F-statistic: 404.4 on 4 and 11 DF,  p-value: 7.467e-12
> > > >
> > > > > pg <- 30 # pg = per group
> > > > > my.df <- data.frame(var = c(rnorm(pg, mean = 3), rnorm(pg, mean = 1),
> > > > +                             rnorm(pg, mean = 11), rnorm(pg, mean = 30)),
> > > > +                     trt = rep(c("trt1", "trt2", "trt3", "trt4"), each = pg),
> > > > +                     cov = runif(pg*4)) # 4 groups
> > > > > summary(lm(var ~ trt + cov, data = my.df))
> > > >
> > > > Call:
> > > > lm(formula = var ~ trt + cov, data = my.df)
> > > >
> > > > Residuals:
> > > >     Min      1Q  Median      3Q     Max
> > > > -2.5778 -0.6584 -0.0185  0.6423  3.2077
> > > >
> > > > Coefficients:
> > > >             Estimate Std. Error t value Pr(>|t|)
> > > > (Intercept)  2.76961    0.25232  10.977  < 2e-16 ***
> > > > trttrt2     -1.75490    0.28546  -6.148 1.17e-08 ***
> > > > trttrt3      8.40521    0.28251  29.752  < 2e-16 ***
> > > > trttrt4     27.04095    0.28286  95.599  < 2e-16 ***
> > > > cov          0.05129    0.32523   0.158    0.875
> > > > ---
> > > > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> > > >
> > > > Residual standard error: 1.094 on 115 degrees of freedom
> > > > Multiple R-squared:  0.9913, Adjusted R-squared:  0.991
> > > > F-statistic:  3269 on 4 and 115 DF,  p-value: < 2.2e-16
> > > >
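
The change in the standard errors follows roughly from the usual formula for a difference between two group means in a balanced design, sigma * sqrt(2 / n). A minimal sketch, ignoring the covariate and plugging in the residual standard errors reported in the two summaries; the helper name `se_diff` is hypothetical.

```r
# Approximate standard error of a difference between two group means
# in a balanced design: sigma * sqrt(2 / n), ignoring the covariate.
se_diff <- function(sigma, n) sigma * sqrt(2 / n)

# Residual standard errors taken from the two model summaries (1.154 and 1.094).
se_n4  <- se_diff(sigma = 1.154, n = 4)
se_n30 <- se_diff(sigma = 1.094, n = 30)
print(round(c(n4 = se_n4, n30 = se_n30), 3))
```

These approximations (about 0.82 and 0.28) sit close to the reported standard errors of the treatment coefficients (0.83 to 0.91 with 4 per group, 0.28 to 0.29 with 30 per group); adjusting for the covariate accounts for the small excess.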
> > > On Mon, Oct 20, 2014 at 10:53 AM, V. Coudrain  wrote:
> > > > Hi, I would like to test the impact of a treatment on some variable using 
> > > > regression (e.g. lm(var ~ trt + cov)). However, I only have four 
> > > > observations per factor level. Is it still possible to apply a regression 
> > > > with such a small sample size? I think that it should be difficult to 
> > > > correctly estimate the variance. Do you think that I should rather use a 
> > > > non-parametric test such as Kruskal-Wallis? However, I need to include 
> > > > covariates in my models and I am not sure whether basic non-parametric tests 
> > > > are suitable for this. Thanks for any suggestions.
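
Kruskal-Wallis (`kruskal.test`) takes a single grouping factor and has no place for a covariate. One commonly used alternative, which nobody in this thread proposed and which is shown here only as a hedged sketch, is a permutation test of the treatment term that permutes the residuals of the covariate-only model (the Freedman-Lane scheme). The data are simulated, and the names `d`, `y`, `trt_F` and `n_perm` are hypothetical.

```r
# Simulated stand-in data, mirroring the 4 groups x 4 observations layout.
set.seed(42)
pg <- 4
d <- data.frame(
  y   = c(rnorm(pg, 3), rnorm(pg, 1), rnorm(pg, 11), rnorm(pg, 30)),
  trt = factor(rep(c("trt1", "trt2", "trt3", "trt4"), each = pg)),
  cov = runif(pg * 4)
)

# F statistic for the treatment term, adjusted for the covariate.
trt_F <- function(dat) {
  reduced <- lm(y ~ cov, data = dat)
  full    <- lm(y ~ cov + trt, data = dat)
  anova(reduced, full)$F[2]
}

obs_F <- trt_F(d)

# Freedman-Lane scheme: permute residuals of the reduced (covariate-only) model
# and add them back to its fitted values to build each permuted response.
reduced <- lm(y ~ cov, data = d)
n_perm <- 999
perm_F <- replicate(n_perm, {
  d_star <- d
  d_star$y <- fitted(reduced) + sample(resid(reduced))
  trt_F(d_star)
})

# Permutation p-value, counting the observed statistic as one permutation.
p_perm <- (sum(perm_F >= obs_F) + 1) / (n_perm + 1)
print(p_perm)
```

A permutation test relaxes the normality assumption but still relies on exchangeable errors, and it cannot add information that 4 observations per level do not contain.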
> > >
> > > _______________________________________________
> > > R-sig-ecology mailing list
> > > R-sig-ecology@r-project.org
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> > >
> > >
> >
> > > --
> > > In God we trust, all others bring data.
> >
> 
> 
> 
> 
> --
> 
> ------------------------------
> Pokud je tento e-mail součástí obchodního jednání, Přírodovědecká fakulta
> Univerzity Karlovy v Praze:
> a) si vyhrazuje právo jednání kdykoliv ukončit a to i bez uvedení důvodu,
> b) stanovuje, že smlouva musí mít písemnou formu,
> c) vylučuje přijetí nabídky s dodatkem či odchylkou,
> d) stanovuje, že smlouva je uzavřena teprve výslovným dosažením shody na
> všech náležitostech smlouvy.
> 
> 
> 

> -- 
> Stephen Sefick
> **************************************************
> Auburn University                                         
> Biological Sciences                                      
> 331 Funchess Hall                                       
> Auburn, Alabama                                        
> 36849                                                           
> **************************************************
> sas0...@auburn.edu                                  
> http://www.auburn.edu/~sas0025                 
> **************************************************
> 
> Let's not spend our time and resources thinking about things that are so 
> little or so large that all they really do for us is puff us up and make us 
> feel like gods.  We are mammals, and have not exhausted the annoying little 
> problems of being mammals.
> 
>                                 -K. Mullis
> 
> "A big computer, a complex algorithm and a long time does not equal science."
> 
>                               -Robert Gentleman
> 
> 
