[R] ANCOVA

2007-01-06 Thread John Cardinale
Hello,

I am analyzing a data set that has possible covariates.  It is also
multivariate.  Are there any R function which can do analysis of covariance?

Thanks.

John Cardinale

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANCOVA

2007-01-06 Thread Michael Kubovy
On Jan 6, 2007, at 8:34 AM, John Cardinale wrote:

 Are there any R function which can do analysis of covariance?

?lm
RSiteSearch('ancova')
_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O.Box 400400Charlottesville, VA 22904-4400
Parcels:Room 102Gilmer Hall
 McCormick RoadCharlottesville, VA 22903
Office:B011+1-434-982-4729
Lab:B019+1-434-982-4751
Fax:+1-434-982-4766
WWW:http://www.people.virginia.edu/~mk9y/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANCOVA

2007-01-06 Thread Ramon Diaz-Uriarte
On 1/6/07, Michael Kubovy [EMAIL PROTECTED] wrote:
 On Jan 6, 2007, at 8:34 AM, John Cardinale wrote:

  Are there any R function which can do analysis of covariance?

 ?lm
 RSiteSearch('ancova')



Given the question, you'll probably need to find how to do an ancova
with lm. Several documents in
http://cran.r-project.org/other-docs.html will show you how (and why
ancova is just one special case of linear model). In particular, I
think Faraway's Practical regression and Anova using R has explicit
chapters/sections for Ancova. Many of other standard texts on R/S do
too.

R.






 _
 Professor Michael Kubovy
 University of Virginia
 Department of Psychology
 USPS: P.O.Box 400400Charlottesville, VA 22904-4400
 Parcels:Room 102Gilmer Hall
  McCormick RoadCharlottesville, VA 22903
 Office:B011+1-434-982-4729
 Lab:B019+1-434-982-4751
 Fax:+1-434-982-4766
 WWW:http://www.people.virginia.edu/~mk9y/

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANCOVA

2007-01-06 Thread Richard M. Heiberger
Please look at the help file
?ancova
in the HH package.

Please get HH_1.17 which was up on CRAN yesterday.  The source
file has propagated to the mirrors.  As of a few minutes ago,
the Windows binary is on cran.at.r-project.org but not yet at the mirrors.

Be sure to have history recording on in the graphics window
because it will be very helpful to toggle between the displays
for the different models.

Rich

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANCOVA in S-plus/R?

2006-06-02 Thread Cindy Yang
Dear R user:
   
  I have a question about doing ANCOVA in S-plus or R. 
   
  I know that many users use lm to do the regression and check the ANCOVA. But 
is there a way to get the traditional Table form of the ANCOVA test through 
S-plus (like what we would get from SPSS or SAS)?
   
  The problem I’m interested in is whether or not there is a treatment effect 
on some medical measurement. I will have the endpoint (E) and the baseline (B) 
of the measurement,   together with a dummy variable (T) of what treatment the 
patients get (control, or new). I’m thinking of using lm( (E-B)  ~ B + T, data 
= ..)  
   
  Question 1: if I get the regression results (E-B) = b0 + b1*B + b2*T and in 
order to check whether or not there is a treatment effect, all I need to do is 
to check whether the p-value associated with b2 is less than the significant 
level I set or there’s more I need to do?
   
  Question 2: If I put  anova( lm( (E-B)  ~ B + T, data = ..) ), is the table I 
get here the one I want for an ANCOVA? My concern is that ANCOVA is supposed to 
work only on residual compared to ANOVA, but we are using the same command 
here. If this gives me the ANCOVA table, does the order of the variables in 
regression matter? Will 
  anova( lm( (E-B)  ~  T + B, data = ..) ) gives the same result?
   
  Question 3: if I have another dummy variable to indicate the location of 
hospital each patient goes to (L)  and want to include it to my model, should I 
do
   lm( (E-B)  ~ B + T + L, data = ..)   or  I need to consider the interactions?
   
  Thanks a lot!
   
  Cindy

 __



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] ANCOVA in S-plus/R?

2006-06-02 Thread Richard M. Heiberger
To get the ANCOVA table you need to look at the anova() function.

The variables T and L must be factors to get the multi-degree-of-freedom
anova tables you are looking for.

Order matters.  You get the same residual, but the sequential sums of 
squares differ.

bt.aov - aov(E ~ B + T)
anova(bt.aov)
gives you the (t-1)df test of common intercept vs variable intercept.
This is equivalent to comparing the adjusted means.


tb.aov - aov(E ~ T + B)
anova(tb.aov)
gives you the 1df test of 0 slope (ordinary ANOVA) vs non-zero common slope.

lm() and aov() are essentially the same program.  The default output
is different.


When you bring in the blocking effect of hospital, you will most
likely want that sequentially first.

lbt.aov - aov(E ~ L + B + T)
anova(lbt.aov)

If you want to test for interaction with the hospital effect, you can use

l.bt.aov - aov(E ~ L * (B + T))
anova(l.bt.aov)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ANCOVA, Ops.factor, singular fit???

2006-05-20 Thread Peter Dalgaard
[EMAIL PROTECTED] writes:

 I'm trying to perform ANCOVAs in R 1.14, on a Mac OS X, but I can't figure out

?! There's no version 1.14 of R.

 what I am doing wrong.  Essentially, I'm testing whether a number of
 quantitative dental measurements (the response variables in each ANCOVA) show
 sexual dimorphism (the sexes are the groups) independently of the animal's 
 size
 (the concomitant variable).  I have attached a 13-column matrix as a data 
 frame
 (so far, so good).  But then I tried to do this:
 
 model-lm(ln2~sex*ln1)
 
 or this:
 
 model-lm(ln2~sex+ln1)
 
 and got this:
 
 Warning message:
 - not meaningful for factors in: Ops.factor(y, z$residuals)
 
 which I don't understand.  (In my matrix, ln2 is the name of the second 
 column,
 a response variable, and ln1 is the name of the first column, a concomitant
 variable.  Sex is the rightmost column, indicating sex.  The first 14 rows are
 measurements for male individuals, and the next 13 rows are measurements for
 female individuals.)
 
 The data output is bizarre, too--it's just so long, and everything begins with
 ln 11 or ln 12.  How can I fix this?

My best guess is that you have a data error so that ln1 and ln2 are
not read as numeric variables. Nonstandard codes for missing will do
that to you, for instance.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ANCOVA, Ops.factor, singular fit???

2006-05-20 Thread Marc Schwartz
On Sat, 2006-05-20 at 08:36 +0200, Peter Dalgaard wrote:
 [EMAIL PROTECTED] writes:
 
  I'm trying to perform ANCOVAs in R 1.14, on a Mac OS X, but I can't figure 
  out
 
 ?! There's no version 1.14 of R.

Looking at the Mac page on CRAN, I suspect that this is the GUI version
number, rather than the R version number.

This has confused me as well in prior posts.  I don't use a Mac, so am
unclear as to how this particular version number is obtained. Is there a
Help - About type of menu on the Mac GUI where this value is displayed?

Not sure if this would be considered user error, or if the GUI is
unclear as to differing version numbering between R and the other
components on Macs.

  what I am doing wrong.  Essentially, I'm testing whether a number of
  quantitative dental measurements (the response variables in each ANCOVA) 
  show
  sexual dimorphism (the sexes are the groups) independently of the animal's 
  size
  (the concomitant variable).  I have attached a 13-column matrix as a data 
  frame

This may be the problem. If the data source is a matrix, which can of
course only contain a single data type, the inclusion of a gender
column, which is presumably textual, will force all columns to text in
the matrix and then to factors in the data frame:

 ln1 - 1:5
 ln2 - 1:5
 sex - c(Male, Female, Female, Male, Male)

 mat - cbind(ln1, sex, ln2)

 mat
 ln1 sex  ln2
[1,] 1 Male   1
[2,] 2 Female 2
[3,] 3 Female 3
[4,] 4 Male   4
[5,] 5 Male   5

 str(mat)
 chr [1:5, 1:3] 1 2 3 4 5 Male Female Female Male ...
 - attr(*, dimnames)=List of 2
  ..$ : NULL
  ..$ : chr [1:3] ln1 sex ln2


Note that all columns in 'mat' are now character vectors.


 DF - as.data.frame(mat)

 DF
  ln1sex ln2
1   1   Male   1
2   2 Female   2
3   3 Female   3
4   4   Male   4
5   5   Male   5

 str(DF)
`data.frame':   5 obs. of  3 variables:
 $ ln1: Factor w/ 5 levels 1,2,3,4,..: 1 2 3 4 5
 $ sex: Factor w/ 2 levels Female,Male: 2 1 1 2 2
 $ ln2: Factor w/ 5 levels 1,2,3,4,..: 1 2 3 4 5


Note that all columns are factors, the default behavior of converting
character vectors into a data frame.


Now the model with interactions:

 model - lm(ln2 ~ sex * ln1, data = DF)
Warning message:
- not meaningful for factors in: Ops.factor(y, z$residuals)

 summary(model)

Call:
lm(formula = ln2 ~ sex * ln1, data = DF)

Residuals:
ALL 5 residuals are 0: no residual degrees of freedom!

Coefficients: (5 not defined because of singularities)
 Estimate Std. Error t value Pr(|t|)
(Intercept) 3 NA  NA   NA
sexMale-2 NA  NA   NA
ln12   -1 NA  NA   NA
ln13   NA NA  NA   NA
ln143 NA  NA   NA
ln154 NA  NA   NA
sexMale:ln12   NA NA  NA   NA
sexMale:ln13   NA NA  NA   NA
sexMale:ln14   NA NA  NA   NA
sexMale:ln15   NA NA  NA   NA

Residual standard error: NA on 0 degrees of freedom
Multiple R-Squared:NA,  Adjusted R-squared:NA
F-statistic:NA on 4 and 0 DF,  p-value: NA

Warning message:
^ not meaningful for factors in: Ops.factor(r, 2)



The long output that you refer to is presumably the multitude of terms
and interactions at the various levels of the three factors.

How did you read in the data initially?  You need to be careful to
maintain the data types, as Peter notes below. Reviewing the R
Import/Export Manual may be helpful.

HTH,

Marc Schwartz

  (so far, so good).  But then I tried to do this:
  
  model-lm(ln2~sex*ln1)
  
  or this:
  
  model-lm(ln2~sex+ln1)
  
  and got this:
  
  Warning message:
  - not meaningful for factors in: Ops.factor(y, z$residuals)
  
  which I don't understand.  (In my matrix, ln2 is the name of the second 
  column,
  a response variable, and ln1 is the name of the first column, a concomitant
  variable.  Sex is the rightmost column, indicating sex.  The first 14 rows 
  are
  measurements for male individuals, and the next 13 rows are measurements for
  female individuals.)
  
  The data output is bizarre, too--it's just so long, and everything begins 
  with
  ln 11 or ln 12.  How can I fix this?
 
 My best guess is that you have a data error so that ln1 and ln2 are
 not read as numeric variables. Nonstandard codes for missing will do
 that to you, for instance.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] ANCOVA, Ops.factor, singular fit???

2006-05-19 Thread rebecca . sealfon
I'm trying to perform ANCOVAs in R 1.14, on a Mac OS X, but I can't figure out
what I am doing wrong.  Essentially, I'm testing whether a number of
quantitative dental measurements (the response variables in each ANCOVA) show
sexual dimorphism (the sexes are the groups) independently of the animal's size
(the concomitant variable).  I have attached a 13-column matrix as a data frame
(so far, so good).  But then I tried to do this:

model-lm(ln2~sex*ln1)

or this:

model-lm(ln2~sex+ln1)

and got this:

Warning message:
- not meaningful for factors in: Ops.factor(y, z$residuals)

which I don't understand.  (In my matrix, ln2 is the name of the second column,
a response variable, and ln1 is the name of the first column, a concomitant
variable.  Sex is the rightmost column, indicating sex.  The first 14 rows are
measurements for male individuals, and the next 13 rows are measurements for
female individuals.)

The data output is bizarre, too--it's just so long, and everything begins with
ln 11 or ln 12.  How can I fix this?

I have another question: If possible I would like to use a robust fit algorithm
to fit the data.  When I attempt to do this, substituting rlm() for lm(), the
program returns another message:

Error in rlm.default(x, y, weights, method = method, wt.method = wt.method,  :
'x' is singular: singular fits are not implemented in rlm

What is a singular fit and why is it, apparently, undesirable?  I am new to R
and any help would be greatly appreciated.

Thanks so much,
Rebecca Sealfon

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ANCOVA with random factor

2006-03-20 Thread Spencer Graves
  I don't understand:  I just ran the first example in the aov help 
page, and it produced F ratios and p values.

  If this does not answer your question, I suggest you read Pinheiro 
and Bates (2000) Mixed-Effects Models in S and S-Plus (Springer) if you 
haven't already and try to use lme for what you want.  If all your 
applications are perfectly balanced, then aov may be fine for you. 
However, the function aov codified routine practice for this kind of 
problem as of ~1980.  However, there have been substantive developments 
in this area since then, and Pinheiro and Bates is the best reference I 
know on both the theory and the application -- as of 2000.

  hope this helps.
  spencer graves

David Semmens wrote:

 I would like to know if there is a way of directly calculating the 
 F-ratio of a random effect using the aov function. I have 2 factors 
 in my model, population which is random and length which is the 
 length of female fish within each population. The dependent variable is 
 diam which is the average diameter of eggs produced by each female. 
 At present I set up the model like this:
 
  model - aov(diam~population*length, data=data)
  anova(model)
 
 then using the output:
 
  ratio=MS-population/MS-interaction
  1-pf(ratio, df-population,df-interaction)
 
 Is there a way of getting r to calculate the correct F-ratio for a 
 random factor and present it in the original output?
 
 Thanks,
 David Semmens.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] ANCOVA with random factor

2006-03-15 Thread David Semmens
I would like to know if there is a way of directly calculating the 
F-ratio of a random effect using the aov function. I have 2 factors 
in my model, population which is random and length which is the 
length of female fish within each population. The dependent variable is 
diam which is the average diameter of eggs produced by each female. 
At present I set up the model like this:

 model - aov(diam~population*length, data=data)
 anova(model)

then using the output:

 ratio=MS-population/MS-interaction
 1-pf(ratio, df-population,df-interaction)

Is there a way of getting r to calculate the correct F-ratio for a 
random factor and present it in the original output?

Thanks,
David Semmens.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ANCOVA Post-hoc test

2005-12-18 Thread Spencer Graves
  What search terms did you use?  Have you considered the multcomp
and multtest packages?

  spencer graves
p.s.  If you'd like more help from this list, I suggest you read the
posting guide! www.R-project.org/posting-guide.html.  Anecdotal
evidence suggests that posts more closely aligned with this guide tend
to get quicker, more useful replies.

Nicolas Poulet wrote:

 Hello,
 
 Despite my search, I didn't find a post-hoc test for an ANCOVA.
 
 I used the functions aov() and lm() to run the ANCOVA then I tried
 TukeyHSD() but it didn't work (because of the covariable is a continuous
 variable?).
 
 Furthermore, I would like to plot the adjusted values (i.e. the values of
 the tested variable taking into account the covariable).
 
 Thanks for your help!
 
 N. Poulet
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel:  408-938-4420
Fax: 408-280-7915

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] ANCOVA Post-hoc test

2005-12-14 Thread Nicolas Poulet
Hello,

Despite my search, I didn't find a post-hoc test for an ANCOVA.

I used the functions aov() and lm() to run the ANCOVA then I tried
TukeyHSD() but it didn't work (because of the covariable is a continuous
variable?).

Furthermore, I would like to plot the adjusted values (i.e. the values of
the tested variable taking into account the covariable).

Thanks for your help!

N. Poulet


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Ancova and lme use

2005-12-07 Thread Spencer Graves
Mon cher M. MENICACCI:

  It looks to me like you ultimately want to use lmer in 
library(lme4) [which also requires library(Matrix)].  For documentation, 
I suggest you start with Doug Bates (2005) Fitting Linear Mixed Models 
in R, R News, vol. 5/1: 27-30 (available from www.r-project.org - 
Newsletter).  After install.packages(lme4), I suggest you read 
Implementation.pdf in ~R\library\lme4\doc.  I also suggest you get 
Pinheiro and Bates (2000) Mixed-Effects Models in S and S-Plus 
(Springer).  For me, this book was essential documentation for lme, 
the previous implementation of lmer.  Studying that book might help 
you understand lmer.

  Also, I encourage you to use the extensive random number generation 
capabilities in R (including the nlme and lme4 packages) to produce 
simulated data like you expect to collect and try to analyze the 
simulated data.  You should simulate both what you expect to see and the 
null hypothesis as well.  If you encounter difficulties doing that, 
please submit another question to this listserve.  Before submitting 
another post, I suggest you help yourself by reading the posting guide! 
www.R-project.org/posting-guide.html.  Anecdotal evidence suggests 
that posts that are more consistent with this posting guide generally 
get more useful replies quicker.

  bon chance.
  spencer graves

[EMAIL PROTECTED] wrote:

 
 
 
 Dear R-users,
 
 We expect to develop statistic procedures and environnement for the
 computational analysis of our experimental datas. To provide a proof of
 concept, we plan to implement a test for a given experiment.
 
 Its design split data into 10 groups (including a control one) with 2
 mesures for each (ref at t0 and response at t1). We aim to compare each
 group response with control response (group 1) using a multiple comparison
 procedure (Dunnett test).
 
 Before achieving this, we have to normalize our data : response values
 cannot be compared if base line isn't corrected. Covariance analysis seems
 to represent the best way to do this. But how to perform this by using R ?
 
 Actually, we have identify some R functions of interest regarding this
 matter (lme(), lm() and glm()).
 
 For example we plan to do as describe :
 glm(response~baseline) and then simtest(response_corrected~group,
 type=Dunnett, ttype=logical)
 If a mixed model seems to better fit our experiment, we have some problems
 on using the lme function : lme(response~baseline) returns an error
 (Invalid formula for groups).
 
 So :
 Are fitted values represent the corrected response ?
 Is it relevant to perform these tests in our design ?
 And how to use lme in a glm like way ?
 
 If someone could bring us your its precious knowledge to validate our
 analytical protocol and to express its point of view on implementation
 strategy ?
 
 Best regards.
 
 
 Alexandre MENICACCI
 Bioinformatics - FOURNIER PHARMA
 50, rue de Dijon - 21121 Daix - FRANCE
 [EMAIL PROTECTED]
 tél : 03.80.44.76.17
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel:  408-938-4420
Fax: 408-280-7915

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Ancova and lme use

2005-12-02 Thread a . menicacci




Dear R-users,

We expect to develop statistic procedures and environnement for the
computational analysis of our experimental datas. To provide a proof of
concept, we plan to implement a test for a given experiment.

Its design split data into 10 groups (including a control one) with 2
mesures for each (ref at t0 and response at t1). We aim to compare each
group response with control response (group 1) using a multiple comparison
procedure (Dunnett test).

Before achieving this, we have to normalize our data : response values
cannot be compared if base line isn't corrected. Covariance analysis seems
to represent the best way to do this. But how to perform this by using R ?

Actually, we have identify some R functions of interest regarding this
matter (lme(), lm() and glm()).

For example we plan to do as describe :
glm(response~baseline) and then simtest(response_corrected~group,
type=Dunnett, ttype=logical)
If a mixed model seems to better fit our experiment, we have some problems
on using the lme function : lme(response~baseline) returns an error
(Invalid formula for groups).

So :
Are fitted values represent the corrected response ?
Is it relevant to perform these tests in our design ?
And how to use lme in a glm like way ?

If someone could bring us your its precious knowledge to validate our
analytical protocol and to express its point of view on implementation
strategy ?

Best regards.


Alexandre MENICACCI
Bioinformatics - FOURNIER PHARMA
50, rue de Dijon - 21121 Daix - FRANCE
[EMAIL PROTECTED]
tél : 03.80.44.76.17

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] ANCOVA

2004-08-28 Thread John Fox
Dear Matt,

The sequential sums of squares produced by anova() test for g ignoring x
(and the interaction), x after g (and ignoring the interaction), and the x:g
interaction after g and x. The second and third test are generally sensible,
but the first doesn't adjust for x, which is probably not what you want in
general (although you've constructed x to be independent of g, which is
typically not the case). Note that you would not usually want to test for
equal intercepts in the model that doesn't constrain the slopes to be equal
(though you haven't done this): Among other things, 0 is outside the range
of x.

The Anova() function in the car package will produce so-called type-II
tests.

I hope that this helps.
 John

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Matt Oliver
 Sent: Friday, August 27, 2004 4:10 PM
 To: [EMAIL PROTECTED]
 Subject: [R] ANCOVA
 
 Dear R-help list,
 
 I am attempting to understand the proper formulation of 
 ANCOVA's in R. I would like to test both parallelism and 
 intercept equality for some data sets, so I have generated an 
 artificial data set to ease my understanding.
 
 This is what I have done
 
 #Limits of random error added to vectors min - -0.1 max - 0.1
 
 x - c(c(1:10), c(1:10))+runif(20, min, max)
 x1 - c(c(1:10), c(1:10))+runif(20, min, max) y - c(c(1:10), 
 c(10:1))+runif(20, min, max) z - c(c(1:10), 
 c(11:20))+runif(20, min, max) g - as.factor(c(rep(1, 10), 
 rep(2, 10)))
 
 #x and x1 have similar slopes and have the similar 
 intercepts, #x and y have different slopes and different 
 intercepts #x and z have similar slopes with different intercepts
 
 #These are my full effects models
 
 fitx1x - lm(x1~g+x+x:g)
 fityx - lm(y~g+x+x:g)
 fitzx - lm(z~g+x+x:g)
 
 
 anova(fitx1x)
 
 Analysis of Variance Table
 
 Response: x1
  Df   Sum Sq  Mean Sq F 
 value   Pr(F)
 g   1 0.002   0.002  0.3348 
 0.5709
 x   1 163.927  163.927  23456.8319 
 2e-16 ***
 g:x1  0.002  0.002   0.2671 
 0.6123
 Residuals 16  0.112  0.007
 ---
 Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
 
 These results confirm that x and x1 do not have significantly 
 different means with respect to g; There is a significant 
 linear relationship between x and x1 independent of g There 
 is no evidence that the slopes of x and x1 in the two g 
 groups is different
 
 anova(fityx)
 
 Analysis of Variance Table
 
 Response: y
  Df   Sum Sq  Mean Sq F 
 value  Pr(F)
 g   1   0.012  0.012  
   1.7344 
 0.2064
 x   1   0.003  0.003  
   0.4399 
 0.5166
 g:x 1  164.947   164.947  
 24274.4246 2e-16 ***
 Residuals   160.1090.007
 ---
 Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
 
 These results confirm that x and y do not have significantly 
 different means with respect to g; There is a not a 
 significant linear relationship between x and y independent 
 of g There is evidence that the slopes of x and y in the two 
 g groups is significantly different.
 
 
 anova(fitzx)
 
 Analysis of Variance Table
 
 Response: z
 DfSum Sq  Mean Sq  F 
 value  Pr(F)
 g  1  501.07501.07
 52709.9073  2e-16 ***
 x  1  165.39165.39
 17398.4057  2e-16 ***
 g:x10.02  0.02 1.7472 
 0.2048
 Residuals 160.15  0.01
 ---
 Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
 
 These results confirm that x and z  have significantly 
 different means with respect to g; There is a a significant 
 linear relationship between x and z independent of g There is 
 no evidence that the slopes of x and z in the two g groups is 
 significantly different.
 
 
 What I don't understand is how to formulate the model so that 
 I can tell if the intercepts between the g groups are different.
 Also, how would I formulate an ANCOVA if I am dealing with 
 Model II regressions?
 
 
 Any help would be greatly appreciated.
 
 Matt
 
 
 
 ==
 When you reach an equilibrium in biology, you're dead. - A. 
 Mandell == Matthew J. Oliver 
 Institute of Marine and Coastal Sciences
 71 Dudley Road, New Brunswick
 New Jersey, 08901
 http://marine.rutgers.edu/cool/
 
 __
 [EMAIL PROTECTED] mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do

[R] ANCOVA

2004-08-27 Thread Matt Oliver
Dear R-help list,
I am attempting to understand the proper formulation of ANCOVA's in R. I 
would like to test both parallelism and intercept equality for some data 
sets, so I have generated an artificial data set to ease my understanding.

This is what I have done
#Limits of random error added to vectors
min - -0.1
max - 0.1
x - c(c(1:10), c(1:10))+runif(20, min, max)
x1 - c(c(1:10), c(1:10))+runif(20, min, max)
y - c(c(1:10), c(10:1))+runif(20, min, max)
z - c(c(1:10), c(11:20))+runif(20, min, max)
g - as.factor(c(rep(1, 10), rep(2, 10)))
#x and x1 have similar slopes and have the similar intercepts,
#x and y have different slopes and different intercepts
#x and z have similar slopes with different intercepts
#These are my full effects models
fitx1x - lm(x1~g+x+x:g)
fityx - lm(y~g+x+x:g)
fitzx - lm(z~g+x+x:g)
anova(fitx1x)
Analysis of Variance Table
Response: x1
Df   Sum Sq  Mean Sq F 
value   Pr(F)
g   1 0.002   0.002  0.3348 
0.5709
x   1 163.927  163.927  23456.8319 
2e-16 ***
g:x1  0.002  0.002   0.2671 
0.6123
Residuals 16  0.112  0.007
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

These results confirm that x and x1 do not have significantly different 
means with respect to g;
There is a significant linear relationship between x and x1 independent of g
There is no evidence that the slopes of x and x1 in the two g groups is 
different

anova(fityx)
Analysis of Variance Table
Response: y
Df   Sum Sq  Mean Sq F 
value  Pr(F)
g   1   0.012  0.0121.7344 
0.2064
x   1   0.003  0.0030.4399 
0.5166
g:x 1  164.947   164.947  24274.4246 2e-16 ***
Residuals   160.1090.007
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

These results confirm that x and y do not have significantly different 
means with respect to g;
There is a not a significant linear relationship between x and y 
independent of g
There is evidence that the slopes of x and y in the two g groups is 
significantly different.

anova(fitzx)
Analysis of Variance Table
Response: z
   DfSum Sq  Mean Sq  F value  Pr(F)
g  1  501.07501.0752709.9073  2e-16 ***
x  1  165.39165.3917398.4057  2e-16 ***
g:x10.02  0.02 1.7472 
0.2048
Residuals 160.15  0.01
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

These results confirm that x and z  have significantly different means with 
respect to g;
There is a a significant linear relationship between x and z independent of g
There is no evidence that the slopes of x and z in the two g groups is 
significantly different.

What I don't understand is how to formulate the model so that I can tell if 
the intercepts between the g groups are different.
Also, how would I formulate an ANCOVA if I am dealing with Model II 
regressions?

Any help would be greatly appreciated.
Matt

==
When you reach an equilibrium in biology,
you're dead. - A. Mandell
==
Matthew J. Oliver
Institute of Marine and Coastal Sciences
71 Dudley Road, New Brunswick
New Jersey, 08901
http://marine.rutgers.edu/cool/
__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] ANCOVA when you don't know factor levels

2004-03-17 Thread Rob Knell
Hello people

I am doing some thinking about how to analyse data on dimorphic animals  
- where different individuals of the same species have rather different  
morphology. An example of this is that some male beetles have large  
horns and small wings, and rely on beating the other guys up to get  
access to mates, whereas others have smaller horns and larger wings,  
and rely on mobility to find mates before the guys with the big horns  
turn up and beat them up. Normally what we do to see if this is  
happening is to plot a scatter plot of horn length vs body size. If  
it's a simple straight line relationship then no dimorphism, but if  
it's either a line with an obvious change of slope or two separate  
lines then you've got a dimorphic animal.

If you've just got a change of slope then it's relatively easy to test  
for significance by breakpoint regression, which will also tell you the  
body size at which the beetles switch from 'big wings, small horns'  
strategy to the other. What is more problematic, however, is if you  
have a dataset that is essentially two linear regressions that overlap  
in both X and Y. Here you can't do a breakpoint regression. The  
solution that I have come up with is to put a straight line between the  
two clouds of data and to use that line to define a two-level factor  
and fit a model with body size, the factor and the interaction to the  
horn size data. The line can then be moved up and down, and the slope  
varied, and the fit of the model for each intercept/slope combination  
compared. The best fit model can then be compared with a straight  
linear model with an F-test, which gives you a significance test, and  
the line allows you to decide which morph a particular beetle conforms  
to. I've written some R code to do this (see below). What i would like  
to know is a) if this problem has been dealt with before elsewhere, b)  
if there's a better way to do it and c) if my R function could be  
improved at all - it's my first attempt at such a thing, so any input  
would be really helpful. If anyone wants a sample data set then I can  
email you one.

Thanks for any input

Rob Knell

School of Biological Sciences
Queen Mary, University of London
'Phone +44 (0)20 7882 7720
http://www.qmw.ac.uk/~ugbt794
http://www.mopane.org
Here's the code:

switchline- 
function(X,Y,Intmin,Intmax,Intgap,Slopemin,Slopemax,Slopegap)

 

{

 

#   X   = unimodal variable (e.g. filename$bodysize)

 

#   Y   = bimodal variable (e.g. filename$hornsize)

 

#Intmin   = Minimum switchline intercept

 

#Slopemin = Minimum switchline slope

#Intmax   = Maximum switchline intercept

#Slopemax = Maximum switchline slope

#Intgap   = Interval between each intercept

#Slopegap = Interval between each slope

 

 

 

#   This function written by Rob Knell 2003. It is an attempt to  
produce an

#objective analysis for datasets of apparently dimorphic characters  
in which

#   there is overlap in both the X and Y variable. The routine divides  
the

#individuals into one of two classes based on whether they are  
above or below

#   a line, and fits a linear model to the Y variable with the X  
variable as a

#continuous explanatory variable and a single factor, Morph,  
which is based

#   on whether they were above the line or below it, plus an  
interaction term.

#   The line is then adjusted, the individuals reclassified and the  
process

#repeated. The output is of the form Int - intercept of the line  
dividing

#   one morph from another, Slp - the slope and Rsqr, which gives  
the R-

#squared value for the fitted model when the division into factors  
is based on

#   that particular line.

 

#   An error message probably means that one of your lines has missed  
the data

#altogether - in other words, all of your data points are  
classified into a

#single factor.

#

 

 

 

Lines.grid-  
expand.grid(Intercept=seq(Intmin,Intmax,Intgap),Slope=seq(Slopemin,Slope 
max,Slopegap))

 

bandwidth-length(Lines.grid$Intercept)

 

 

Lines-matrix(nrow=bandwidth,ncol=3,data=NA)

dimnames(Lines)-list(seq(1,bandwidth),c(Int,Slp,Rsqr))

 

Lines[,1]-Lines.grid$Intercept

Lines[,2]-Lines.grid$Slope

 

for (i in 1:bandwidth)

 

{

temp.data-matrix(nrow=length(X),ncol=4,data=NA)

colnames(temp.data)-c(X1,Y1,switchvalue,Morph)

temp.data[,1]-X

temp.data[,2]-Y

for(j in 1:length(X))

{

temp.data[j,3]-(Lines[i,1]+Lines[i,2]*temp.data[j,1])

ifelse(temp.data[j,2]=temp.data[j,3], temp.data[j,4]-1,  
temp.data[j,4]-2)

}



temp.data-data.frame(temp.data)

temp.data$Morph-factor(temp.data$Morph)

model-lm(temp.data$Y1~temp.data$X1*temp.data$Morph)

Lines[i,3]-round(summary(model)$r.squared,6)

}

 

print(Lines)

 

Lines-data.frame(Lines)

Max-max(Lines$Rsqr)

MaxI-Lines$Int[Lines$Rsqr==Max]

MaxS-Lines$Slp[Lines$Rsqr==Max]


RE: [R] ANCOVA when you don't know factor levels

2004-03-17 Thread Liaw, Andy
This sounds like a good case for mixture regression, for which there's Fritz
Leisch's `flexmix' package.  However, I don't think flexmix has facility for
testing whether the mixture has one vs. two components.  Others on the list
surely would know more than I do.

HTH,
Andy

 From: Rob Knell
 
 Hello people
 
 I am doing some thinking about how to analyse data on 
 dimorphic animals  
 - where different individuals of the same species have rather 
 different  
 morphology. An example of this is that some male beetles have large  
 horns and small wings, and rely on beating the other guys up to get  
 access to mates, whereas others have smaller horns and larger wings,  
 and rely on mobility to find mates before the guys with the 
 big horns  
 turn up and beat them up. Normally what we do to see if this is  
 happening is to plot a scatter plot of horn length vs body size. If  
 it's a simple straight line relationship then no dimorphism, but if  
 it's either a line with an obvious change of slope or two separate  
 lines then you've got a dimorphic animal.
 
 If you've just got a change of slope then it's relatively 
 easy to test  
 for significance by breakpoint regression, which will also 
 tell you the  
 body size at which the beetles switch from 'big wings, small horns'  
 strategy to the other. What is more problematic, however, is if you  
 have a dataset that is essentially two linear regressions 
 that overlap  
 in both X and Y. Here you can't do a breakpoint regression. The  
 solution that I have come up with is to put a straight line 
 between the  
 two clouds of data and to use that line to define a two-level factor  
 and fit a model with body size, the factor and the 
 interaction to the  
 horn size data. The line can then be moved up and down, and 
 the slope  
 varied, and the fit of the model for each intercept/slope 
 combination  
 compared. The best fit model can then be compared with a straight  
 linear model with an F-test, which gives you a significance 
 test, and  
 the line allows you to decide which morph a particular beetle 
 conforms  
 to. I've written some R code to do this (see below). What i 
 would like  
 to know is a) if this problem has been dealt with before 
 elsewhere, b)  
 if there's a better way to do it and c) if my R function could be  
 improved at all - it's my first attempt at such a thing, so 
 any input  
 would be really helpful. If anyone wants a sample data set 
 then I can  
 email you one.
 
 Thanks for any input
 
 Rob Knell
 
 School of Biological Sciences
 Queen Mary, University of London
 
 'Phone +44 (0)20 7882 7720
 http://www.qmw.ac.uk/~ugbt794
 http://www.mopane.org
 
 Here's the code:
 
 switchline- 
 function(X,Y,Intmin,Intmax,Intgap,Slopemin,Slopemax,Slopegap)
 
  
 
 {
 
  
 
 #   X   = unimodal variable (e.g. filename$bodysize)
 
  
 
 #   Y   = bimodal variable (e.g. filename$hornsize)
 
  
 
 #Intmin   = Minimum switchline intercept
 
  
 
 #Slopemin = Minimum switchline slope
 
 #Intmax   = Maximum switchline intercept
 
 #Slopemax = Maximum switchline slope
 
 #Intgap   = Interval between each intercept
 
 #Slopegap = Interval between each slope
 
  
 
  
 
  
 
 #   This function written by Rob Knell 2003. It is an attempt to  
 produce an
 
 #objective analysis for datasets of apparently dimorphic 
 characters  
 in which
 
 #   there is overlap in both the X and Y variable. The 
 routine divides  
 the
 
 #individuals into one of two classes based on whether they are  
 above or below
 
 #   a line, and fits a linear model to the Y variable with the X  
 variable as a
 
 #continuous explanatory variable and a single factor, Morph,  
 which is based
 
 #   on whether they were above the line or below it, plus an  
 interaction term.
 
 #   The line is then adjusted, the individuals reclassified and the  
 process
 
 #repeated. The output is of the form Int - intercept of 
 the line  
 dividing
 
 #   one morph from another, Slp - the slope and Rsqr, 
 which gives  
 the R-
 
 #squared value for the fitted model when the division 
 into factors  
 is based on
 
 #   that particular line.
 
  
 
 #   An error message probably means that one of your lines 
 has missed  
 the data
 
 #altogether - in other words, all of your data points are  
 classified into a
 
 #single factor.
 
 #
 
  
 
  
 
  
 
 Lines.grid-  
 expand.grid(Intercept=seq(Intmin,Intmax,Intgap),Slope=seq(Slop
 emin,Slope 
 max,Slopegap))
 
  
 
 bandwidth-length(Lines.grid$Intercept)
 
  
 
  
 
 Lines-matrix(nrow=bandwidth,ncol=3,data=NA)
 
 dimnames(Lines)-list(seq(1,bandwidth),c(Int,Slp,Rsqr))
 
  
 
 Lines[,1]-Lines.grid$Intercept
 
 Lines[,2]-Lines.grid$Slope
 
  
 
 for (i in 1:bandwidth)
 
  
 
 {
 
 temp.data-matrix(nrow=length(X),ncol=4,data=NA)
 
 colnames(temp.data)-c(X1,Y1,switchvalue,Morph)
 
 temp.data[,1]-X
 
 temp.data[,2]-Y
 
 for(j in