Hello,
Suppose I have a typical psychological experiment that is a within-subjects design with multiple crossed variables and a continuous response variable. Subjects are considered a random effect. So I could model
> aov1 <- aov(resp ~ fact1*fact2 + Error(subj/(fact1*fact2)))
However, this only holds for orthogonal designs with equal numbers of observations and no missing values. These assumptions are easily violated, so I seek refuge in fitting a mixed-effects model with the nlme package.
I suppose that you have, for each subject, enough observations to compute his/her average response for each combination of fact1 and fact2, no?
If this is the case, you can perform the analysis with the above formula on the data obtained by 'aggregate(resp, list(subj, fact1, fact2), mean)'.
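A minimal sketch of this two-step recipe (the data frame 'd', its column names, and the simulated responses are all made up for illustration):

```r
## Made-up balanced data: 6 subjects, 2 x 2 within-subject design,
## 3 replicates per subject per cell.
d <- expand.grid(subj = factor(1:6), fact1 = factor(c("a", "b")),
                 fact2 = factor(c("x", "y")), rep = 1:3)
set.seed(1)
d$resp <- rnorm(nrow(d))

## Collapse to one mean response per subject x condition cell ...
cellmeans <- aggregate(resp ~ subj + fact1 + fact2, data = d, FUN = mean)

## ... then fit the standard within-subject anova on the cell means.
aov1 <- aov(resp ~ fact1 * fact2 + Error(subj / (fact1 * fact2)),
            data = cellmeans)
summary(aov1)
```

Even if the raw data had unequal replicates per cell, 'cellmeans' would be balanced (one row per subject per cell), which is what makes the aov formula applicable again.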
This is an analysis with only *within-subject* factors, and there *cannot* be a problem of unequal numbers of observations when you have only within-subject factors (supposing you have at least one observation for each subject in each condition).
I believe the problem with unequal numbers of observations only occurs when you have at least two crossed *between-subject* (group) variables.
Let's imagine you have two binary group factors (A and B) yielding four subgroups of subjects, and for some reason you do not have the same number of observations in each subgroup.
Then there are several ways of defining the main effects of A and B.
In many cases, the most reasonable definition of the main effect of A is to average the effect of A within B1 and within B2 (thus ignoring the numbers of observations, or equivalently, weighting the four subgroups equally).
To test the null hypothesis of no difference in A when all groups are equally weighted, one common approach in psychology is to pretend that the number of observations in each group is equal to the harmonic mean of the numbers of observations in the subgroups. The sums of squares thus obtained can be compared with the error sum of squares from the standard anova to form an F-test.
This is called the "unweighted means" approach.
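For instance, with hypothetical subgroup sizes, the harmonic mean that this approach substitutes for the actual cell counts is:

```r
## Hypothetical sizes for the four A x B subgroups.
n <- c(10, 12, 8, 15)
nh <- length(n) / sum(1 / n)  # harmonic mean of the subgroup sizes
nh                            # used in place of each subgroup's n
```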
This can easily be done 'by hand' in R, but there is another approach:
You get statistics equivalent to the unweighted anova when you use so-called 'type III' sums of squares (I read this in Howell, 1987, 'Statistical Methods for Psychology', and in John Fox's book 'An R and S-PLUS Companion to Applied Regression', p. 140).
It is possible to get type III sums of squares using John Fox's 'car' package (note that sum-to-zero contrasts are required for the type III tests to be meaningful):

library(car)
contrasts(A) <- "contr.sum"
contrasts(B) <- "contr.sum"
Anova(aov(resp ~ A*B), type = "III")
You can compute the equally weighted cell means defining the effect of A with, say:
with(aggregate(resp, list(a = a, b = b), mean), tapply(x, a, mean))
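With some made-up unbalanced data, this shows how the equally weighted means can differ from the raw level means:

```r
## Made-up unbalanced data: the a1:b2 and a2 cells have fewer observations.
a <- factor(c("a1", "a1", "a1", "a2", "a2"))
b <- factor(c("b1", "b1", "b2", "b1", "b2"))
resp <- c(1, 3, 5, 2, 6)

cm <- aggregate(resp, list(a = a, b = b), mean)  # one mean per cell
with(cm, tapply(x, a, mean))  # equally weighted: a1 = 3.5, a2 = 4.0
tapply(resp, a, mean)         # raw (weighted):   a1 = 3.0, a2 = 4.0
```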
I have seen some people advise against using 'type III' sums of squares, but I do not know their rationale. The important thing, it seems to me, is to know
which null hypothesis is tested in a given test. If the type III sums of squares indeed test the effect on equally weighted means, they seem okay to me
(when this is indeed the hypothesis I want to test).
Sorry for not answering any of your questions about the use of 'lme' (I hope others will), but I feel that 'lme' is not needed in the context of unequal cell frequencies.
(I am happy to be corrected if I am wrong.) It seems to me that 'lme' is useful when some assumptions of the standard anova are violated (e.g. with repeated measurements when the assumption of sphericity is false), or when you have several random factors.
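For completeness, a hedged sketch of what such an 'lme' fit might look like (data and variable names are invented; only a random intercept per subject is modelled here, which is the simplest possible random-effects structure):

```r
## Made-up data: 6 subjects, 2 x 2 within-subject design, one
## observation per subject per cell.
library(nlme)
d <- expand.grid(subj = factor(1:6), fact1 = factor(c("a", "b")),
                 fact2 = factor(c("x", "y")))
set.seed(2)
d$resp <- rnorm(nrow(d))

## Same fixed effects as the aov formula, with subject as a random effect.
m <- lme(resp ~ fact1 * fact2, random = ~ 1 | subj, data = d)
anova(m)
```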
Christophe Pallier http://www.pallier.org
______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
