The aliasing problem arises in alias(), which is what Anova uses to detect
aliasing. anova appears to behave more 'sensibly' than Anova simply because it
more or less blithely ignores the NAs.
But like Carsten, I found this difficult to understand. Unordered factors are
supposed to be arbitrary. I can understand why A:C behaves differently from A:D
(where D<-factor(rep(1:3,6))). A:C generates a 27-level factor with 18 of its
levels empty; A:D has only nine levels. But it is not so simple.
I was going to suggest using AC<-factor(A:C) instead of A %in% C as a fix, as
this generates a 9-level factor with all levels populated - just as A:D does.
But
> alias(lm(resp~A+AC))
generates an alias, where
> alias(lm(resp~A+A:D))
does not.
That seemed to me entirely bizarre. table(A,AC) and table(A,A:D) show identical
balance and nesting. Something, somewhere, is expanding the two differently.
But what?
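For anyone who wants to reproduce the comparison, here is a minimal sketch. The
original data are not shown in this message, so the layout below (18
observations, A with three levels, C with the nine codes c(1:3,6:11) nested in
A, and a random resp) is an assumption chosen only to match the level counts
quoted above.

```r
# Hypothetical data: the real resp, A and C are not given in this message.
set.seed(1)
A    <- factor(rep(1:3, each = 6))
C    <- factor(rep(c(1:3, 6:11), each = 2))  # three C codes within each A level
D    <- factor(rep(1:3, 6))
resp <- rnorm(18)                            # placeholder response

AC <- factor(A:C)   # factor() drops the empty levels of A:C
nlevels(A:C)        # 27
nlevels(AC)         # 9
nlevels(A:D)        # 9

# Same balance and nesting in both tables
table(A, AC)
table(A, A:D)

# Compare what alias() reports for the two fits
alias(lm(resp ~ A + AC))
alias(lm(resp ~ A + A:D))

# The same difference shows up as NA coefficients in the first fit only
coef(lm(resp ~ A + AC))
coef(lm(resp ~ A + A:D))
```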
model.matrix is the 'culprit', I think. The two model matrices differ;
model.matrix(resp~A+A:D) has 9 columns, and model.matrix(resp~A+AC) has eleven.
So now I know what has happened. What I don't understand is why, except that
'that is what model.matrix does with this factor level numbering' (refactoring
either AC or A:D via as.numeric finally generates identical - and aliased -
behaviour). I think I am going to invoke the law of unintended consequences
and go find a cold compress.
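The column counts can be checked with a sketch like the one below. As before,
the factor layout is an assumption (the original data are not in this message);
only the design factors matter here, so no response is needed.

```r
# Hypothetical factor layout, assumed to match the level counts in the text
A  <- factor(rep(1:3, each = 6))
C  <- factor(rep(c(1:3, 6:11), each = 2))
D  <- factor(rep(1:3, 6))
AC <- factor(A:C)

X_AD <- model.matrix(~ A + A:D)
X_AC <- model.matrix(~ A + AC)

# Number of columns versus numerical rank of each design matrix
c(ncol = ncol(X_AD), rank = qr(X_AD)$rank)   # 9 columns, rank 9
c(ncol = ncol(X_AC), rank = qr(X_AC)$rank)   # 11 columns, rank 9

# The column names show how each term was expanded
colnames(X_AD)
colnames(X_AC)
```

With this layout the column names show A:D expanded as D contrasts within each
level of A (six columns), while AC is coded as a free-standing 9-level factor
(eight columns). That may be where the two extra, aliased columns come from.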
Steve E
>>> Peter Dalgaard <[EMAIL PROTECTED]> 11/07/2007 16:03:13 >>>
>A term C %in% A (or A/C) is not a _specification_ that C is nested in
>A, it is a _directive_ to include the terms A and C:A. Now, C:A involves
>a term for each combination of A and C, of which many are empty if C is
>strictly coarser than A. This may well be what is confusing Anova().
>In fact, with this (c(1:3,6:11)) coding of C, A:C is completely
>equivalent to C, but if you look at summary(lm(....)) you will see a lot
>of NA coefficients in the A:C case. If you use resp ~ A*B+C, then you
>still get a couple of missing coefficients in the C terms because of
>collinearity with the A terms. (Notice that this is one case where the
>order inside the model formula will matter; C+A*B is not the same.)