Hi all,

Perhaps somebody can explain the following behaviour to me.

Take the following data.frame.

z <- expand.grid(X = LETTERS[1:3], Y = letters[1:3])

Now, from ?formula we see:

<quote>
The '*' operator denotes factor crossing: 'a*b' interpreted as 'a+b+a:b'.
</quote>

So I would expect the following:

ncol(model.matrix(~X*Y, z)) # returns 1 + 2 + 2 + 2 * 2 = 9

and

ncol(model.matrix(~X + Y + X:Y, z)) # returns 1 + 2 + 2 + 2 * 2 = 9

are equivalent.
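A quick way to check this beyond the column counts, as a sketch in base R only (the names m1, m2, same_names and same_values are mine, just for illustration):

```r
z <- expand.grid(X = LETTERS[1:3], Y = letters[1:3])

m1 <- model.matrix(~ X * Y, z)
m2 <- model.matrix(~ X + Y + X:Y, z)

# Compare the column names and the numeric entries of the two design matrices
same_names  <- identical(colnames(m1), colnames(m2))
same_values <- isTRUE(all.equal(m1, m2, check.attributes = FALSE))
print(c(same_names, same_values))
```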

However, I did not expect this:

ncol(model.matrix(~X:Y, z)) # returns 1 + 3 * 3 = 10

Why isn't this 5? In other words, why doesn't "~X:Y" denote just the interaction term, so that all you get is an intercept plus the two-way interaction between X and Y (1 + 2 * 2 = 5 parameters)? Instead, what is returned is an intercept plus the fully crossed effects (one column for every combination of a level of X with a level of Y). Is there something in the documentation I'm missing?
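To make the difference concrete, it may help to look at the column names and not just the counts. This is only a sketch in base R; the names cn_full, cn_int, r_full and r_int, and the rank check via qr(), are my own additions for illustration.

```r
z <- expand.grid(X = LETTERS[1:3], Y = letters[1:3])

m_full <- model.matrix(~ X * Y, z)
m_int  <- model.matrix(~ X:Y, z)

# Which columns does each formula generate?
cn_full <- colnames(m_full)
cn_int  <- colnames(m_int)
print(cn_full)
print(cn_int)

# Numerical rank of each design matrix, to see whether the extra
# column in the ~X:Y matrix carries any additional information
r_full <- qr(m_full)$rank
r_int  <- qr(m_int)$rank
print(c(full = r_full, interaction_only = r_int))
```

If the two ranks agree, the larger matrix would be an over-parameterised version of the same model, not a bigger one, but that still leaves the question of why the columns are coded this way.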

--sundar

P.S. This behaviour is identical in S-PLUS 6.2.
