Hadley,

The S modeling language was designed with Wilkinson and Rogers in mind. The notation was changed from that of their paper to stay consistent with the parsing rules for ordinary algebra in S. I think of ":" as an indicator of an indexing system into the dummy variables. It is not an indicator of degrees of freedom.
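
A small sketch in R may make the indexing view concrete. The data frame `d` and the factors `A` (2 levels) and `B` (3 levels) are hypothetical and exist only for illustration. The same term A:B produces 6 columns when it stands alone and only 2 when the main effects are also in the model, so the column count of ":" is not a fixed degrees-of-freedom statement.

```r
# Hypothetical balanced layout (an assumption, not data from this thread):
# A has 2 levels, B has 3 levels, one row per combination.
d <- expand.grid(A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2", "b3")))

# A:B alone, no intercept: one indicator column per (A level, B level) pair.
X_ab <- model.matrix(~ 0 + A:B, data = d)
colnames(X_ab)
ncol(X_ab)

# A:B alongside the intercept and both main effects: the redundant columns
# are suppressed when the dummy variables are generated.
X_full <- model.matrix(~ A * B, data = d)
term_index <- attr(X_full, "assign")   # which term each column belongs to
n_interaction_cols <- sum(term_index == 3)   # term 3 is A:B
n_interaction_cols
```

Under the default treatment contrasts, the second matrix keeps (a-1)(b-1) columns for A:B because the rest are linearly dependent on the intercept and main-effect columns.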

For simplicity in notation, let A be a factor with a levels and B be a factor with b levels. Then A:B implies a set of dummy variables with at most ab columns, each indexed by an A level and a B level. The degrees of freedom associated with A:B depend on the linear dependencies between its dummy variables and the dummy variables of the other terms in the model. The excess columns can be suppressed when the dummy variables are generated, or they can be pivoted out during the analysis.

When we have the special case A:A, only one factor is mentioned, so the indexing scheme is based on just that one factor. You could generate the full set of a^2 columns, and then you would discover that the whole set is linearly dependent on just a of them.

For A:B with, say, a = 2 and b = 3, the columns can be labeled either

  a1b1 a1b2 a1b3 a2b1 a2b2 a2b3

or

  a1b1 a2b1 a1b2 a2b2 a1b3 a2b3

If there is crossing, we would report a single sum of squares and degrees of freedom for the interaction. If there is nesting, say a/b, then it might make sense to group the dummy variables, say (a1b1 a1b2 a1b3) and (a2b1 a2b2 a2b3), and report simple-effects sums of squares and degrees of freedom for each of the groups.

The structure of the individual columns depends on the set of contrasts used for the A and B factors.

Rich