I, too, think of ANOVA and regression as variations on a common
theme.  Here's an additional way in which they differ:  For balanced
ANOVA, the decomposition of the data into sums of squares and degrees of
freedom is determined by a group of symmetries.  For example, consider a
one-way randomized complete block design with R rows as blocks and
C columns as treatments.  The analysis is invariant under all row
permutations, and all column permutations, i.e., interchanging any two
rows of the data, or any two columns of the data, won't change the
analysis.  If you now think of the data as a vector in RxC-dimensional
space, the symmetries (row permutations, column permutations) determine
invariant subspaces; these are precisely the subspaces you project the
data vector onto to get the SSs and dfs.  In regression, the subspaces
you project onto are determined directly by a spanning set of carrier
variables; in balanced ANOVA, the subspaces are uniquely determined by the
symmetries, and the spanning sets are somewhat arbitrary.  (I claim 
no credit for this lovely way of looking at things; I learned it from
Peter Fortini and Persi Diaconis.  It's written up in Fortini's
dissertation from the 1970s, and Diaconis's IMS lecture notes on group
theory and statistics.)
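To make the projection idea concrete, here is a small sketch (my own
illustration, not from Fortini or Diaconis): for a balanced R x C
layout, the subspaces invariant under row and column permutations are
the grand-mean line, the row-effect space, the column-effect space,
and the residual space.  Projecting the data onto each one yields the
familiar sums of squares and degrees of freedom.

```python
import numpy as np

# Illustrative only: a balanced R x C table of responses.
rng = np.random.default_rng(0)
R, C = 4, 3
y = rng.normal(size=(R, C))

# Orthogonal projections onto the permutation-invariant subspaces:
grand   = y.mean()                                  # grand-mean line
row_eff = y.mean(axis=1, keepdims=True) - grand     # row-effect subspace
col_eff = y.mean(axis=0, keepdims=True) - grand     # column-effect subspace
resid   = y - grand - row_eff - col_eff             # residual subspace

ss = {
    "mean":  R * C * grand ** 2,
    "rows":  C * (row_eff ** 2).sum(),
    "cols":  R * (col_eff ** 2).sum(),
    "resid": (resid ** 2).sum(),
}
df = {"mean": 1, "rows": R - 1, "cols": C - 1, "resid": (R - 1) * (C - 1)}

# The four subspaces are mutually orthogonal, so the SSs add up to
# the total sum of squares, and the dfs add up to R*C.
assert np.isclose(sum(ss.values()), (y ** 2).sum())
assert sum(df.values()) == R * C
```

The point of the sketch is that none of these projections refer to a
chosen set of carrier variables; each subspace is pinned down by the
requirement of invariance under row and column permutations.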

Of course you only have such clean sets of symmetries for balanced
designs, and the approach via symmetries doesn't address such things as
the difference between fixed and random effects, which Bob 
Wheeler raises.  Nevertheless, to the extent that I think of ANOVA
as distinct from regression, I find the role of symmetries worth
keeping in mind.

  George

George W. Cobb
Mount Holyoke College
South Hadley, MA  01075
413-538-2401



