My fellow morphmetters,
Regarding Cardini's question, there's no need to invent a new
statistic. The actual Procrustes mean square, in the form of squared
Euclidean distance in Kent's tangent space, is already available for any
such decomposition. This mean square decomposes exactly for any
analysis of variance: by group, by symmetry/asymmetry, by uniform
term/partial warp, by relative warps, by time (T1, T2, change), and in a
variety of other ways. Statistical inference shouldn't rest on any
distributional theory, of course, but on a permutation test or an
appropriate modification of one.
Here's how this goes in two special cases.
For grouped data, take, say, two groups of N specimens each.
Notation: P2(a,b) is squared Procrustes distance between a and b, S is
"sum over all cases", S1 and S2 are "sum over cases of group 1" and "of
group 2", respectively, Fi is the i-th landmark configuration, GM is the
grand Procrustes mean of a sample, and GM1, GM2 are the means of two
subgroups. Then when P2 is approximated as squared Euclidean distance
of the tangent space coordinates:
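S P2(Fi,GM) = [S1 P2(Fi,GM1) + S2 P2(Fi,GM2)] + [2N P2(GMj,GM)]
where j is either 1 or 2 (with equal group sizes the two subgroup means
are equidistant from GM). This is by analogy with any other
decomposition of sums of squares into a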
within-group term (the first []) and a between-group term (the second
[]). Of course S P2(Fi,GM) is the trace of the pooled sum-of-squares
matrix of the usual shape coordinates: this is a decomposition, not an
invention.
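
For anyone who wants to check the grouped identity numerically, here is
a minimal sketch in plain Python. It assumes the tangent-space
coordinates are already in hand as equal-length vectors; the random
vectors below are only stand-ins for real shape coordinates, and the
function names (sq_dist, mean, decompose, permutation_p) are invented
for this illustration. The permutation test on the between-group term
is one simple way to follow the advice above, not the only one.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(forms):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(forms)
    return [sum(col) / n for col in zip(*forms)]

def decompose(group1, group2):
    """Return (total, within, between) sums of squares for two groups."""
    gm = mean(group1 + group2)
    gm1, gm2 = mean(group1), mean(group2)
    total = sum(sq_dist(f, gm) for f in group1 + group2)
    within = (sum(sq_dist(f, gm1) for f in group1)
              + sum(sq_dist(f, gm2) for f in group2))
    between = (len(group1) * sq_dist(gm1, gm)
               + len(group2) * sq_dist(gm2, gm))
    return total, within, between

def permutation_p(group1, group2, n_perm=999, seed=0):
    """Permutation p-value for the between-group sum of squares."""
    rng = random.Random(seed)
    observed = decompose(group1, group2)[2]
    pooled = group1 + group2
    n1 = len(group1)
    hits = 1  # the observed labelling counts as one arrangement
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign specimens to groups at random
        if decompose(pooled[:n1], pooled[n1:])[2] >= observed:
            hits += 1
    return hits / (n_perm + 1)

# Synthetic stand-in data (an assumption): two groups of N vectors each.
data_rng = random.Random(1)
N, k = 10, 8
g1 = [[data_rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(N)]
g2 = [[data_rng.gauss(0.5, 1.0) for _ in range(k)] for _ in range(N)]

total, within, between = decompose(g1, g2)
p = permutation_p(g1, g2)
print("total, within + between:", total, within + between)
print("permutation p-value:", p)
```

With equal group sizes the between term computed here agrees with
2N P2(GM1,GM), which is the form written above.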
For symmetric data, let there be N forms with some paired
landmarks, and write Y for the "reflected relabelled form" (mirrored and
with left and right labels reversed, as in my morphmet post of May 9).
Then, for GX the mean of the X's, GY the mean of the Y's, and G (which
is symmetric) the mean of the X's and Y's together, we have
S P2(Xi,Yi) = N P2(GX,GY) + 4S P2(Xi,GX)
            = 4N P2(GX,G) + 4S P2(Xi,GX),
a decomposition into directional and fluctuating asymmetry.
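
The symmetric case can be checked the same way. The sketch below is
only an illustration under stated assumptions: the coordinates are
treated as already superimposed, the four-landmark layout and the
PAIRS table are hypothetical, and the fluctuating term is computed
from the asymmetric component of each Xi - GX (half of the deviation
minus its reflected relabelling). Reading the second term that way is
an assumption of the sketch, not something the formulas above spell
out.

```python
import random

# Hypothetical labelling: landmarks 0 and 1 lie on the midline,
# landmarks 2 and 3 are a left/right pair.
PAIRS = {0: 0, 1: 1, 2: 3, 3: 2}

def reflect_relabel(form):
    """Mirror across the vertical axis and swap left/right labels."""
    return [(-form[PAIRS[j]][0], form[PAIRS[j]][1])
            for j in range(len(form))]

def sq_dist(a, b):
    """Squared Euclidean distance between two landmark configurations."""
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(a, b))

def mean(forms):
    """Landmark-wise mean of a list of configurations."""
    n = len(forms)
    return [(sum(f[j][0] for f in forms) / n,
             sum(f[j][1] for f in forms) / n)
            for j in range(len(forms[0]))]

def asym_part(form, centre):
    """Asymmetric component of (form - centre)."""
    d = [(x - cx, y - cy) for (x, y), (cx, cy) in zip(form, centre)]
    rd = reflect_relabel(d)
    return [((dx - rx) / 2, (dy - ry) / 2)
            for (dx, dy), (rx, ry) in zip(d, rd)]

# Synthetic stand-in data (an assumption): a symmetric base form plus
# noise, with a small shift in x to create some directional asymmetry.
rng = random.Random(2)
base = [(0.0, 1.0), (0.0, -1.0), (-1.0, 0.0), (1.0, 0.0)]
X = [[(x + rng.gauss(0.05, 0.1), y + rng.gauss(0.0, 0.1))
      for x, y in base] for _ in range(12)]
Y = [reflect_relabel(f) for f in X]

N = len(X)
GX, GY, G = mean(X), mean(Y), mean(X + Y)
zero = [(0.0, 0.0)] * len(base)

lhs = sum(sq_dist(x, y) for x, y in zip(X, Y))    # S P2(Xi,Yi)
directional = N * sq_dist(GX, GY)                 # N P2(GX,GY)
fluctuating = 4 * sum(sq_dist(asym_part(x, GX), zero) for x in X)

print("S P2(Xi,Yi):", lhs)
print("directional + fluctuating:", directional + fluctuating)
print("4N P2(GX,G):", 4 * N * sq_dist(GX, G))
```

The last line checks that N P2(GX,GY) and 4N P2(GX,G) agree, as they
must when G is the midpoint of GX and GY.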
A great many more of these identities are available. Taken
together, they are elegant enough to respond to Cardini's question with
the gentle redirection (squared Procrustes distances, not unsquared)
that I indicated at the outset. Of course the usual caveats apply here
that afflict the "generalized variance" problem in any other context --
there's really no such concept apart from specific biological theories
that tell you what landmark-borne information is important and what is
noise or negligible. If you don't like squared Procrustes distance,
it's because you don't like the geometry of the shape coordinates. In
that case, you need some reason to specify a geometry that you DO
prefer, such as Mahalanobis distance (modified for the rank restriction
of the shape coordinates) or some other candidate. In any case, don't
just pick something from a textbook, or something you've seen before.
Understand what the choice of a distance is trying to do, and think
things through like a biologist.
Fred Bookstein
August 11, 2003