[The following was submitted by Dr. Fred L. Bookstein in response to the
query posted by Dr. Thomas M. Greiner. It follows with some requested
editing and reformatting by me. -the morphmet moderator (dslice)]

Procrustes distance is one of a large family of occasionally reasonable
measures that permit a sorting of all pairs of specimens from your
original data set in decreasing order of an intuitively sensible kind of
"similarity."  The Procrustes distance corresponds to a geometry in
which each shape is a point on a curved manifold. The intuition about
similarity comes from the claim that one is free to move, resize, or
rotate a specimen in any way whatsoever without any effect on that
subjective "similarity," which otherwise should come as close as
possible to ordinary Euclidean distance (but why?). But whether or not
that Procrustes distance is a defensible idea,
most of our biometric statistical methods don't work with distances,
which is to say that they don't work on curving manifolds, anyway.
Instead they work only in linear spaces that require data sets to be
expressed as vectors of variable values, not matrices of interspecimen
distances.
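
As a concrete illustration of that invariance claim, here is a minimal
sketch in Python with NumPy. The function names (preshape,
procrustes_distance) and the random landmark data are assumptions made
up for this example, not part of any particular morphometrics package.
The distance computed is the angular (Riemannian) Procrustes distance,
with reflections excluded.

```python
import numpy as np

def preshape(X):
    """Center a (landmarks x dims) array and scale it to unit centroid size."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc)

def procrustes_distance(X, Y):
    """Angular Procrustes distance between two configurations (rotations only)."""
    A, B = preshape(X), preshape(Y)
    U, s, Vt = np.linalg.svd(A.T @ B)
    # If the best orthogonal fit would be a reflection, flip the smallest
    # singular value so that only proper rotations are considered.
    if np.linalg.det(U @ Vt) < 0:
        s[-1] = -s[-1]
    return np.arccos(np.clip(s.sum(), -1.0, 1.0))

# Hypothetical data: two random 6-landmark configurations in the plane.
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))
Y = rng.normal(size=(6, 2))

# Move, resize, and rotate Y; its distance to X should not change.
t = 0.7
R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
Y_moved = 3.0 * Y @ R + np.array([10.0, -4.0])

d_before = procrustes_distance(X, Y)
d_after = procrustes_distance(X, Y_moved)
print(d_before, d_after)
```

The two printed values should agree up to floating-point error, and the
distance from Y to its own moved copy should be numerically zero.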

The tangent spaces you're asking about are exactly the spaces that do
this translating. You need one of these spaces if you are going to look
at correlations between shape and its causes or effects, or if you are
trying to look at modularity or integration of form, symmetry and
asymmetry, growth-gradients or other geometric descriptors of factors of
form, reconstruction of forms or estimation of missing data, and the like.
You don't need this space for significance tests of distances
(similarities) across groups, and there is nothing either "conservative"
or "liberal" about it.  The appropriate questions of that sort deal with
the selection of the Procrustes metric itself, not the conversion to
vectors. You are "conservative" if you think that a similarity measure
must have certain mathematical properties; you are "liberal" if you
think the scientist is free to measure any way he or she wishes.

The particular tangent space you are wrestling with is, by
theorem, the (unique) linear space that supplies the best approximation
to the original Procrustes distances near the Procrustes average form.
If you don't need it, don't use it.  For instance, if all you're doing
is testing differences of group average shape, you don't need the
tangent space; if you want to see a group average shape, you don't need
it (although it gets the right answer); if you want to just look at two
average shapes and write an essay, you don't need it. But if you want to
see a quantitative dissection of the relation between two shapes, or two
average shapes, or see how something correlates with a shape difference,
or visualize a principal coordinate (= principal component = relative
warp) of shape, you do need it.  The choice of tangent space vs. shape
metric is driven by the nature of the question and the sense of
"similarity" that the scientist is using, not by the data per se and
certainly not by the statistics of the data.
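
The distance-only route (testing a difference between groups without
any tangent space) might look like the following sketch. Everything
here is an assumption for illustration: the simulated landmark data,
the function names, and in particular the test statistic (mean
between-group distance minus mean within-group distance), which is a
simple ad hoc choice and not a standard named test. The point is only
that the permutation test consumes a matrix of Procrustes distances and
nothing else.

```python
import numpy as np

def procrustes_distance(X, Y):
    """Angular Procrustes distance between two configurations (rotations only)."""
    A = X - X.mean(axis=0)
    A = A / np.linalg.norm(A)
    B = Y - Y.mean(axis=0)
    B = B / np.linalg.norm(B)
    U, s, Vt = np.linalg.svd(A.T @ B)
    if np.linalg.det(U @ Vt) < 0:
        s[-1] = -s[-1]          # exclude reflections
    return np.arccos(np.clip(s.sum(), -1.0, 1.0))

def distance_matrix(configs):
    """Symmetric matrix of pairwise Procrustes distances."""
    n = len(configs)
    D = np.zeros((n, n))
    for a in range(n):
        for b in range(a + 1, n):
            D[a, b] = D[b, a] = procrustes_distance(configs[a], configs[b])
    return D

def between_minus_within(D, labels):
    """Ad hoc statistic: mean between-group minus mean within-group distance."""
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    return D[~same].mean() - D[same & off_diag].mean()

def permutation_test(D, labels, n_perm=999, seed=0):
    """Permute group labels; only the distance matrix is ever used."""
    rng = np.random.default_rng(seed)
    observed = between_minus_within(D, labels)
    count = sum(
        between_minus_within(D, rng.permutation(labels)) >= observed
        for _ in range(n_perm)
    )
    return observed, (count + 1) / (n_perm + 1)

# Simulated data (hypothetical): two groups of 10 specimens, 6 landmarks each;
# group 1 has one landmark displaced relative to group 0.
rng = np.random.default_rng(2)
base = rng.normal(size=(6, 2))
shifted = base.copy()
shifted[0] += np.array([0.5, 0.0])
configs = [base + 0.02 * rng.normal(size=base.shape) for _ in range(10)]
configs += [shifted + 0.02 * rng.normal(size=base.shape) for _ in range(10)]
labels = np.array([0] * 10 + [1] * 10)

D = distance_matrix(configs)
observed, p_value = permutation_test(D, labels)
print(observed, p_value)
```

Note that this tells you only that the groups differ in "similarity"
terms; it says nothing about how the shapes differ.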

The language of Mahalanobis D's has no relation to this framework -- it
conveys answers to questions about statistical distributions of vectors,
not about similarities -- nor does the language of statistical
significance tests. You need the tangent space for ordination, and you
need it for quantitative description of differences beyond the ineffable
"similarity" that goes into those distance matrices. So the question is
most emphatically NOT "when is shape space projection required or when
is the tangent space projection sufficient?" but "when is analysis of my
arbitrary 'dissimilarity' good enough, and when do I need vectors
instead?".  If distances are enough, you don't ever need a projection,
but of course you then need to argue (1) why it is sufficient just to
talk about dissimilarities, and (2) why the Procrustes distance is the
one you should be using. (This is a difficult argument to win.)  If you
need vectors, you have to get them from some metric space using some
projection. The Procrustes tangent space is the one in which vector
sums-of-squares come closest to minimum Euclidean sums-of-squares in the
(infinitesimal) vicinity of the sample average shape (which can be
determined without any mention of tangent space), and it generates a
great number of geometrically valid diagrams (uniform shape changes,
relative warps, semilandmarks, thin-plate splines, etc.) that often
extend the scientist's original intuition of what shape similarity was
supposed to entail.
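
To make the "best approximation near the average" statement concrete,
the following sketch fits a plain generalized Procrustes average,
projects the aligned specimens orthogonally onto the tangent space at
that average, and compares Euclidean distances among the resulting
vectors with the original Procrustes distances. It then runs an
ordinary principal component analysis on the tangent coordinates. The
data are simulated, the function names are invented for this example,
and the fitting loop is a bare-bones version assumed adequate for
tightly clustered samples; treat it as a sketch, not a reference
implementation.

```python
import numpy as np

def preshape(X):
    """Center a (landmarks x dims) array and scale it to unit centroid size."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc)

def rotate_onto(X, ref):
    """Rotate preshape X (rotations only, no reflection) to best fit ref."""
    U, _, Vt = np.linalg.svd(X.T @ ref)
    D = np.eye(X.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
    return X @ U @ D @ Vt

def gpa(configs, n_iter=10):
    """Plain generalized Procrustes fit: aligned preshapes and their mean."""
    shapes = np.array([preshape(X) for X in configs])
    mean = shapes[0]
    for _ in range(n_iter):
        shapes = np.array([rotate_onto(S, mean) for S in shapes])
        mean = preshape(shapes.mean(axis=0))
    # Final pass so every specimen is aligned to the returned mean.
    shapes = np.array([rotate_onto(S, mean) for S in shapes])
    return shapes, mean

def tangent_coords(shapes, mean):
    """Orthogonal projection of aligned preshapes onto the tangent space at mean."""
    m = mean.ravel()
    V = shapes.reshape(len(shapes), -1)
    return V - np.outer(V @ m, m)

def procrustes_distance(A, B):
    """Angular Procrustes distance between two preshapes (rotations only)."""
    U, s, Vt = np.linalg.svd(A.T @ B)
    if np.linalg.det(U @ Vt) < 0:
        s[-1] = -s[-1]
    return np.arccos(np.clip(s.sum(), -1.0, 1.0))

# Simulated sample (hypothetical): 20 noisy copies of one 8-landmark form,
# each arbitrarily rotated, rescaled, and translated.
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 2))
configs = []
for _ in range(20):
    t = rng.uniform(0, 2 * np.pi)
    R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
    noisy = base + 0.01 * rng.normal(size=base.shape)
    configs.append(rng.uniform(0.5, 2.0) * noisy @ R + rng.normal(size=2))

shapes, mean = gpa(configs)
T = tangent_coords(shapes, mean)      # one row vector per specimen

# Tangent-space Euclidean distances versus Procrustes distances, all pairs.
i, j = np.triu_indices(len(T), k=1)
d_tan = np.linalg.norm(T[i] - T[j], axis=1)
d_proc = np.array([procrustes_distance(shapes[a], shapes[b]) for a, b in zip(i, j)])
max_rel_err = np.max(np.abs(d_tan - d_proc) / d_proc)

# Ordinary PCA of the tangent coordinates via the SVD.
Tc = T - T.mean(axis=0)
U, s, Vt = np.linalg.svd(Tc, full_matrices=False)
scores = U * s                        # principal component scores
print(max_rel_err, s[:3])
```

For a tight sample like this one the two sets of distances should agree
closely, and the singular values should drop to numerical zero after
2k - 4 = 12 dimensions (k = 8 planar landmarks, less position, size,
and rotation). The agreement is expected to degrade as specimens lie
farther from the average.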

In short, if you only care about significance tests, you have no use for
tangent spaces, only for distance matrices.  If you want to understand
biological processes, you can't really make any progress just by talking
about similarities; you need the language of vector spaces, and the
tangent space to the Kendall shape manifold is the optimal vector space
for most of these purposes.

Fred Bookstein

-- 
Replies will be sent to the list.
For more information visit http://www.morphometrics.org
