RE: [MORPHMET2] Questions about Kendall’s shape space and tangent space projection

f.james.ro...@stonybrook.edu Wed, 08 Sep 2021 19:05:37 -0700

Good questions. The topic can be confusing and difficult to visualize - 
especially for 3D landmark data.


The 1999 paper in the Journal of Classification that Adams mentions was my 
attempt to describe its practical relevance in morphometrics. My 1999 paper in 
Hystrix may also be of interest to you. You may also wish to play with my 
tpsTri software, as it was used for some of the figures in those papers.

It is easiest at first to think just of variation in the shapes of triangles 
in 2 dimensions. As you mention, Kendall's shape space for triangles can be 
visualized as the surface of a sphere of radius 1/2. What convinced me of the 
importance of Kendall's shape space was that Kendall showed that the 
distribution of all possible triangles is uniform in Kendall's shape space. 
After a GPA the distribution of shapes is on a hemisphere of radius 1 
(corresponding to a centroid size of 1). I am not sure if anyone has given a 
good name to this hemisphere, which corresponds to all possible triangles 
Procrustes aligned to any single shape. In 1999 I called it a "preshape space 
of triangles aligned to a reference triangle". Not very snappy! Perhaps I 
should have called it something like the "Slice hemisphere", as Dennis Slice 
first showed it to me and was puzzled why it was not the surface of a sphere.
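
As a small illustration of the triangle case, here is a minimal Python sketch 
(standard library only) that writes 2D landmarks as complex numbers, forms 
preshapes (centered, centroid size 1), and computes the Procrustes distance rho 
as the arc cosine of the absolute value of the complex inner product. The 
function names preshape and procrustes_distance are hypothetical helpers made 
up for this example; they are not taken from tpsTri or any other package. On 
the sphere of radius 1/2, rho is the great-circle distance, so the two 
equilateral triangles of opposite orientation are pi/2 apart.

```python
import cmath
import math


def preshape(landmarks):
    """Center a 2D configuration (complex numbers) and scale it to centroid size 1."""
    centroid = sum(landmarks) / len(landmarks)
    centered = [z - centroid for z in landmarks]
    size = math.sqrt(sum(abs(z) ** 2 for z in centered))
    return [z / size for z in centered]


def procrustes_distance(a, b):
    """Angle rho (radians, 0..pi/2) between two shapes; abs() removes rotation."""
    za, zb = preshape(a), preshape(b)
    inner = sum(x * y.conjugate() for x, y in zip(za, zb))
    return math.acos(min(1.0, abs(inner)))


right = [0 + 0j, 1 + 0j, 0 + 1j]
# The same triangle after an arbitrary scaling, rotation and translation.
moved = [3 * cmath.exp(0.7j) * z + (2 + 5j) for z in right]

equilateral = [0 + 0j, 1 + 0j, complex(0.5, math.sqrt(3) / 2)]
# Reflection (complex conjugation) gives the equilateral triangle of
# opposite orientation, with the landmark labels kept in the same order.
mirrored = [z.conjugate() for z in equilateral]

d_same = procrustes_distance(right, moved)            # essentially zero
d_poles = procrustes_distance(equilateral, mirrored)  # the maximum, pi/2
print(d_same, d_poles)
```

The first distance is zero up to rounding because position, size and 
orientation do not affect shape; the second is the largest distance possible 
between two triangles.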

The distribution of triangles is not, however, uniform on the surface of the 
hemisphere. As you mention, conventional multivariate methods assume linear 
spaces and use linear matrix algebra. The tangent space is, as you mention, the 
projection of points from the surface of the hemisphere onto a plane that 
touches the hemisphere only at the point that corresponds to the reference 
triangle (not really a "reference", just the mean shape in practical 
applications). I have called it Kendall's tangent space but I probably should 
have named it after John Kent, as he influenced my understanding. Something I 
found fascinating was that the distribution of all possible triangles was again 
uniform in the projection within the circular distribution (Kendall showed that 
also). The importance, to me, of being uniform was that if I see some pattern 
in the distribution (clusters, covariance, etc.) then it implies something 
about the distribution of shapes and is not just an artifact of the 
mathematical operations used to create the projection (as in the case of EDMA 
and some other earlier statistical approaches suggested for analyzing shape 
variation).
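
The projection step can be sketched in a few lines of Python (standard library 
only). This is only an illustration under stated assumptions: 2D landmarks 
written as complex numbers, a fixed reference triangle instead of an estimated 
mean, and an orthogonal projection onto the tangent plane. The helper names 
preshape and tangent_coordinates are hypothetical, not from any package.

```python
import math


def preshape(landmarks):
    """Center a 2D configuration (complex numbers) and scale it to centroid size 1."""
    centroid = sum(landmarks) / len(landmarks)
    centered = [z - centroid for z in landmarks]
    size = math.sqrt(sum(abs(z) ** 2 for z in centered))
    return [z / size for z in centered]


def tangent_coordinates(reference, specimen):
    """Rotate the specimen onto the reference, then project orthogonally
    onto the plane tangent to the hemisphere at the reference."""
    m, x = preshape(reference), preshape(specimen)
    inner = sum(a * b.conjugate() for a, b in zip(m, x))
    # Optimal rotation; assumes the two shapes are not exactly pi/2 apart,
    # in which case inner would be zero.
    rotation = inner / abs(inner)
    aligned = [rotation * z for z in x]
    # Component of the aligned specimen along the reference direction.
    cos_rho = sum((a * b.conjugate()).real for a, b in zip(aligned, m))
    # Remove that component: what is left lies in the tangent plane.
    tangent = [a - cos_rho * b for a, b in zip(aligned, m)]
    return m, aligned, tangent


reference = [0 + 0j, 1 + 0j, 0 + 1j]
specimen = [0.05 + 0.02j, 1.1 - 0.03j, -0.04 + 0.9j]
m, aligned, tangent = tangent_coordinates(reference, specimen)

# The tangent coordinates have no component left along the reference.
along_reference = sum((t * r.conjugate()).real for t, r in zip(tangent, m))
print(along_reference)
```

The aligned specimen still has centroid size 1 and so sits on the hemisphere; 
its tangent coordinates are orthogonal to the reference and so sit in a flat 
(linear) space.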

Of course, for very small variation in shape the points will be close to their 
mean (and thus close to their projections), so that distances on the surface 
of the hemisphere and distances in the tangent space will be very similar 
(though distance in the tangent space will be a little larger due to the 
projection). That might be good enough for some studies, but if one does not do 
the projection then you will find that a PCA of your GPA-aligned data 
(assuming large n > 2p-4) will not yield 4 zero eigenvalues as it should with 
centering, rotation, and size removed. Only 3 will be zero (i.e., 
computationally, numbers like 10^-15 or so). The 4th smallest might be "only" 
10^-8 or so. That is a result of the curved shape of the hemisphere. If you do 
the projection then the last 4 eigenvalues will be essentially zero, as the 
curvature is now gone.
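
The effect of the curvature can be seen without a full PCA. The following is a 
rough Python sketch (standard library only), not a reproduction of any 
published analysis. It assumes 2D triangles written as complex numbers, small 
simulated Gaussian noise, and alignment to a fixed reference rather than a 
GPA mean, and it looks only at the variance along the single direction of the 
reference, which is the direction in which the hemisphere curves away from the 
tangent plane. The helper names preshape and align are hypothetical.

```python
import math
import random
import statistics


def preshape(landmarks):
    """Center a 2D configuration (complex numbers) and scale it to centroid size 1."""
    centroid = sum(landmarks) / len(landmarks)
    centered = [z - centroid for z in landmarks]
    size = math.sqrt(sum(abs(z) ** 2 for z in centered))
    return [z / size for z in centered]


def align(m, specimen):
    """Rotate the preshape of specimen to best fit the reference preshape m."""
    x = preshape(specimen)
    inner = sum(a * b.conjugate() for a, b in zip(m, x))
    rotation = inner / abs(inner)
    return [rotation * z for z in x]


def score_along(config, m):
    """Real inner product of a configuration with the reference direction."""
    return sum((a * b.conjugate()).real for a, b in zip(config, m))


random.seed(0)  # fixed seed so the run is repeatable
reference = [0 + 0j, 1 + 0j, 0 + 1j]
m = preshape(reference)

before, after = [], []
for _ in range(200):
    noisy = [z + complex(random.gauss(0, 0.02), random.gauss(0, 0.02))
             for z in reference]
    x = align(m, noisy)
    cos_rho = score_along(x, m)                     # varies with shape distance
    t = [a - cos_rho * b for a, b in zip(x, m)]     # tangent projection
    before.append(cos_rho)
    after.append(score_along(t, m))

var_before = statistics.pvariance(before)  # small but clearly not zero
var_after = statistics.pvariance(after)    # zero up to rounding error
print(var_before, var_after)
```

Without the projection the variance in that direction is tiny but far above 
rounding error; after the projection it drops to the level of the other null 
directions.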

An alternative is to perform the multivariate analysis directly in Kendall's 
shape space. Kent (I cannot locate the references right now, but it was in the 
early 2000s) showed that one could, for example, perform a generalization of a 
PCA directly on the curved surface. It had some odd properties, as the 
eigenvectors were great circles on the curved surface (as I remember).

One can generalize, of course, from triangles to shapes with more landmarks and 
to landmarks in 3 dimensions. The 3-dimensional case is more complicated to 
visualize because the simplest case requires 5 dimensions to represent, not 
just 3, so one cannot just look at the space. It also has some more complicated 
properties. You could look at the book "Shape and Shape Theory" by Kendall, 
Barden, Carne, and Le (1999). It presents a way to visualize variation in the 
5-dimensional shape space. It was not an easy read for me but perhaps I should 
try again!

This distinction between shape space and tangent space is not of much 
importance in practical applications, where biological variation tends to be 
small compared to all possible variation among p landmarks and where one 
usually only looks at the distribution along the first few eigenvectors with 
the largest eigenvalues. Still, I prefer to have computations match what one 
expects theoretically rather than just being good approximations. When 
programming, being just "close enough" could hide subtle bugs. Getting rid of 
that known artifact also allows one to try to interpret the eigenvectors with 
the smallest eigenvalues, as they correspond to the most stable aspects of 
possible shape variation (i.e., least varying due to development, environment, 
etc.).

Does this help or confuse more?

F. James Rohlf
Distinguished Professor, Emeritus and Research Professor
Depts: Anthropology and Ecology & Evolution
Stony Brook University

On 9/8/2021 2:26:38 PM, Adams, Dean [EEOB] <dcad...@iastate.edu> wrote:
Karolin,
 
A reading of Rohlf 1999 may help.
 
Dean
 
Rohlf, F.J. 1999. Shape statistics: Procrustes superimpositions and tangent 
spaces. Journal of Classification. 16:197-223.
 
Dr. Dean C. Adams
Distinguished Professor of Evolutionary Biology
Director of Graduate Education, EEB Program
Department of Ecology, Evolution, and Organismal Biology
Iowa State University
https://faculty.sites.iastate.edu/dcadams/
phone: 515-294-3834
 
From: morphmet2@googlegroups.com <morphmet2@googlegroups.com> On Behalf Of 
karolin....@gmail.com
Sent: Tuesday, September 7, 2021 7:04 AM
To: Morphmet <morphmet2@googlegroups.com>
Subject: [MORPHMET2] Questions about Kendall’s shape space and tangent space 
projection
 
Dear Morphometricians,

I am currently trying to understand the mathematical background of 
landmark-based geometric morphometrics. Some questions arose that we could not 
answer during discussions in our lab, which is why I hope you can help. Many 
thanks in advance!

The first question is: What exactly is “Kendall’s shape space”? If I understand 
Kendall’s (1984) statement in Eq. 4 correctly, the shape space is a quotient 
space; the elements are equivalence classes of pre-shapes (a fiber on the 
pre-shape sphere becomes one element in shape space). The elements of the 
equivalence classes have fewer “coordinates” (vector elements) than the original 
landmark configuration and lie on a hyperdimensional sphere with a radius of 1. 
In Theorem 2 Kendall (1984) states that the shape space for triangles is 
isometric to a three-dimensional sphere with a radius of 0.5. The triangles on 
this sphere with a radius of 0.5 are represented by three Cartesian coordinates 
that are calculated from the original landmark configuration (Kendall 1984, 
section 5), whereas the triangles are represented by equivalence classes in 
shape space.
In several publications I now find illustrations of a hemisphere of radius 1 
and a sphere of radius 0.5 (both share one point at the pole); those 
publications usually use the full landmark set. The sphere of radius 0.5 is 
often termed “Kendall’s shape space” (sometimes with a reference to triangles, 
sometimes not). So, how does this fit with the definitions and statements in 
Kendall (1984)? Is there a publication that extends Kendall (1984) to the use 
of full landmark configurations and explains how they are (mathematically) 
related to the sphere with radius 0.5 (for all numbers and dimensions of 
landmarks)? Related to this question: what do the points on the sphere of 
radius 0.5 in those publications look like? Are they equivalence classes, full 
landmark configurations, or 3 Cartesian coordinates representing triangles? Are 
they really scaled to unit centroid size as the shapes on the pre-shape sphere 
[= elements of equivalence classes in shape space]?

The second question is: Why do we need a tangent space projection? I understand 
that the superimposed landmark configurations lie on a hyper-hemisphere and I 
know the argument that standard statistical procedures need a linear space. 
Yet, the superimposed landmark configurations are matrices or vectors, 
depending on how they are formatted, for which we can compute Euclidean 
distances. Where exactly do the statistical tests go wrong if we use the 
superimposed landmark configurations without tangent space projection and 
calculate Euclidean distances?
If I, for example, think about MANOVAs as suggested by Anderson (2001, Austral 
Ecology), I guess that the mean shapes of the groups need to be calculated to 
be able to calculate the different sums of squares. If the mean “shape” is 
calculated simply by averaging each coordinate within each group, the 
resulting mean “shape” of each group lies inside the hyper-hemisphere of 
radius 1. So the mean “shape” is not a shape because its centroid size is not 
standardized. Yet, if I did all distance calculations correctly (see attached 
R-script 
“Compare_distance_measures_in_original_and_tangent_space.R”), I find that the 
Euclidean distances between the mean “shapes” inside the hyper-hemisphere are 
slightly closer to the corresponding Procrustes distances than the Euclidean 
distances in tangent space; the Procrustes distances have been calculated by 
rescaling the mean “shapes” to unit centroid size followed by determining the 
arc length between them. If the mean “shapes” inside the hyper-hemisphere are 
rescaled to unit centroid size, then the Euclidean distances between them are 
even closer to the Procrustes distances.
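
The point about the mean lying inside the hemisphere can be checked with a 
short Python sketch (standard library only). It is an illustration under 
assumptions, not a rerun of the attached R scripts: 2D triangles written as 
complex numbers, simulated Gaussian noise, and alignment to a fixed reference 
rather than a full GPA. The helper names preshape and align are hypothetical.

```python
import math
import random


def preshape(landmarks):
    """Center a 2D configuration (complex numbers) and scale it to centroid size 1."""
    centroid = sum(landmarks) / len(landmarks)
    centered = [z - centroid for z in landmarks]
    size = math.sqrt(sum(abs(z) ** 2 for z in centered))
    return [z / size for z in centered]


def align(m, specimen):
    """Rotate the preshape of specimen to best fit the reference preshape m."""
    x = preshape(specimen)
    inner = sum(a * b.conjugate() for a, b in zip(m, x))
    rotation = inner / abs(inner)
    return [rotation * z for z in x]


random.seed(1)  # fixed seed so the run is repeatable
reference = [0 + 0j, 1 + 0j, 0 + 1j]
m = preshape(reference)

# Every aligned specimen has centroid size 1, i.e. lies on the hemisphere.
aligned = []
for _ in range(50):
    noisy = [z + complex(random.gauss(0, 0.05), random.gauss(0, 0.05))
             for z in reference]
    aligned.append(align(m, noisy))

# Coordinate-wise average over specimens, landmark by landmark.
mean_config = [sum(column) / len(aligned) for column in zip(*aligned)]
# The average is still centered, so its norm is its centroid size.
mean_size = math.sqrt(sum(abs(z) ** 2 for z in mean_config))

# Rescaling puts the mean back on the hemisphere.
rescaled = [z / mean_size for z in mean_config]
rescaled_size = math.sqrt(sum(abs(z) ** 2 for z in rescaled))
print(mean_size, rescaled_size)
```

The average of points on a sphere lies strictly inside it unless all points 
coincide, so the centroid size of the averaged configuration is below 1 until 
it is rescaled.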
In addition, if I simulate groups of landmark configurations, superimpose them 
without and with tangent space projection, and test for significant differences 
between the groups, I feel that the decision on the significance of the group 
differences is correct slightly more often if it is based on the superimposed 
landmark sets without tangent space projection (not exhaustively or formally 
tested; see R-script 
“compare_ProcrustesMANOVA_in_original_and_tangent_space.R”).

And one last, more general question: If all landmark configurations are 
superimposed onto a common mean shape, does this also minimize the Procrustes 
distances (measured as arc length) between all pairs of landmark configurations 
and between the mean shapes of sub-groups of landmark configurations?

Thanks a lot for your insights!

Kind regards
Karo
--
Dr. Karolin Engelkes
Institute of Evolutionary Biology and Animal Ecology
University of Bonn

Phone: +49 (0) 228 73 5481

An der Immenburg 1
53121 Bonn
Germany
--
You received this message because you are subscribed to the Google Groups 
"Morphmet" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to morphmet2+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/morphmet2/ac177d3d-b28a-41b2-bfb6-a7a9f8cd73d0n%40googlegroups.com.

