-------- Original Message --------
Subject: How to demonstrate the power of geometric morphometrics?
Date: Wed, 30 Mar 2011 17:27:33 -0400
From: ian sigal <[email protected]>
To: [email protected]
Dear morphometricians,
I am working on a demonstration of geometric morphometrics, but have
run into some trouble. Let me explain; I hope to be clear, even if not
brief.
I want to apply geometric morphometrics to a structure that has
only been studied with traditional morphometrics. Many in my area are…
let’s just say that they don’t like things that sound like math or
statistics. So, in order to convince them that geometric morphometrics
works and is useful I decided to do a demonstration: I produced a
synthetic set of cases using a simplified model of the specimens we
study. The 2D model was parameterized such that I could produce cases
with controlled variations. I varied things like thickness, curvature,
wiggliness, and other things that have generally been measured with
traditional morphometrics. I exported outlines for many cases, which
were then encoded with elliptic Fourier (EF) descriptors. The EF
coefficients were then processed by PCA to identify the mean shape and
the main modes of variation, and each mode was plotted as mean ± 2 SD.
I did the process above first using the program SHAPE, then using
R (with routines described in the book “Morphometrics with R”).
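For anyone who wants to reproduce the idea, here is a minimal sketch
of the pipeline in Python (I used SHAPE and R, but the steps are the
same). It uses plain complex Fourier descriptors of the outline as a
simplified stand-in for full elliptic Fourier analysis, and two
made-up shape factors in place of my curvature and thickness
parameters; all names and numbers below are illustrative, not my
actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_outline(amp2, amp3, n=128):
    # Toy closed outline: a circle perturbed by two independent
    # factors (hypothetical stand-ins for the curvature and
    # thickness parameters of the real model).
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    r = 1.0 + amp2 * np.cos(2 * theta) + amp3 * np.cos(3 * theta)
    return r * np.cos(theta), r * np.sin(theta)

def fourier_descriptors(x, y, harmonics=10):
    # Complex Fourier descriptors of the outline -- a simplified
    # stand-in for full elliptic Fourier analysis.
    c = np.fft.fft(x + 1j * y) / len(x)
    idx = np.r_[1:harmonics + 1, -harmonics:0]  # low +/- harmonics
    kept = c[idx]
    return np.concatenate([kept.real, kept.imag])

# Synthetic sample: two independent imposed variations, the first
# clearly more variable than the second.
n_cases = 200
a = rng.normal(0.0, 0.12, n_cases)   # "curvature-like" factor
b = rng.normal(0.0, 0.06, n_cases)   # "thickness-like" factor
coefs = np.array([fourier_descriptors(*make_outline(ai, bi))
                  for ai, bi in zip(a, b)])

# PCA via SVD of the centered coefficient matrix.
centered = coefs - coefs.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
scores = U * S   # PC scores, one column per principal component

# The two factors move the coefficients along orthogonal directions
# with clearly unequal variance, so PC1 and PC2 recover them.
print(abs(np.corrcoef(scores[:, 0], a)[0, 1]))  # close to 1
print(abs(np.corrcoef(scores[:, 1], b)[0, 1]))  # close to 1
```

In this idealized setting the PCs match the imposed factors because
the factors act along orthogonal coefficient directions and have
clearly different variances; the trouble I describe below starts when
those conditions are not met.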
When I did the process with SHAPE, the main modes of variation
identified (the principal components) matched the imposed variations
nicely: each mode could be related to one of the imposed variations.
PC1 represented curvature and PC2 represented thickness. I was happy.
When I did the process with R, the principal components could
also be interpreted, but they no longer corresponded to the imposed
variations. The first PC, for example, varies from thick and curved to
thin and straight, whereas the second varies from thick and straight
to thin and curved. In other words, instead of curvature and thickness
being separated into two PCs, they are now mixed across two PCs. I
know there is no guarantee that the PCs will be easy to interpret, or
that they will be anything like what I anticipate.
I have no problem with the differences in results, but I don’t
know where they come from, and I am not sure which is the best way to
present this to my community. I have since tested the methods on a
real dataset (using R), and interestingly, the PCs look identical to
those from the synthetic dataset analyzed in R. Thus I have a
synthetic dataset and a real dataset analyzed with the same code and
described by very similar PCs. This suggests that my synthetic dataset
was reasonable, but that was not what I wanted to show! I wanted to
illustrate that interpreting results from geometric morphometrics is
not necessarily much more complex than traditional morphometrics.
My problem is that few in my area understand PCA, let alone are
comfortable interpreting the output. This is why I made the
demonstration case in the first place. If I show that the shape
analysis produces modes of variation that match what I put in, then
they will see the power and believe in the methods. If instead I show
that I put in easy-to-understand variations and the process pulls out
mixed modes of variation, then I will have more trouble convincing
people that the methods are useful.
So I wonder if the forum has any comments on these two questions:
1) I am sure that forum readers have done something similar and
analyzed shapes of known variation to demonstrate or test the methods.
Did your PCs represent the imposed variations well? Did they get all
mixed up like mine? If they got mixed up, do you have any
recommendations on how to convince people that geometric
morphometrics is great?
2) How come one way of doing the analysis finds PCs that match what I
put in, whereas the other finds mixed variations? My guess is that the
variations also change centroid size, so the decomposition into PCs
mixes them up. If I had to bet, I would put my money on a difference
in how SHAPE and the R code do the scaling, but I have tried a few
scalings and always end up with mixed PCs in R. I am not suggesting
that SHAPE is wrong. The differences could also come from differences
in how the points are distributed along the outline between SHAPE and
my own implementation, or from the rotation of the axes in the PCA.
Any comments?
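One mechanism that could explain part of this, sketched below in
Python with made-up numbers and an abstract stand-in for the
coefficient matrix: when two independent factors contribute equal
variance, the top two eigenvalues of the covariance are tied, and any
rotation of the corresponding axes is an equally valid pair of PCs.
Two programs can then both be "right" and still report different,
mixed-looking modes.

```python
import numpy as np

# Two imposed, independent factors with EQUAL variance, sampled on a
# full factorial grid so they are exactly uncorrelated.
a_vals = np.linspace(-0.1, 0.1, 11)
b_vals = np.linspace(-0.1, 0.1, 11)
A, B = np.meshgrid(a_vals, b_vals)
a, b = A.ravel(), B.ravel()

# Hypothetical "coefficient space": each factor moves the shape along
# its own orthonormal direction (stand-ins for EF loadings).
u = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0, 0.0, 0.0, 0.0])
X = np.outer(a, u) + np.outer(b, v)

Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U * S

# The top two singular values are identical, so the solver returns an
# essentially arbitrary basis for that 2D eigenspace: PC1 can be any
# mixture of the two imposed factors.
print(S[0], S[1])       # equal to numerical precision
c1 = np.corrcoef(scores[:, 0], a)[0, 1]
c2 = np.corrcoef(scores[:, 0], b)[0, 1]
print(c1 ** 2 + c2 ** 2)  # ~1: PC1 lies in the plane of the factors
```

In real data the eigenvalues are never exactly tied, but when they
are close, small implementation differences (scaling, point spacing
along the outline) are enough to rotate the axes, which fits what I
am seeing.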
Thank you all for reading this lengthy post.
Ian