RE: [MORPHMET] Interpreting PCA results

F. James Rohlf Sun, 14 May 2017 17:39:07 -0700

I agree with these comments but would like to add another point. I prefer to 
think that the purpose of the PCA is to produce a low-dimension space that 
captures as much of the overall variation (in a least-squares sense) as 
possible. Within that space there is no need to limit the visualizations to the 
extremes of each axis – one can investigate any direction within that space if 
there is a pattern in the data that suggests an interesting direction. The 
directions of the axes are mathematical constructs and not based on any 
biological principles. Perhaps one sees some clusters in the PCA ordination but 
the variation within or between clusters need not be parallel to one of the PC 
axes. One can then look in other directions. That is why the tpsRelw program 
allows one to visualize any point in the ordination space – not just parallel 
to the axes. That means for publication one has to decide which directions are 
of interest – not just mechanically display the extremes of the axes.
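
A minimal sketch of this idea in Python with NumPy, assuming the specimens have already been aligned and flattened into one row of coordinates each. It is not how tpsRelw is implemented; the function names (pca_from_aligned, shape_at) and the random data are made up for illustration. It reconstructs the shape at an arbitrary point of the PC1-PC2 plane rather than at an extreme of one axis.

```python
import numpy as np

def pca_from_aligned(coords):
    """PCA of aligned data; coords has shape (n_specimens, n_variables)."""
    mean = coords.mean(axis=0)
    centered = coords - mean
    # SVD of the centered data: the rows of vt are the PC directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    sd = s / np.sqrt(coords.shape[0] - 1)  # std. dev. of the scores on each PC
    return mean, vt, sd

def shape_at(mean, axes, scores):
    """Shape at a point given by its scores on the first len(scores) PCs."""
    scores = np.asarray(scores, dtype=float)
    return mean + scores @ axes[:len(scores)]

# Made-up stand-in for aligned coordinates: 30 specimens, 4 landmarks in 2D.
rng = np.random.default_rng(0)
coords = rng.normal(size=(30, 8))
mean, axes, sd = pca_from_aligned(coords)

# A point along an oblique direction in the PC1-PC2 plane, not on either axis.
direction = np.array([1.0, 1.0]) / np.sqrt(2.0)
oblique = shape_at(mean, axes, 0.05 * direction)
```

Any direction suggested by the scatter (for example, the line joining two cluster centroids in the ordination) could be passed in place of the hypothetical `direction` above.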


 

----------------------

F. James Rohlf New email: f.james.ro...@stonybrook.edu

Distinguished Professor, Emeritus. Dept. of Ecol. & Evol.

& Research Professor. Dept. of Anthropology

Stony Brook University 11794-4364

WWW: http://life.bio.sunysb.edu/morph/rohlf

Please consider the environment before printing this email 

 

From: K. James Soda [mailto:k.jamess...@gmail.com] 
Sent: Sunday, May 14, 2017 7:28 PM
To: dsbriss_dmd <orthofl...@gmail.com>
Cc: MORPHMET <morphmet@morphometrics.org>
Subject: Re: [MORPHMET] Interpreting PCA results

 

Dear David,

Great question!  I disagree with the statement that the samples' variance in 
shape space is not biologically real or, perhaps more accurately, is less real 
than the variance in any other space.  As far as I see it, the basic strategy 
in any biostatistical study, be it GM or otherwise, is that a researcher 
represents a real biological population as an abstract statistical population.  
They then use this abstract statistical population as a proxy for the real one 
so that inferences in the statistical space have implications for the real 
world.  

For example, a PCA finds a direction in the statistical space in which the 
statistical population tends to be spread out.  This is interesting to the 
researcher because this direction has a correspondence to certain real world 
variables.  As a result, the PCA tells the researcher in what ways the real 
population tends to vary.  The key point, though, is that the researcher 
transitioned from the statistical space to the real world.

Moving from shape space to the real world is no different in principle.  We 
have a real population of specimens, whose shapes are of interest to us, and we 
represent them using vectors of shape variables.  The vectors are abstractions; 
it is not as if we can hold a vector in our hands.  However, this is irrelevant 
because they are just proxies, no less real than any other quantitative 
representation.  What matters is if we can tie them back to the real world.  
This is why morphometricians implement a visualization step.  In a PCA, the PCs 
describe how our proxies vary, and we visualize in order to see how this 
variation appears in the real world.  It is infeasible to visualize every point 
along this axis, so we instead visualize a handful.  Since the core goal in PCA 
(at least in this context) is to describe variance, we generally describe the 
locations where a visualization occurs in units of standard deviations from the 
mean.  We could use absolute distances along an axis, but this is probably more 
arbitrary than standard deviation units.  The standard deviations come from the 
data's distribution, whereas the absolute distance is really only well-defined 
in the mathematical space.
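
The visualization step described above can be sketched as follows, assuming aligned coordinates stored one specimen per row. The function name warp_along_pc, the multipliers, and the random data are illustrative assumptions, not taken from any particular morphometrics package. Each visualized shape is the mean plus a multiple of the PC's standard deviation times the PC's unit direction.

```python
import numpy as np

def warp_along_pc(coords, pc=0, k=(-3, 0, 3)):
    """Return shapes at k standard deviations from the mean along one PC."""
    mean = coords.mean(axis=0)
    centered = coords - mean
    # Rows of vt are unit-length PC directions; s holds the singular values.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Standard deviation of the specimens' scores on the chosen PC.
    sd = s[pc] / np.sqrt(coords.shape[0] - 1)
    return {m: mean + m * sd * vt[pc] for m in k}, sd

# Made-up stand-in for aligned coordinates: 40 specimens, 5 landmarks in 2D.
rng = np.random.default_rng(0)
coords = rng.normal(scale=0.01, size=(40, 10))
warps, sd = warp_along_pc(coords, pc=0, k=(-3, 0, 3))

# Back to an (n_landmarks, 2) array for plotting or a deformation grid.
landmarks_plus3 = warps[3].reshape(-1, 2)
```

Because the multiplier is expressed in standard deviations of the sample's own scores, the same value of k means "equally unusual" on every PC, which an absolute distance in the space would not.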

To summarize: i) Nearly all quantitative analyses involve an abstraction to a 
mathematical space.  ii) The description of points in a mathematical space is 
useful to the researcher because the researcher is able to translate the 
abstract mathematical space into a real world interpretation.  iii) In GM, the 
shape variables are traditionally translated into the real world via 
visualization.  Ergo, morphometricians often interpret PCA results via 
visualizations along individual PCs.  To aid in interpretation, this tends to 
occur in standard deviation units because the standard deviation is more easily 
tied to the real world than an arbitrarily selected unit of distance.

Perhaps some of these points are up for debate, but remember that statistics is 
largely the study of VARIATION.  If the variation in shape space did not have 
any biological significance, almost no analysis after alignment would be 
possible.

Hope somewhere in this long commentary, you found something helpful,

James

 

On Tue, May 9, 2017 at 4:56 PM, dsbriss_dmd <orthofl...@gmail.com> wrote:

Good afternoon all, I have a question about interpretation of PCs.  I have come 
across several articles in orthodontic literature having to do with 
morphometric analysis of sagittal cephalograms that discuss warping a 
Procrustes analysis along a principal component axis.  Essentially the authors 
discuss finding whatever principal components represent shape variance, then 
determining the standard deviation(s) of those PC's, and applying the standard 
deviations to the Procrustes shape to warp the average shape plus or minus.  So 
if you have an average normodivergent Procrustes shape, one warp perhaps in the 
negative direction might give you a brachycephalic shape, while the opposite 
would give you a dolichocephalic shape.  But I don't know where this idea comes 
from.  I have been involved with 8 or 9 morphometrics projects over the last 
few years and I have never been able to figure this out or the rationale for 
performing such an application with the PC results.

 

As an example of what I am talking about, here is a passage from the Journal of 
Clinical & Diagnostic Research, doi: 10.7860/JCDR/2015/8971.5458 
<https://dx.doi.org/10.7860%2FJCDR%2F2015%2F8971.5458>

 

"Here, the first 2 PCs are shown & the Average shape (middle) was warped by 
applying each PC by amount equal to 3 standard deviations in negative (left) 
and positive (right) direction {[Table/Fig-10 
<https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4347171/figure/F10/>]: PC1 with 
standard deviation, [Table/Fig-11 
<https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4347171/figure/F11/>] PC 2 with 
standard deviation}."

 

I did not include the graphs from the article but if it would help to answer 
this question I can supply them.

 

What I do not quite understand is what exactly is the purpose of applying 
standard deviation(s) to the PCA and then warping the Procrustes average shape 
to these standard deviations?  Maybe my understanding of PCA is limited, but I 
was under the impression that in GPA the principal components are only 
statistical variance, and don't represent something biologically real.  So to 
see how an individual varies from the shape average you have to go back and 
look at whatever landmark(s) represent that specific individual and compare 
that shape to the Procrustes average.  Maybe this is not correct?  
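
On the relationship between an individual's PC scores and its difference from the Procrustes average, a small numerical sketch may help. It assumes aligned coordinates stored one specimen per row; the data here are random stand-ins. The PC scores are the specimens' deviations from the mean expressed in rotated axes, so rotating back and adding the mean recovers each individual's own coordinates.

```python
import numpy as np

# Made-up stand-in for aligned coordinates: 20 specimens, 3 landmarks in 2D.
rng = np.random.default_rng(1)
coords = rng.normal(size=(20, 6))

mean = coords.mean(axis=0)
centered = coords - mean          # each specimen's deviation from the mean shape
_, _, vt = np.linalg.svd(centered, full_matrices=False)

scores = centered @ vt.T          # PC scores: the same deviations, rotated
rebuilt = mean + scores @ vt      # rotate back and add the mean
max_error = np.abs(rebuilt - coords).max()  # zero up to floating-point rounding
```

Keeping only the first few columns of `scores` (and rows of `vt`) would give the low-dimensional approximation of each specimen instead of an exact reconstruction.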

 

Thanks in advance, I appreciate any help you can give me.

 

David

 

 

 

-- 
MORPHMET may be accessed via its webpage at http://www.morphometrics.org
--- 
You received this message because you are subscribed to the Google Groups 
"MORPHMET" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to morphmet+unsubscr...@morphometrics.org.

 

