I agree with these comments but would like to add another point. I prefer to think that the purpose of the PCA is to produce a low-dimensional space that captures as much of the overall variation (in a least-squares sense) as possible. Within that space there is no need to limit the visualizations to the extremes of each axis: one can investigate any direction within that space if there is a pattern in the data that suggests an interesting direction. The directions of the axes are mathematical constructs and are not based on any biological principles. Perhaps one sees some clusters in the PCA ordination, but the variation within or between clusters need not be parallel to one of the PC axes. One can then look in other directions. That is why the tpsRelw program allows one to visualize any point in the ordination space, not just points along the axes. That means that for publication one has to decide which directions are of interest, not just mechanically display the extremes of the axes.
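
As a rough illustration of visualizing an arbitrary point in the ordination space, here is a minimal Python/NumPy sketch. It is not how tpsRelw is implemented. The data are synthetic stand-ins for Procrustes-aligned coordinates, the function names `fit_pca` and `shape_at` are hypothetical, and the split into two groups (first 20 and last 20 specimens) is assumed purely for the example.

```python
import numpy as np


def fit_pca(X):
    """PCA of an (n_specimens, n_coords) matrix of aligned shape coordinates."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the PC directions (unit vectors in coordinate space).
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    sd = s / np.sqrt(X.shape[0] - 1)  # standard deviation of the scores on each PC
    scores = Xc @ Vt.T                # specimen coordinates in the PC space
    return mean, Vt, sd, scores


def shape_at(mean, Vt, point):
    """Shape coordinates at an arbitrary point in the space of the first k PCs."""
    point = np.asarray(point, dtype=float)
    return mean + point @ Vt[: point.size]


# Synthetic stand-in for aligned data: 40 specimens, 5 landmarks in 2D.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10)) @ rng.normal(size=(10, 10))
mean, Vt, sd, scores = fit_pca(X)

# Conventional display: the mean shape moved 2 SD along PC1.
plus_2sd_pc1 = shape_at(mean, Vt, [2 * sd[0], 0.0])

# Any other direction: here, the line joining two (assumed) group centroids
# in the PC1-PC2 plane, which need not be parallel to either axis.
centroid_a = scores[:20, :2].mean(axis=0)
centroid_b = scores[20:, :2].mean(axis=0)
steps = np.linspace(0.0, 1.0, 5)
oblique_shapes = [
    shape_at(mean, Vt, centroid_a + t * (centroid_b - centroid_a)) for t in steps
]

# Each result is a flat coordinate vector; reshape to landmarks for plotting
# (assumes x1, y1, x2, y2, ... ordering).
landmarks = oblique_shapes[0].reshape(-1, 2)
```

The same `shape_at` call serves both cases: a point on an axis is just a point whose other coordinates are zero, so displaying the axis extremes is one choice among many rather than something the method requires.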

----------------------
F. James Rohlf
New email: f.james.ro...@stonybrook.edu
Distinguished Professor, Emeritus. Dept. of Ecol. & Evol.
& Research Professor. Dept. of Anthropology
Stony Brook University 11794-4364
WWW: http://life.bio.sunysb.edu/morph/rohlf

From: K. James Soda [mailto:k.jamess...@gmail.com]
Sent: Sunday, May 14, 2017 7:28 PM
To: dsbriss_dmd <orthofl...@gmail.com>
Cc: MORPHMET <morphmet@morphometrics.org>
Subject: Re: [MORPHMET] Interpreting PCA results

Dear David,

Great question! I disagree with the statement that the samples' variance in shape space is not biologically real or, perhaps more accurately, is less real than the variance in any other space.

As far as I see it, the basic strategy in any biostatistical study, be it GM or otherwise, is that a researcher represents a real biological population as an abstract statistical population. They then use this abstract statistical population as a proxy for the real one so that inferences in the statistical space have implications for the real world. For example, a PCA finds a direction in the statistical space in which the statistical population tends to be spread out. This is interesting to the researcher because this direction has a correspondence to certain real-world variables. As a result, the PCA tells the researcher in what ways the real population tends to vary. The key point, though, is that the researcher transitioned from the statistical space to the real world.

Moving from shape space to the real world is no different in principle. We have a real population of specimens, whose shapes are of interest to us, and we represent them using vectors of shape variables. The vectors are abstractions; it is not as if we can hold a vector in our hands. However, this is irrelevant because they are just proxies, no less real than any other quantitative representation. What matters is whether we can tie them back to the real world.
This is why morphometricians implement a visualization step. In a PCA, the PCs describe how our proxies vary, and we visualize in order to see how this variation appears in the real world. It is infeasible to visualize every point along an axis, so we instead visualize a handful. Since the core goal in PCA (at least in this context) is to describe variance, we generally describe the locations where a visualization occurs in units of standard deviations from the mean. We could use absolute distances along an axis, but this is probably more arbitrary than standard deviation units. The standard deviations come from the data's distribution, whereas the absolute distance is really only well-defined in the mathematical space.

To summarize:

i) Nearly all quantitative analyses involve an abstraction to a mathematical space.
ii) The description of points in a mathematical space is useful to the researcher because the researcher is able to translate the abstract mathematical space into a real-world interpretation.
iii) In GM, the shape variables are traditionally translated into the real world via visualization.

Ergo, morphometricians often interpret PCA results via visualizations along individual PCs. To aid in interpretation, this tends to occur in standard deviation units because the standard deviation is more easily tied to the real world than an arbitrarily selected unit of distance.

Perhaps some of these points are up for debate, but remember that statistics is largely the study of VARIATION. If the variation in shape space did not have any biological significance, almost no analysis after alignment would be possible.

Hope somewhere in this long commentary, you found something helpful,

James

On Tue, May 9, 2017 at 4:56 PM, dsbriss_dmd <orthofl...@gmail.com> wrote:

Good afternoon all,

I have a question about interpretation of PCs.
I have come across several articles in the orthodontic literature, having to do with morphometric analysis of sagittal cephalograms, that discuss warping a Procrustes analysis along a principal component axis. Essentially the authors discuss finding whatever principal components represent shape variance, then determining the standard deviation(s) of those PCs, and applying the standard deviations to the Procrustes shape to warp the average shape plus or minus. So if you have an average normodivergent Procrustes shape, one warp, perhaps in the negative direction, might give you a brachycephalic shape, while the opposite would give you a dolichocephalic shape. But I don't know where this idea comes from. I have been involved with 8 or 9 morphometrics projects over the last few years and I have never been able to figure this out, or the rationale for performing such an application with the PC results.

As an example of what I am talking about, here is a passage from the Journal of Clinical & Diagnostic Research, doi: 10.7860/JCDR/2015/8971.5458 <https://dx.doi.org/10.7860%2FJCDR%2F2015%2F8971.5458>:

"Here, the first 2 PCs are shown & the Average shape (middle) was warped by applying each PC by amount equal to 3 standard deviations in negative (left) and positive (right) direction {[Table/Fig-10]: PC1 with standard deviation, [Table/Fig-11] PC 2 with standard deviation}."

Table/Fig-10: <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4347171/figure/F10/>
Table/Fig-11: <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4347171/figure/F11/>

I did not include the graphs from the article, but if it would help to answer this question I can supply them.

What I do not quite understand is this: what exactly is the purpose of applying standard deviation(s) to the PCA and then warping the Procrustes average shape to these standard deviations? Maybe my understanding of PCA is limited, but I was under the impression that in GPA the principal components are only statistical variance, and don't represent something biologically real.
So to see how an individual varies from the shape average you have to go back and look at whatever landmark(s) represent that specific individual and compare that shape to the Procrustes average. Maybe this is not correct?

Thanks in advance, I appreciate any help you can give me.

David

--
MORPHMET may be accessed via its webpage at http://www.morphometrics.org
---
You received this message because you are subscribed to the Google Groups "MORPHMET" group.
To unsubscribe from this group and stop receiving emails from it, send an email to morphmet+unsubscr...@morphometrics.org.