I'm currently working on a paper on the problem of over-parameterizing
PCA in morphometrics. My recommendation in the paper is to have at least
3 times as many specimens as variables. For example, 10 2D landmarks give
20 coordinate variables, so you should measure at least 60 specimens. In
simulations, with fewer than 3 specimens per variable the eigenvalues of
a PCA quickly start to diverge from the known true eigenvalues. I did a
literature survey, and about a quarter of morphometrics studies in the
last decade haven't met that standard. A good way to test whether you
have enough samples is a jackknife analysis: if you cut out about 10% of
your observations and still get essentially the same eigenvalues, your
results are probably stable.
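
Here is a minimal sketch of that jackknife check in Python with NumPy. It
uses simulated data (60 specimens, 20 variables), not real Procrustes
coordinates, and the function names, the 200 replicates, and the use of
relative spread as a stability measure are all my assumptions for
illustration, not a fixed recipe.

```python
import numpy as np

def pca_eigenvalues(X):
    """Eigenvalues of the sample covariance of X (rows = specimens), largest first."""
    cov = np.cov(X, rowvar=False)
    return np.linalg.eigvalsh(cov)[::-1]

def jackknife_eigenvalues(X, drop_fraction=0.10, n_reps=200, seed=0):
    """Recompute the eigenvalues after repeatedly deleting a random
    fraction of the specimens (delete-d jackknife)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_keep = n - max(1, int(round(drop_fraction * n)))
    reps = np.empty((n_reps, X.shape[1]))
    for i in range(n_reps):
        keep = rng.choice(n, size=n_keep, replace=False)
        reps[i] = pca_eigenvalues(X[keep])
    return reps

# Simulated example: 20 variables (e.g. 10 2D landmarks), 60 specimens.
rng = np.random.default_rng(1)
n_specimens, n_variables = 60, 20
true_sd = np.linspace(3.0, 0.5, n_variables)   # assumed true standard deviations
X = rng.normal(size=(n_specimens, n_variables)) * true_sd

full = pca_eigenvalues(X)          # eigenvalues from all specimens
reps = jackknife_eigenvalues(X)    # eigenvalues from each 90% subsample

# Spread of each eigenvalue across subsamples, relative to the full-sample value.
rel_spread = reps.std(axis=0) / full
print(np.round(rel_spread[:5], 3))
```

Small relative spreads for the leading eigenvalues suggest they are
stable; what counts as "small" is a judgment call for your data.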
I hope this helps.
On Wed, May 31, 2017 at 1:31 PM, mitte...@univie.ac.at wrote:
> Adding more (semi)landmarks inevitably increases the spatial resolution
> and thus allows one to capture finer anatomical details - whether relevant
> to the biological question or not. This can be advantageous for the
> reconstruction of shapes, especially when producing 3D morphs by warping
> dense surface representations. Basic developmental or evolutionary trends,
> group structures, etc., often are visible in an ordination analysis with a
> smaller set of relevant landmarks; finer anatomical resolution does not
> necessarily affect these patterns. However, adding more landmarks cannot
> reduce or even remove any signals that were found with fewer landmarks, but
> it can make ordination analyses and the interpretation of distances and
> angles in shape space more challenging.
> An excess of variables (landmarks) over specimens does NOT pose problems
> to statistical methods such as the computation of mean shapes and
> Procrustes distances, PCA, PLS, and the multivariate regression of shape
> coordinates on some independent variable (shape regression). These methods
> are based on averages or regressions computed for each variable separately,
> or on the decomposition of a covariance matrix.
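
A minimal NumPy sketch of this point, using simulated data rather than
real shape coordinates (the sizes 15 and 40 are arbitrary assumptions):
a PCA computed by SVD of the centered data runs without any matrix
inversion even when variables outnumber specimens, although at most
n - 1 eigenvalues are non-zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n_specimens, n_variables = 15, 40        # more variables than specimens
X = rng.normal(size=(n_specimens, n_variables))

Xc = X - X.mean(axis=0)                  # center each variable

# PCA via SVD of the centered data matrix; nothing is inverted.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
eigenvalues = s**2 / (n_specimens - 1)   # variances along the PCs
scores = U * s                           # PC scores, one row per specimen

# Count eigenvalues that are non-zero up to numerical precision.
n_nonzero = int(np.sum(eigenvalues > 1e-10 * eigenvalues[0]))
print(n_nonzero)                         # at most n_specimens - 1
```
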
> Other techniques, including Mahalanobis distance, DFA, CVA, CCA, and
> relative eigenanalysis, require the inversion of a full-rank covariance
> matrix, which implies an excess of specimens over variables. The same
> applies to many multivariate parametric test statistics, such as
> Hotelling's T2, Wilks' Lambda, etc. But shape coordinates are NEVER of full
> rank and thus can never be subjected to any of these methods without prior
> variable reduction. In fact, reliable results can only be obtained if there
> are many more specimens than variables, which usually requires variable
> reduction by PCA, PLS or other techniques, or the regularization of
> covariance matrices (which is more common in the bioinformatics community).
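
As a rough illustration of variable reduction before an inversion-based
method, here is a sketch with simulated data. Keeping k = 5 principal
components is an arbitrary assumption; choosing k for real data is a
separate question.

```python
import numpy as np

rng = np.random.default_rng(0)
n_specimens, n_variables, k = 30, 40, 5
X = rng.normal(size=(n_specimens, n_variables))
Xc = X - X.mean(axis=0)

# The covariance matrix has the same rank as the centered data, which is
# at most n_specimens - 1. That is below n_variables, so the
# 40 x 40 covariance matrix is singular and cannot be inverted.
rank = np.linalg.matrix_rank(Xc)

# Variable reduction: project onto the first k principal components.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:k].T                   # shape (n_specimens, k)

# The k x k covariance of the scores is full rank and invertible.
cov_k = np.cov(scores, rowvar=False)
inv_cov_k = np.linalg.inv(cov_k)

# Squared Mahalanobis distance of each specimen from the sample mean.
d2 = np.einsum("ij,jk,ik->i", scores, inv_cov_k, scores)
print(rank, d2.shape)
```
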
> For these reasons, I do not see any disadvantage of measuring a large
> number of landmarks, except for a waste of time perhaps. If life time is an
> issue, one can optimize landmark schemes as suggested by Jim or Aki.
> MORPHMET may be accessed via its webpage at http://www.morphometrics.org
> You received this message because you are subscribed to the Google Groups
> "MORPHMET" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to morphmet+unsubscr...@morphometrics.org.