I think you mean that you are writing a study design paper
presenting results of simulations and power analyses to determine
appropriate sample sizes for multivariate analyses in geometric
morphometrics. I would have thought that question was settled by
now, and that it might be more relevant for certain clustering methods.
The only parameterized PCA variant I am aware of is kernel PCA, a
nonlinear PCA method used for pattern analysis (e.g. in image
analysis), but it is not often employed in biological geometric
morphometrics papers (at least, not in those I frequently come across).
When kernels are used, they are usually meant to estimate densities of
reduced-dimensionality data such as centroid size (CS), or PCs used as
shape variables.
Justin C. Bagley, Ph.D.
Plant Evolutionary Genomics Laboratory
Department of Biology
Virginia Commonwealth University
Richmond, VA 23284-2012
Senior/Postdoctoral Research Associate
Departamento de Zoologia
Universidade de Brasília
Campus Universitário Darcy Ribeiro
70910-900 Brasília, DF, Brasil
Lattes CV: http://lattes.cnpq.br/0028570120872581
On Wed, May 31, 2017 at 6:41 PM, William Gelnaw <wgel...@gmail.com> wrote:
> I'm currently working on a paper that deals with the problem of
> over-parameterizing PCA in morphometrics. The recommendation that I'm
> making in the paper is that you should try to have at least 3 times as
> many samples as variables. That means that if you have 10 2D landmarks
> (20 coordinate variables), you should measure at least 60 specimens. Based on
> simulations, if you have fewer than 3 specimens per variable, you quickly
> start getting eigenvalues for a PCA that are very different from known true
> eigenvalues. I did a literature survey and about a quarter of
> morphometrics studies in the last decade haven't met that standard. A good
> way to test if you have enough samples is to do a jackknife analysis. If
> you cut out about 10% of your observations and still get the same
> eigenvalues, then your results are probably stable.
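A minimal sketch of the jackknife check described above, assuming NumPy and simulated data rather than real landmark coordinates. The function names (`pca_eigenvalues`, `jackknife_eigenvalues`), the parameters (`drop_fraction`, `n_reps`) and the "true" eigenvalues are all made up for illustration; this is not the procedure from the paper.

```python
import numpy as np

def pca_eigenvalues(X):
    """Eigenvalues of the sample covariance of X (rows = specimens), largest first."""
    cov = np.cov(X, rowvar=False)
    return np.linalg.eigvalsh(cov)[::-1]

def jackknife_eigenvalues(X, drop_fraction=0.10, n_reps=200, seed=0):
    """Recompute the eigenvalues after randomly dropping a fraction of specimens."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_keep = n - max(1, int(round(drop_fraction * n)))
    reps = np.empty((n_reps, X.shape[1]))
    for i in range(n_reps):
        keep = rng.choice(n, size=n_keep, replace=False)
        reps[i] = pca_eigenvalues(X[keep])
    return reps

# Simulated example: 60 specimens, 20 variables (e.g. 10 2D landmarks),
# i.e. 3 specimens per variable. The true eigenvalues are assumed values.
rng = np.random.default_rng(1)
true_eigenvalues = np.array([5.0, 3.0, 1.0, 0.5] + [0.1] * 16)
X = rng.normal(size=(60, 20)) * np.sqrt(true_eigenvalues)

full = pca_eigenvalues(X)
reps = jackknife_eigenvalues(X)

# Spread of each eigenvalue across jackknife replicates, relative to the
# full-sample estimate; small values suggest the eigenvalues are stable.
rel_spread = reps.std(axis=0) / full
print(np.round(rel_spread[:4], 3))
```

What counts as "the same eigenvalues" is a judgment call; the relative spread printed here is only one possible summary.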
> I hope this helps.
> - Will
> On Wed, May 31, 2017 at 1:31 PM, mitte...@univie.ac.at <
> mitte...@univie.ac.at> wrote:
>> Adding more (semi)landmarks inevitably increases the spatial resolution
>> and thus allows one to capture finer anatomical details - whether relevant
>> to the biological question or not. This can be advantageous for the
>> reconstruction of shapes, especially when producing 3D morphs by warping
>> dense surface representations. Basic developmental or evolutionary trends,
>> group structures, etc., often are visible in an ordination analysis with a
>> smaller set of relevant landmarks; finer anatomical resolution does not
>> necessarily affect these patterns. Adding more landmarks cannot
>> reduce or remove any signal that was found with fewer landmarks, but
>> it can make ordination analyses and the interpretation of distances and
>> angles in shape space more challenging.
>> An excess of variables (landmarks) over specimens does NOT pose problems
>> for statistical methods such as the computation of mean shapes and
>> Procrustes distances, PCA, PLS, and the multivariate regression of shape
>> coordinates on some independent variable (shape regression). These methods
>> are based on averages or regressions computed for each variable separately,
>> or on the decomposition of a covariance matrix.
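As a rough illustration of the point above, PCA can be computed by a singular value decomposition of the centered data matrix even when there are more variables than specimens. The sketch assumes NumPy and generic random numbers standing in for shape coordinates (they are not Procrustes-aligned data); the sample sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n_specimens, n_variables = 15, 40          # more variables than specimens
X = rng.normal(size=(n_specimens, n_variables))

# Center each variable, then decompose the data matrix directly.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

eigenvalues = s**2 / (n_specimens - 1)     # variances along the PCs
scores = U * s                             # PC scores, one row per specimen

# Centering costs one dimension, so at most n_specimens - 1 eigenvalues
# are nonzero, no matter how many variables were measured.
n_nonzero = int(np.sum(eigenvalues > 1e-10 * eigenvalues[0]))
print(n_nonzero)
```

No matrix inversion is involved, which is why the excess of variables is harmless here. With real Procrustes shape coordinates, further dimensions are lost to the superimposition itself.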
>> Other techniques, including Mahalanobis distance, DFA, CVA, CCA, and
>> relative eigenanalysis, require the inversion of a full-rank covariance
>> matrix, which implies an excess of specimens over variables. The same
>> applies to many multivariate parametric test statistics, such as
>> Hotelling's T^2, Wilks' Lambda, etc. But shape coordinates are NEVER of full
>> rank and thus can never be subjected to any of these methods without prior
>> variable reduction. In fact, reliable results can only be obtained if there
>> are many times more specimens than variables, which usually requires
>> variable reduction by PCA, PLS or other techniques, or the regularization
>> of covariance matrices (which is more common in the bioinformatics
>> community).
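A small sketch of the variable-reduction route mentioned above, again assuming NumPy and random numbers in place of real shape coordinates. The choice of `k = 5` retained PCs is arbitrary and only for illustration; how many PCs to keep is a separate question.

```python
import numpy as np

rng = np.random.default_rng(0)
n_specimens, n_variables, k = 30, 40, 5    # k retained PCs is an assumption
X = rng.normal(size=(n_specimens, n_variables))
Xc = X - X.mean(axis=0)

# The full covariance matrix is rank deficient, so it cannot be inverted.
cov = np.cov(Xc, rowvar=False)
rank = np.linalg.matrix_rank(cov)
print(rank, "<", n_variables)

# Reduce to the first k PC scores; their covariance matrix is k x k and
# of full rank as long as k is well below the number of specimens.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = (U * s)[:, :k]
cov_k = np.cov(scores, rowvar=False)
inv_cov_k = np.linalg.inv(cov_k)

# Squared Mahalanobis distance of each specimen from the sample mean,
# computed in the reduced space.
d2 = np.einsum("ij,jk,ik->i", scores, inv_cov_k, scores)
```

The same reduced scores could then be passed to DFA, CVA or similar methods; regularizing the covariance matrix would be the alternative and is not shown here.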
>> For these reasons, I do not see any disadvantage of measuring a large
>> number of landmarks, except for a waste of time perhaps. If life time is an
>> issue, one can optimize landmark schemes as suggested by Jim or Aki.
>> MORPHMET may be accessed via its webpage at http://www.morphometrics.org
>> You received this message because you are subscribed to the Google Groups
>> "MORPHMET" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to morphmet+unsubscr...@morphometrics.org.