Dear Asta,
I think the problem with discriminant analysis and CVA when n is small
relative to p is not only a statistical but also a scientific one.
It is possible to use a generalized inverse, which is in principle the
same as using all principal components with non-zero eigenvalues
(although Jim's concerns are of course right). Principal coordinate
analysis is identical to PCA when the distances used are Euclidean
distances.
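
Both statements can be checked numerically. The following is only a
minimal sketch with made-up random data (NumPy assumed; the sample
size, the number of variables and all variable names are illustrative
and not from any real data set):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 20                      # fewer specimens than variables
X = rng.normal(size=(n, p))       # made-up data, for illustration only
Xc = X - X.mean(axis=0)           # center each variable

# PCA via SVD of the centered data; only n - 1 PCs have non-zero variance.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
r = n - 1
pca_scores = U[:, :r] * s[:r]
eigvals = s[:r] ** 2 / (n - 1)

# (a) The generalized (Moore-Penrose) inverse of the covariance matrix
#     equals the inverse rebuilt from the PCs with non-zero eigenvalues.
S = Xc.T @ Xc / (n - 1)
S_pinv = np.linalg.pinv(S, rcond=1e-10)
S_from_pcs = (Vt[:r].T / eigvals) @ Vt[:r]
same_inverse = np.allclose(S_pinv, S_from_pcs)

# (b) Principal coordinates from Euclidean distances: double-center the
#     squared distance matrix and take its eigenvectors.
diff = X[:, None, :] - X[None, :, :]
D2 = (diff ** 2).sum(axis=2)
J = np.eye(n) - np.ones((n, n)) / n
G = -0.5 * J @ D2 @ J
w, V = np.linalg.eigh(G)
order = np.argsort(w)[::-1][:r]   # the r largest eigenvalues
pcoa_scores = V[:, order] * np.sqrt(w[order])

# The axes are only defined up to sign, so compare absolute values.
same_ordination = np.allclose(np.abs(pca_scores), np.abs(pcoa_scores))
print(same_inverse, same_ordination)
```

The cutoff rcond=1e-10 is an arbitrary choice for deciding which
eigenvalues count as zero; a different cutoff may be needed for real
data.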
The problem with discriminant analysis and CVA is that when p is large
enough, one can separate any groups - regardless of their actual means.
CVA is thus not easily interpretable as an ordination method. These two
techniques were instead designed to predict the group identity of new
cases. But again, if p is too large, the reference sample will be
discriminated (too) well but the power to predict new cases might be
low. The model will be too specific.
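
A small simulation may make this concrete. It is a sketch under
assumptions of my own choosing: two groups drawn from the same
distribution (identical means), n = 20, p = 50, and a minimum-norm
least-squares discriminant fitted with NumPy as a simple stand-in for a
discriminant analysis with a generalized inverse. It is not CVA itself.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 50                            # reference sample: n < p
X_ref = rng.normal(size=(n, p))          # pure noise, no group difference
y_ref = np.repeat([-1.0, 1.0], n // 2)   # arbitrary group labels

# Linear discriminant with intercept, fitted by least squares on the
# labels. With more columns than rows, lstsq returns the minimum-norm
# solution, which reproduces the reference labels exactly.
A = np.column_stack([np.ones(n), X_ref])
coef, *_ = np.linalg.lstsq(A, y_ref, rcond=None)

def predict(X):
    """Assign each row of X to group -1 or +1."""
    return np.sign(np.column_stack([np.ones(len(X)), X]) @ coef)

# New cases from the same (group-free) distribution.
X_new = rng.normal(size=(2000, p))
y_new = np.repeat([-1.0, 1.0], 1000)

acc_ref = np.mean(predict(X_ref) == y_ref)   # reference sample
acc_new = np.mean(predict(X_new) == y_new)   # new cases
print(acc_ref, acc_new)
```

The reference sample is classified without error although the groups
do not differ at all, while the new cases are classified at about
chance level.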
I would suggest using only very few PCs for CVA or, better, using PCA
for ordination. If you are interested in the actual differences in
average shape among groups, compare the group mean shapes instead of
interpreting PCA or CVA coefficients.
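
As a rough sketch of both suggestions (NumPy assumed; the function name
cva_on_pcs, the choice k = 3 and the random data are hypothetical and
for illustration only), CVA on the first k PC scores and a direct
comparison of group means could look like this:

```python
import numpy as np

def cva_on_pcs(X, groups, k):
    """CVA computed on the first k PC scores of X.

    Returns the CV scores and the eigenvalues of W^-1 B in the space of
    those k PCs. Requires k <= n minus the number of groups.
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :k] * s[:k]              # first k PC scores

    W = np.zeros((k, k))                   # pooled within-group SSCP
    B = np.zeros((k, k))                   # between-group SSCP
    for g in np.unique(groups):
        Z = scores[groups == g]
        m = Z.mean(axis=0)
        W += (Z - m).T @ (Z - m)
        B += len(Z) * np.outer(m, m)       # grand mean of the scores is 0

    # Solve B a = lambda W a via the Cholesky factor of W.
    L = np.linalg.cholesky(W)
    Linv = np.linalg.inv(L)
    evals, evecs = np.linalg.eigh(Linv @ B @ Linv.T)
    order = np.argsort(evals)[::-1]
    axes = Linv.T @ evecs[:, order]
    return scores @ axes, evals[order]

# Made-up example: 3 groups of 5 specimens, 30 variables.
rng = np.random.default_rng(2)
groups = np.repeat(["A", "B", "C"], 5)
X = rng.normal(size=(15, 30))
X[groups == "B"] += 1.0                    # give group B a shifted mean

cv_scores, ratios = cva_on_pcs(X, groups, k=3)

# Difference in group means in the original variables, which can be
# inspected directly instead of interpreting the coefficients.
mean_diff = X[groups == "B"].mean(axis=0) - X[groups == "A"].mean(axis=0)
```

With three groups only the first two canonical variates carry
between-group variation, so the last eigenvalue is numerically zero.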
Best
Philipp Mitteroecker
On Thu, 10.08.2006, 19:16, morphmet wrote:
Hello,
does anybody have a good suggestion on ordination of populations when
the sample size per population is small (smaller than the number of
variables)? The data are "traditional" linear measurements.
Possibilities:
1) Normally I would conduct CVA. However, NTsys will not do it when
sample sizes are smaller than the number of variables.
Some other standard packages can overcome this problem. However, as
F.J. Rohlf pointed out to me:
"The mathematical requirement for a CVA to be
possible is for the within-groups degrees of freedom (total n
minus the number of groups) must be equal to or larger than the
number of variables. There are tricks such as using a generalized
inverse rather than an ordinary inverse but one would not want to
trust the results very much. In fact one does not trust the
results statistically unless the degrees of freedom are quite a
bit larger than the number of variables"
2) Do PCA to summarise the information in the first principal
components. Conduct CVA on the first PC scores. Minus: it is difficult
to find out which of the original variables were important. Moreover,
what if, apart from PC1 (summarising size), the other PCs have rather
similar eigenvalues, so that a lot of information is still lost by
discarding them?
3) Some other solutions: principal coordinate analysis - does it make
sense?
Thanks,
Asta
--
Replies will be sent to the list.
For more information visit http://www.morphometrics.org