Re: CVA limitations?

morphmet Fri, 03 Apr 2009 06:31:49 -0700


-------- Original Message --------
Subject: Re: CVA limitations?
Date: Thu, 2 Apr 2009 10:24:06 -0700 (PDT)
From: F. James Rohlf <[email protected]>
Reply-To: [email protected]
To: Morphmet <[email protected]>
References: <[email protected]>

My earlier comment about the minimum sample size was to provide a mathematical minimum that permits computing the inverse of the pooled within covariance matrix. However, a reasonable statistical minimum is likely to be much larger in order to get stable results.

One cannot just specify some absolute number as being reasonable because it depends on just how different the groups are. It is useful to do a pilot study to get an estimate of the differences and then plan a more definitive study for the actual tests.

Sample sizes also depend on whether you just want to demonstrate that there is a difference or if you wish to describe the nature of the difference (which requires a larger n).

--------------------------------
Sent remotely by F. James Rohlf

-----Original Message-----
From: morphmet <[email protected]>

Date: Thu, 02 Apr 2009 12:59:17
To: morphmet<[email protected]>
Subject: Re: CVA limitations?

-------- Original Message --------
Subject:        Re: CVA limitations?
Date:   Thu, 2 Apr 2009 09:56:01 -0700 (PDT)
From:   J. Willacker <[email protected]>
To:     [email protected]
References:     <[email protected]>

Thanks everyone for your replies.

My landmark suite includes 20 points, therefore I have 40 variables.  I
have been doing a minimum of 40 specimens from each population, but if I
understand correctly I should consider doing more.  How do I know what
my goal within population N should be?  I have more than enough fish
from each population (for most I have 500+ fish) but with this many
populations time becomes an issue when my Ns get too high.  A colleague is
using sample sizes as low as 20 for similar analyses (with the same 20
landmarks), but that doesn't seem valid.

Really, I am very new to these types of analyses and have some trouble
understanding how they do what they do.  I realize that no matter how
many fish I include, the CVA could not possibly separate ALL
populations.  Ultimately, my goal is to identify populations with unique
head morphologies (very "benthic" or very "limnetic") for use in my
studies of trophic morphology/ecology.  Given my purpose, is there a
different analysis that would be more appropriate?

On Thu, Apr 2, 2009 at 4:27 AM, morphmet
<[email protected]
<mailto:[email protected]>> wrote:

     -------- Original Message --------
     Subject: Re: CVA limitations?
     Date: Thu, 2 Apr 2009 05:25:21 -0700 (PDT)
     From: andrea cardini <[email protected]
     <mailto:[email protected]>>
     To: [email protected] <mailto:[email protected]>

     Just a quick comment about Philipp's points.

     The rule of thumb suggested by textbooks is more restrictive than Jim's
     "minimum sample size" as it requires N of the smallest group to be
     larger than the number of variables. So whenever the rule of thumb is
     satisfied, the minimum requirement mentioned by Jim is met as well.
     Also, it's true that "CVA will always separate groups even if they
     share the same mean configuration" but in that case the
     cross-validation will likely produce hit-ratios which are no better
     than chance (that was about my last point in the previous message).

     Unfortunately most of the time people doing taxonomy like myself are
     in the situation exemplified by Paul's case (below) where N is very
     unequal across groups and there are at least a few groups with N
     much smaller than the number of variables. Then, one may or may not
     be computationally able to do a DA/CVA but assumptions are unlikely
     to be met, hard to verify and (even more concerning) sampling error
     may lead to inaccurate estimates of means, variances etc.
     Resampling statistics may help but won't do anything about the
     accuracy of estimates and one can only acknowledge that results (at
     least those concerning smallest samples) will have to be verified on
     larger samples.

     I'd also like to point out that besides sample size, one should
     carefully consider provenance of specimens and maybe also time of specimen
     collection. A small sample of individuals collected at the same time
     and in the same locality could make things even worse if one is
     interested in estimating means and their variation in the whole
     population. Again, this is not uncommon for rare species/subspecies
     from museum collections.

     Cheers

     Andrea




     At 07:43 02/04/2009 -0400, you wrote:



         -------- Original Message --------
         Subject: Re: CVA limitations?
         Date: Thu, 2 Apr 2009 04:26:50 -0700 (PDT)
         From: Paul Van Daele <[email protected]
         <mailto:[email protected]>>
         To: <[email protected]
<mailto:[email protected]>>
         References: <[email protected]
         <mailto:[email protected]>>

         what if total sample size is larger than the number of variables
         but some groups have a lower sample size than the number of
         variables?

         Say e.g. you have 26 variables and three groups with, respectively,
         40, 15 and 5 specimens
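
To make the arithmetic in this example concrete, here is a minimal Python sketch. The function names are hypothetical and only for illustration. It uses the formulation quoted elsewhere in this thread (the degrees of freedom of the pooled within-group covariance matrix, N minus the number of groups, should be greater than the number of variables) and separately lists the groups that fail the textbook rule of thumb (more specimens than variables within each group). It assumes the 26 variables are themselves linearly independent.

```python
def pooled_within_df(group_sizes):
    """Degrees of freedom of the pooled within-group covariance matrix (N - g)."""
    return sum(group_sizes) - len(group_sizes)


def check_design(group_sizes, n_variables):
    """Compare a sampling design against the two rules discussed in the thread."""
    df = pooled_within_df(group_sizes)
    return {
        "pooled_df": df,
        # Computational minimum: df must be greater than the number of variables.
        "meets_computational_minimum": df > n_variables,
        # Textbook rule of thumb: each group should have more specimens than variables.
        "groups_failing_rule_of_thumb": [n for n in group_sizes if n <= n_variables],
    }


result = check_design([40, 15, 5], n_variables=26)
print(result)
```

Under these assumptions the pooled matrix can be inverted (60 - 3 = 57 degrees of freedom against 26 variables), but the groups of 15 and 5 fall short of the rule of thumb, so a computable analysis is not necessarily a reliable one.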


         Paul Van Daele
         Ghent University
         Evolutionary Morphology of Vertebrates
         KL Ledeganckstraat 35
         B-9000 Gent
         Belgium
         [email protected] <mailto:[email protected]>
         Tel +32 92645233
         Fax +32 92645344

         Do not go gentle into that good night (D. Thomas)
         ----- Original Message -----
         From: "morphmet" <[email protected]
         <mailto:[email protected]>>
         To: "morphmet" <[email protected]
         <mailto:[email protected]>>
         Sent: Thursday, April 02, 2009 1:08 PM
         Subject: Re: CVA limitations?




             -------- Original Message --------
             Subject: Re: CVA limitations?
             Date: Wed, 1 Apr 2009 16:04:19 -0700 (PDT)
             From: Philipp Mitteröcker <[email protected]
             <mailto:[email protected]>>
             To: [email protected]
             <mailto:[email protected]>
             References: <[email protected]
             <mailto:[email protected]>>

             Actually, the "rule of thumb" is a computational necessity. More
             correct is Jim's formulation that the "degrees of freedom of the
             within-group covariance matrix to be greater than the number of
             variables". Otherwise you cannot invert the covariance matrix and
             hence cannot compute the CVA. But sample size should be much
             larger than the number of variables in order to produce
             interpretable results. If the sample size is close to the number
             of variables, CVA will always separate groups even if they share
             the same mean configuration.
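
The point about spurious separation can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses NumPy, the function names and the chosen sizes (3 groups, 20 variables, 10 versus 200 specimens per group) are hypothetical, and every group is drawn from the same standard normal distribution, so there are no true differences in means. It reports the largest eigenvalue of W^-1 B, i.e. the ratio of between-group to within-group scatter along the first canonical variate.

```python
import numpy as np


def largest_cv_ratio(X, labels):
    """Largest eigenvalue of W^-1 B: between/within scatter along CV1."""
    n_vars = X.shape[1]
    grand_mean = X.mean(axis=0)
    W = np.zeros((n_vars, n_vars))  # pooled within-group scatter (SSCP)
    B = np.zeros((n_vars, n_vars))  # between-group scatter (SSCP)
    for g in np.unique(labels):
        Xg = X[labels == g]
        mean_g = Xg.mean(axis=0)
        centered = Xg - mean_g
        W += centered.T @ centered
        d = (mean_g - grand_mean)[:, None]
        B += len(Xg) * (d @ d.T)
    # solve(W, B) gives W^-1 B without forming the inverse explicitly
    eigvals = np.linalg.eigvals(np.linalg.solve(W, B))
    return float(np.max(eigvals.real))


def simulate(n_per_group, n_groups=3, n_vars=20, seed=0):
    """All groups share the same mean; any 'separation' is sampling noise."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_per_group * n_groups, n_vars))
    labels = np.repeat(np.arange(n_groups), n_per_group)
    return largest_cv_ratio(X, labels)


ratio_small = simulate(n_per_group=10)   # N = 30, close to the 20 variables
ratio_large = simulate(n_per_group=200)  # N = 600, far above the 20 variables
print(f"CV1 between/within ratio, n=10 per group:  {ratio_small:.3f}")
print(f"CV1 between/within ratio, n=200 per group: {ratio_large:.3f}")
```

With the small samples the ratio should come out much larger than with the large samples, even though the groups are identical by construction, which is the artifact described above.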

             But for 65 populations no low-dimensional representation will be
             sufficient to distinguish between ALL groups. Furthermore, CVA
             assumes equal covariance matrices for all groups, which seems
             unlikely for so many populations. If the covariance structures
             vary considerably, a pooled estimate may be close to a spherical
             distribution and the resulting CVA would be very similar to a
             principal component analysis (PCA). I would thus suggest
             proceeding with a PCA, also because there are no restrictions on
             sample size and statistical artifacts are less likely.

             I hope this helps,

             Philipp




             Am 01.04.2009 um 19:33 schrieb morphmet:



                 -------- Original Message --------
                 Subject: Re: CVA limitations?
                 Date: Wed, 1 Apr 2009 09:15:46 -0700 (PDT)
                 From: andrea cardini <[email protected]
                 <mailto:[email protected]>>
                 To: [email protected]
                 <mailto:[email protected]>

                 Dear James,
                 on a similar issue there was an exchange of emails in MORPHMET
                 some time ago (February, I think) and a few more emails which
                 were not sent to the list. Jim Rohlf suggested summarizing the
                 main points in an email to MORPHMET and I agree with him that
                 it's a very good idea. Unfortunately I am too busy right now
                 for this but hope to do it sooner or later.

                 Just a couple of quick comments (which greatly
                 oversimplify the problem).
                 First of all, take a look at the assumptions of DA/CVA. With
                 many groups and small samples they're often difficult to test.
                 Second point, from a message that Jim Rohlf sent a couple of
                 years ago:
                 "... in order to use methods that look at difference among
                 groups relative to within-group variability one needs the
                 degrees of freedom of the within-group covariance matrix to
                 be greater than the number of variables. With fewer
                 observations the within-group covariance matrix will be
                 singular. This rule gives a minimum sample size but for
                 reliable results the sample size should, of course, be much
                 larger". To have more reliable results, there's a rule of
                 thumb which is suggested in many textbooks (and I am not sure
                 if it is actually supported by studies): this is that within
                 each group you should have more specimens than variables.
                 Last comment, if you really want to do a DA/CVA when N is not
                 very large, I'd carefully check if results are stable when you
                 exclude small groups and I'd always cross-validate all
                 analyses. If you find that despite significance,
                 cross-validated hit ratios (i.e., percentages of specimens
                 correctly classified according to groups) are low, I'd be very
                 cautious about what those differences really mean (if they do
                 mean anything at all).
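
As an illustration of the cross-validation advice, here is a minimal leave-one-out sketch in Python with NumPy. It is not the procedure of any particular morphometrics package: the function names are hypothetical, the classifier simply assigns each specimen to the group mean with the smallest Mahalanobis distance under the pooled within-group covariance matrix, and the data are simulated with identical group means (3 groups, 12 specimens each, 20 variables are arbitrary choices).

```python
import numpy as np


def classify(train_X, train_y, test_X):
    """Assign each test row to the nearest group mean (Mahalanobis distance,
    pooled within-group covariance estimated from the training data)."""
    groups = np.unique(train_y)
    means = np.array([train_X[train_y == g].mean(axis=0) for g in groups])
    resid = train_X - means[np.searchsorted(groups, train_y)]
    pooled = resid.T @ resid / (len(train_X) - len(groups))
    inv = np.linalg.inv(pooled)
    diffs = test_X[:, None, :] - means[None, :, :]  # (specimen, group, variable)
    d2 = np.einsum("mgp,pq,mgq->mg", diffs, inv, diffs)
    return groups[np.argmin(d2, axis=1)]


def hit_ratios(X, y):
    """Return (resubstitution hit ratio, leave-one-out hit ratio)."""
    resub = float(np.mean(classify(X, y, X) == y))
    keep = np.ones(len(X), dtype=bool)
    hits = 0
    for i in range(len(X)):
        keep[i] = False  # hold out specimen i
        pred = classify(X[keep], y[keep], X[i:i + 1])
        hits += int(pred[0] == y[i])
        keep[i] = True
    return resub, hits / len(X)


# Simulated data: every group has the same mean, so chance level is 1/3.
rng = np.random.default_rng(1)
n_groups, n_per_group, n_vars = 3, 12, 20
X = rng.standard_normal((n_groups * n_per_group, n_vars))
y = np.repeat(np.arange(n_groups), n_per_group)

resub, loo = hit_ratios(X, y)
print(f"resubstitution hit ratio: {resub:.2f}")
print(f"leave-one-out hit ratio:  {loo:.2f}")
```

When the sample is small relative to the number of variables, the resubstitution figure tends to look impressive while the leave-one-out figure stays around chance, which is why the cross-validated hit ratio is the one to report.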

                 There are plenty of references on this stuff. An old one which
                 I greatly like is Neff & Marcus' chapter on DA/CVA in their
                 book on "Multivariate Methods for Systematics" (1980).

                 Good luck with your research.
                 Cheers

                 Andrea

                 At 09:01 01/04/2009 -0400, you wrote:



                     -------- Original Message --------
                     Subject: CVA limitations?
                     Date: Tue, 31 Mar 2009 18:20:40 -0700 (PDT)
                     From: J. Willacker <[email protected]
                     <mailto:[email protected]>>
                     To: Morphmet <[email protected]
                     <mailto:[email protected]>>



                     Hi,

                     I was wondering if there are any limits to the number of
                     groups that can be distinguished with CVA?  I'm comparing
                     facial morphology in 65 populations of threespine
                     stickleback fish, but don't know if CVA is valid with so
                     many groups.  Is there a relation between number of
                     specimens per group and how many groups can be compared?
                     At some point does the power of the analysis suffer?
                     Really need help with this since nobody in our stats
                     department seems to know the answer.  Feel free to
                     respond to [email protected]

                     Thanks, James

                     --
                     Replies will be sent to the list.
                     For more information visit
                     http://www.morphometrics.org
                     <http://www.morphometrics.org/>











             ____________________________________

             Dr. Philipp Mitteröcker

             Department of Theoretical Biology
             University of Vienna
             Althanstrasse 14
             A-1090 Vienna, Austria

             Tel: +43 1 4277 56705
             Fax: +43 1 4277 9544
             [email protected]
             <mailto:[email protected]>
             www.virtual-anthropology.com/Members/philippm
             <http://www.virtual-anthropology.com/Members/philippm>





















     Dr. Andrea Cardini

     Lecturer in Animal Biology
     Museo di Paleobiologia e dell'Orto Botanico, Università di Modena e
     Reggio Emilia
     via Università 4, 41100, Modena, Italy
     tel: 0039 059 2056532; fax: 0039 059 2056535

     Honorary Fellow
     Functional Morphology and Evolution Unit, Hull York Medical School
     University of Hull, Cottingham Road, Hull, HU6 7RX, UK
     University of York, Heslington, York YO10 5DD, UK

     E-mail address: [email protected]
     <mailto:[email protected]>, [email protected]
     <mailto:[email protected]>,
     [email protected] <mailto:[email protected]>
     http://hyms.fme.googlepages.com/drandreacardini

http://ads.ahds.ac.uk/catalogue/archive/cerco_lt_2007/overview.cfm#metadata

     More on publications at:
     http://www.cons-dev.org/marm/MARM/EMARM/framarm/framarm.html
     CLICK ON THE LETTER C AND LOOK FOR "CARDINI" (p. 8-9 until March 2009)
     http://hyms.fme.googlepages.com/dr.sarahelton-publications
     LOOK FOR "CARDINI"













