-------- Original Message --------
Subject: RE: procrustes variances
Date: Tue, 1 Sep 2009 11:32:37 -0700 (PDT)
From: F. James Rohlf <[email protected]>
Reply-To: <[email protected]>
Organization: Stony Brook University
To: <[email protected]>
References: <[email protected]>

For most multivariate analyses one views the GPA step as a preliminary mathematical transformation from the curved Kendall shape space to the linear tangent space required for the application of standard linear multivariate methods. Once the data are projected into the tangent space, one then applies any appropriate multivariate methods whether they involve resampling or not.

If different datasets are completely separate and comparisons are only made within datasets then it makes sense to perform the GPA separately. However, if the analysis involves comparisons between different samples then you would want to perform a single overall GPA so that all the data are in the same multivariate space.

If there is very little variation in shape then you might not notice much difference. However, if shape variation is rather large (as may be the case in the simulations) then those parts of Kendall's shape space further from the mean shape do get stretched when the tangent space is constructed. That would give them larger within-sample variances. That is probably the bias that you have detected. When shape variation is not small you may be forced to perform the statistical comparisons on the curved shape space using methods that are not yet as fully developed.

=========================
F. James Rohlf
Distinguished Professor, Stony Brook University
http://life.bio.sunysb.edu/ee/rohlf
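The two procedures compared in this thread (within-sample Procrustes variance after one combined GPA versus after a separate GPA per sample) can be sketched in Python with NumPy. This is only a minimal illustration under stated assumptions, not the code used by any of the posters: the function names (center_and_scale, rotate_onto, gpa, procrustes_variance), the landmark count, sample sizes, noise levels and random seed are all hypothetical, the noise is isotropic, and the tangent-space projection step is omitted, so the variances are computed directly on the aligned coordinates.

```python
import numpy as np

def center_and_scale(x):
    """Center a (landmarks x 2) configuration and scale to unit centroid size."""
    x = x - x.mean(axis=0)
    return x / np.linalg.norm(x)

def rotate_onto(x, ref):
    """Rotate x (no reflection) to minimize the squared distance to ref."""
    u, _, vt = np.linalg.svd(x.T @ ref)
    r = u @ vt
    if np.linalg.det(r) < 0:      # force a proper rotation
        u[:, -1] *= -1
        r = u @ vt
    return x @ r

def gpa(shapes, n_iter=10):
    """Simple iterative generalized Procrustes alignment (unit centroid size)."""
    aligned = np.array([center_and_scale(s) for s in shapes])
    mean = aligned[0]
    for _ in range(n_iter):
        aligned = np.array([rotate_onto(s, mean) for s in aligned])
        mean = center_and_scale(aligned.mean(axis=0))
    return aligned

def procrustes_variance(aligned):
    """Sum of squared deviations from the sample mean shape, divided by n - 1."""
    dev = aligned - aligned.mean(axis=0)
    return (dev ** 2).sum() / (len(aligned) - 1)

# Hypothetical simulation: two samples with different mean shapes and variances.
rng = np.random.default_rng(0)
k, n = 12, 40                                   # landmarks, specimens per sample
base = rng.normal(size=(k, 2))
shift = 0.15 * rng.normal(size=(k, 2))          # difference in mean shape
sample_a = base + 0.05 * rng.normal(size=(n, k, 2))
sample_b = base + shift + 0.10 * rng.normal(size=(n, k, 2))

# One overall GPA, then the variance within each sample.
combined = gpa(np.concatenate([sample_a, sample_b]))
var_combined = (procrustes_variance(combined[:n]),
                procrustes_variance(combined[n:]))

# A separate GPA for each sample.
var_separate = (procrustes_variance(gpa(sample_a)),
                procrustes_variance(gpa(sample_b)))

print("combined GPA:", var_combined)
print("separate GPA:", var_separate)
```

Changing the sample sizes, the size of shift, or the noise levels gives a rough way to explore how the two estimates diverge. Because the sketch skips the tangent projection, it should not be read as a reproduction of the results reported in the thread.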

-----Original Message-----
From: morphmet [mailto:[email protected]]
Sent: Tuesday, September 01, 2009 12:20 PM
To: morphmet
Subject: Re: procrustes variances

-------- Original Message --------
Subject: Re: procrustes variances
Date: Tue, 1 Sep 2009 08:41:20 -0700 (PDT)
From: Haber Annat <[email protected]>
To: <[email protected]>

Hello,

Do you (or anyone else) mind explaining this in more detail? On the face of it I would think that it makes more sense to superimpose each sample separately, for exactly the reason that Louis gave. This is assuming: 1) the samples are based on exactly the same set of landmarks; 2) you're only comparing the total variance within each sample and not around each landmark, and not anything else; 3) none of the samples include any of the other samples (and maybe other assumptions I'm forgetting right now).

If that is wrong to do then what about bootstrapping? After all, as far as I understand, whenever you bootstrap a sample in order, for example, to get a confidence interval for something, you re-superimpose every bootstrapped sample but you don't superimpose all 500 or so bootstrapped samples simultaneously, do you? Then if you pile up all these bootstrapped values into one distribution, that implies that they are comparable, doesn't it? Even though they're each based on a somewhat different space.

I tried to test it using simulations as well as by looking into my own data. I generated two samples with somewhat different mean shapes and levels of variance (based on a non-isotropic covariance matrix), but the same sample size, and compared the variance within each sample when they are superimposed together as opposed to when they are superimposed separately (repeated 100 times). With the real data I simply took the same number of males and females from each of 9 species (ruminants, what else) where I know there is significant sexual dimorphism.

With the simulations, the variances based on the combined GPA correlate very tightly (r > 0.92) with the ones based on separate GPA but are consistently slightly overestimated for both samples in each pair. The higher the variance, the more they are overestimated relative to their variance when superimposed alone. Also, the larger the difference in mean shape, the bigger the bias. However, when one of the samples is five times bigger, the bigger sample's variances correlate perfectly while the smaller sample's variances are vastly overestimated and correlate at only about 0.8. With the empirical data I get exactly the same variances for each of the samples either way I calculate them, but that may be because the mean shapes are not that different after all.

So all in all, I can see that the space they're in makes a difference, but I still don't see why it wouldn't be better to compare their within-sample variances based on a separate superimposition for each sample, especially since there is a consistent bias (although these brief simulations obviously didn't cover all possible scenarios). And what does that mean for bootstrapping procedures?

Thanks
Annat

> From: morphmet <[email protected]>
> Reply-To: <[email protected]>
> Date: Fri, 28 Aug 2009 07:42:29 -0400
> To: morphmet <[email protected]>
> Subject: RE: procrustes variances
> Resent-From: <[email protected]>
> Resent-Date: Fri, 28 Aug 2009 04:45:01 -0700 (PDT)
>
> -------- Original Message --------
> Subject: RE: procrustes variances
> Date: Thu, 27 Aug 2009 19:08:25 -0700 (PDT)
> From: F. James Rohlf <[email protected]>
> Reply-To: [email protected]
> Organization: Stony Brook University
> To: [email protected]
> References: <[email protected]>
>
> You should only compare them when computed after a combined GPA.
> Otherwise you are comparing quantities computed in different spaces.
>
> ------------------------
> F. James Rohlf, Distinguished Professor
> Ecology & Evolution, Stony Brook University
> www: http://life.bio.sunysb.edu/ee/rohlf
>
>> -----Original Message-----
>> From: morphmet [mailto:[email protected]]
>> Sent: Thursday, August 27, 2009 11:32 AM
>> To: morphmet
>> Subject: procrustes variances
>>
>> -------- Original Message --------
>> Subject: procrustes variances
>> Date: Thu, 27 Aug 2009 07:41:39 -0700 (PDT)
>> From: Louis Boell <[email protected]>
>> To: <[email protected]>
>>
>> Dear colleagues,
>>
>> I have a question about Procrustes variance. I want to compare the
>> shape variances of different samples. I have three groups of 77, 96
>> and 17 specimens, respectively. I calculated the Procrustes variance
>> of each group in two ways: 1) after pooling the raw data of all three
>> groups into a common total dataset and fitting them together; 2) after
>> calculating the Procrustes fit for each dataset/group separately.
>>
>> The results for the two large samples are quite consistent between both
>> procedures; however, the estimate of the Procrustes variance for the
>> 17-specimen sample is much larger when fitted together with the other
>> two samples than when fitted separately.
>>
>> I assume that this is because the Procrustes fit is a "democratic"
>> procedure, which is much more influenced by large samples than by small
>> samples when they are fitted together. This could potentially result in
>> a "spreading" of the specimens from the small sample in the space of
>> the Procrustes coordinates, if their covariance pattern is different
>> from the mean covariance pattern of the total dataset, which will be
>> largely determined by the large samples.
>>
>> Altogether, my question amounts to whether it is more appropriate to
>> compare Procrustes variances from separate Procrustes fits or from a
>> Procrustes fit of the pooled total dataset.
>>
>> Thanks in advance for help
>>
>> Louis Boell
>>
>> Louis Boell
>> MPI für Evolutionsbiologie
>> August-Thienemannstr.2
>> 24306 Plön
>> [email protected]
>> [email protected]
>>
>> --
>> Replies will be sent to the list.
>> For more information visit http://www.morphometrics.org
