-------- Original Message --------
Subject: RE: procrustes variances
Date: Tue, 1 Sep 2009 11:32:37 -0700 (PDT)
From: F. James Rohlf <[email protected]>
Reply-To: <[email protected]>
Organization: Stony Brook University
To: <[email protected]>
References: <[email protected]>

For most multivariate analyses one views the GPA step as a preliminary
mathematical transformation from the curved Kendall shape space to the
linear tangent space required for the application of standard linear
multivariate methods. Once the data are projected into the tangent
space, one then applies any appropriate multivariate methods whether
they involve resampling or not.

If different datasets are completely separate and comparisons are only
made within datasets then it makes sense to perform the GPA
separately. However, if the analysis involves comparisons between
different samples then you would want to perform a single overall GPA
so that all the data are in the same multivariate space.

If there is very little variation in shape then you might not notice
much difference. However, if shape variation is rather large (as may
be the case in the simulations) then those parts of Kendall's shape
space further from the mean shape do get stretched when the tangent
space is constructed. That would give them larger within-sample
variances. That is probably the bias that you have detected. When
shape variation is not small you may be forced to perform the
statistical comparisons on the curved shape space using methods that
are not yet as fully developed.
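
Illustrative sketch (not part of the original exchange): the two
procedures under discussion, a single combined GPA versus a separate GPA
per sample, can be compared on simulated data roughly as follows. This
assumes 2D landmarks, partial Procrustes superimposition (unit centroid
size, rotation only), and no explicit tangent-space projection. The
function names, the simulated landmark data, and every parameter value
are hypothetical; only the group sizes 77 and 17 echo the original
question, and nothing here comes from a particular morphometrics package.

```python
import numpy as np

def center_scale(X):
    """Center a (k, m) configuration and scale it to unit centroid size."""
    X = X - X.mean(axis=0)
    return X / np.linalg.norm(X)

def rotate_onto(X, ref):
    """Least-squares rotation of X onto ref (reflections excluded)."""
    U, _, Vt = np.linalg.svd(X.T @ ref)
    if np.linalg.det(U @ Vt) < 0:
        U[:, -1] *= -1  # flip one axis so the result is a pure rotation
    return X @ (U @ Vt)

def gpa(shapes, n_iter=100, tol=1e-12):
    """Partial generalized Procrustes analysis of an (n, k, m) array."""
    aligned = np.array([center_scale(s) for s in shapes])
    mean = aligned[0]
    for _ in range(n_iter):
        aligned = np.array([rotate_onto(s, mean) for s in aligned])
        new_mean = center_scale(aligned.mean(axis=0))
        if np.linalg.norm(new_mean - mean) < tol:
            break
        mean = new_mean
    return aligned

def procrustes_variance(aligned):
    """Sum of squared deviations from the sample mean shape over n - 1."""
    dev = aligned - aligned.mean(axis=0)
    return (dev ** 2).sum() / (len(aligned) - 1)

# Simulated data (assumption): two groups with different mean shapes.
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 2))
shift = rng.normal(scale=0.3, size=(8, 2))
a = base + rng.normal(scale=0.05, size=(77, 8, 2))
b = base + shift + rng.normal(scale=0.05, size=(17, 8, 2))

# One combined GPA, then within-sample variances in the common space.
joint = gpa(np.concatenate([a, b]))
var_joint = (procrustes_variance(joint[:77]), procrustes_variance(joint[77:]))

# Separate GPA per sample.
var_sep = (procrustes_variance(gpa(a)), procrustes_variance(gpa(b)))

print("combined GPA:", var_joint)
print("separate GPA:", var_sep)
```

Printing the two pairs side by side shows how much the choice of
superimposition changes each within-sample estimate for a given dataset.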

=========================
F. James Rohlf
Distinguished Professor, Stony Brook University
http://life.bio.sunysb.edu/ee/rohlf

-----Original Message-----
From: morphmet [mailto:[email protected]]
Sent: Tuesday, September 01, 2009 12:20 PM
To: morphmet
Subject: Re: procrustes variances



-------- Original Message --------
Subject: Re: procrustes variances
Date: Tue, 1 Sep 2009 08:41:20 -0700 (PDT)
From: Haber Annat <[email protected]>
To: <[email protected]>

Hello,

Would you (or anyone else) mind explaining this in more detail? On the
face of it, I would think it makes more sense to superimpose each sample
separately, for exactly the reason that Louis gave. This assumes that:
1) the samples are based on exactly the same set of landmarks; 2) you
are comparing only the total variance within each sample, not the
variance around each landmark or anything else; 3) none of the samples
includes any of the other samples (and maybe other assumptions I'm
forgetting right now).

If that is wrong, then what about bootstrapping? As far as I
understand, whenever you bootstrap a sample in order to get, for
example, a confidence interval for something, you re-superimpose every
bootstrapped sample, but you don't superimpose all 500 or so
bootstrapped samples simultaneously, do you? Then, if you pile up all
these bootstrapped values into one distribution, that implies they are
comparable, doesn't it? Even though each is based on a somewhat
different space.

I tried to test this using simulations as well as my own data. For the
simulations, I generated two samples with somewhat different mean
shapes and levels of variance (based on a non-isotropic covariance
matrix) but the same sample size, and compared the variance within each
sample when the two are superimposed together with the variance when
they are superimposed separately (repeated 100 times). For the real
data, I simply took the same number of males and females from each of 9
species (ruminants, what else) in which I know there is significant
sexual dimorphism.

With the simulations, the variances based on the combined GPA correlate
very tightly (>0.92) with those based on separate GPAs but are
consistently slightly overestimated for both samples in each pair. The
higher the variance, the more it is overestimated relative to the
variance obtained when the sample is superimposed alone. Also, the
greater the difference in mean shape, the bigger the bias. However,
when one of the samples is five times bigger, the bigger sample's
variances correlate perfectly, while the smaller sample's variances are
vastly overestimated and correlate at ~0.8.

With the empirical data I get exactly the same variances for each of
the samples either way I calculate it, but that may be because the mean
shape is not that different after all.

So all in all, I can see that the space they're in makes a difference,
but I still don't see why it wouldn't be better to compare their
within-sample variance based on separate superimposition for each
sample, especially since there is a consistent bias (although these
brief simulations obviously didn't cover all possible scenarios). And
what does that mean for bootstrapping procedures?
Thanks
Annat
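
Illustrative sketch (not part of the original exchange): the bootstrap
procedure described above, in which every resampled dataset is
re-superimposed before the statistic is computed, might look roughly
like this. It assumes 2D landmarks and a partial Procrustes fit (unit
centroid size, rotation only, no tangent-space projection). The names
`gpa`, `procrustes_variance`, and `bootstrap_variance`, the simulated
sample, and the replicate count are all hypothetical.

```python
import numpy as np

def gpa(shapes, n_iter=100, tol=1e-12):
    """Compact partial GPA of an (n, k, m) array of landmark configurations."""
    X = shapes - shapes.mean(axis=1, keepdims=True)          # center
    X = X / np.linalg.norm(X, axis=(1, 2), keepdims=True)    # unit centroid size
    mean = X[0]
    for _ in range(n_iter):
        # Batched least-squares rotations of every configuration onto the mean.
        U, _, Vt = np.linalg.svd(np.swapaxes(X, 1, 2) @ mean)
        d = np.sign(np.linalg.det(U @ Vt))
        U[:, :, -1] *= d[:, None]                            # exclude reflections
        X = X @ (U @ Vt)
        new_mean = X.mean(axis=0)
        new_mean /= np.linalg.norm(new_mean)
        if np.linalg.norm(new_mean - mean) < tol:
            break
        mean = new_mean
    return X

def procrustes_variance(aligned):
    """Sum of squared deviations from the sample mean shape over n - 1."""
    dev = aligned - aligned.mean(axis=0)
    return (dev ** 2).sum() / (len(aligned) - 1)

def bootstrap_variance(shapes, n_boot=500, seed=0):
    """Resample specimens with replacement; re-superimpose each replicate."""
    rng = np.random.default_rng(seed)
    n = len(shapes)
    return np.array([
        procrustes_variance(gpa(shapes[rng.integers(0, n, size=n)]))
        for _ in range(n_boot)
    ])

# Simulated sample (assumption): 30 specimens, 8 landmarks in 2D.
rng = np.random.default_rng(1)
base = rng.normal(size=(8, 2))
sample = base + rng.normal(scale=0.05, size=(30, 8, 2))

boot = bootstrap_variance(sample, n_boot=200)
lo, hi = np.percentile(boot, [2.5, 97.5])  # percentile interval
print("bootstrap 95% percentile interval:", lo, hi)
```

Each replicate here gets its own consensus shape, which is exactly the
situation the question is about: the pooled bootstrap values come from
slightly different superimpositions.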


> From: morphmet <[email protected]>
> Reply-To: <[email protected]>
> Date: Fri, 28 Aug 2009 07:42:29 -0400
> To: morphmet <[email protected]>
> Subject: RE: procrustes variances
> Resent-From: <[email protected]>
> Resent-Date: Fri, 28 Aug 2009 04:45:01 -0700 (PDT)
>
>
>
> -------- Original Message --------
> Subject: RE: procrustes variances
> Date: Thu, 27 Aug 2009 19:08:25 -0700 (PDT)
> From: F. James Rohlf <[email protected]>
> Reply-To: [email protected]
> Organization: Stony Brook University
> To: [email protected]
> References: <[email protected]>
>
> You should only compare them when computed after a combined GPA.
> Otherwise you are comparing quantities computed in different spaces.
>
> ------------------------
> F. James Rohlf, Distinguished Professor
> Ecology & Evolution, Stony Brook University
> www: http://life.bio.sunysb.edu/ee/rohlf
>
>> -----Original Message-----
>> From: morphmet [mailto:[email protected]]
>> Sent: Thursday, August 27, 2009 11:32 AM
>> To: morphmet
>> Subject: procrustes variances
>>
>>
>>
>> -------- Original Message --------
>> Subject:  procrustes variances
>> Date:  Thu, 27 Aug 2009 07:41:39 -0700 (PDT)
>> From:  Louis Boell <[email protected]>
>> To:  <[email protected]>
>>
>>
>>
>> Dear colleagues,
>>
>> I have a question about Procrustes variance. I want to compare the
>> shape variances of different samples. I have three groups of 77, 96
>> and 17 specimens, respectively. I calculated the Procrustes variance
>> of each group in two ways: 1) after pooling the raw data of all three
>> groups into a common total dataset and fitting them together; 2) after
>> calculating the Procrustes fit for each dataset/group separately.
>>
>> The results for the two large samples are quite consistent between
>> the two procedures; however, the estimate of the Procrustes variance
>> for the 17-specimen sample is much larger when it is fitted together
>> with the other two samples than when it is fitted separately.
>>
>> I assume that this is because the Procrustes fit is a "democratic"
>> procedure, which is influenced much more by large samples than by
>> small samples when they are fitted together. This could potentially
>> result in a "spreading" of the specimens from the small sample in the
>> space of the Procrustes coordinates if their covariance pattern
>> differs from the mean covariance pattern of the total dataset, which
>> will be largely determined by the large samples.
>>
>> Altogether, my question amounts to whether it is more appropriate to
>> compare Procrustes variances from separate Procrustes fits or from a
>> Procrustes fit of the pooled total dataset.
>>
>> Thanks in advance for help
>>
>> Louis Boell
>>
>>
>>
>> Louis Boell
>> MPI für Evolutionsbiologie
>> August-Thienemannstr.2
>> 24306 Plön
>> [email protected]
>> [email protected]
>>
>>
>>
>> --
>> Replies will be sent to the list.
>> For more information visit http://www.morphometrics.org