-------- Original Message --------
Subject: RE: Size-correction using regression residuals
Date: Wed, 30 Sep 2009 11:10:41 -0700 (PDT)
From: Cabo-Perez, Luis <[email protected]>
To: [email protected] <[email protected]>
References: <[email protected]>
Rebeca,
I am not sure I have understood exactly what you did but, just as a
quick thought, if you used the residuals of predicted versus observed
centroid size, the result would be the variation in size, rather than in
shape, not explained by allometry. If this is what you are describing
(and, again, I am not sure I have got it right), what you would have
estimated in your within-group analysis would be body condition. Many of
the big guys in your within-group sample are likely to have a better
body condition (more time to feed, less predatory stress, more food
sources, maybe even including the little guys if you have many
different age classes; if they are all from similar age classes, the big
guys are likely to be bigger just because of factors affecting body
condition itself), so they are likely to still be big guys after your
correction. You may want to regress your variables on centroid size one
by one, rather than in a multivariate regression (similar, at least in
spirit, to Burnaby's transformation), or simply run a PCA with
centroid size added as a variable, and then enter into your DFA just
those components not showing high communalities with size. However, if
what you want is simply to figure out the main size-free
differences between the groups, I think the easiest and
cleanest solution would be to stick to the DFA with the original
Procrustes coordinates, and then check the correlation of each of the
roots with centroid size. My guess is that, if you have high size
variability, the first canonical root will still mainly express
allometric size, and most of the remaining ones will already be shape
variables. I personally prefer to look at the variability first, and
then try to figure out its sources.
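As a rough illustration of the two checks suggested above, here is a
minimal Python/NumPy sketch. It is only a sketch under stated
assumptions: `scores` is taken to hold canonical (DFA) scores exported
from whatever software was used, `shape` is an n x p array of shape
variables, and `size` is centroid size. The function names and the 0.3
cutoff are hypothetical choices for illustration, not established rules.

```python
import numpy as np

def size_correlations(scores, size):
    """Pearson correlation of each column of `scores` with `size`."""
    scores = np.asarray(scores, dtype=float)
    size = np.asarray(size, dtype=float)
    sc = scores - scores.mean(axis=0)   # center each root / component
    sz = size - size.mean()             # center size
    denom = np.sqrt((sc ** 2).sum(axis=0) * (sz ** 2).sum())
    return (sz @ sc) / denom

def pcs_weakly_tied_to_size(shape, size, max_abs_r=0.3):
    """PCA of shape variables plus centroid size (correlation matrix).

    Returns the scores of the components whose |r| with size is below
    `max_abs_r` (an arbitrary, illustrative cutoff), together with the
    correlations of all retained-rank components with size.
    Assumes no column is constant.
    """
    data = np.column_stack([shape, size]).astype(float)
    # Standardize so that size and shape variables are on the same scale.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    nonzero = s > 1e-10 * s[0]          # drop numerically null components
    pc_scores = (u * s)[:, nonzero]
    r = size_correlations(pc_scores, size)
    keep = np.abs(r) < max_abs_r
    return pc_scores[:, keep], r
```

Calling `size_correlations(scores, size)` on the roots of the DFA run on
the original Procrustes coordinates shows which roots track size; the
second function returns candidate components to enter into the DFA.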
Interesting problem anyway. I am curious about whether the apparently
exaggerated group differences also result in better percentages of
correct classification or not. The question is: should we stick to the
classification method providing the best correct classification rates,
regardless of whether it makes biological sense?
And then, if I misunderstood what you actually did, dismiss this one and
feel free to send me a dunce hat. If you used residuals for independent
variables on centroid size, then I cannot see a clear explanation,
other than problems of linearity in the bivariate regressions, or
variability in body condition, combined with a wide range of age
classes, playing a big role. But I am still in favor of using PCAs or
other variable reduction methods first, if what you want is simply to
control for allometric size, and leaving initial regressions on the
original variables (linear or landmark) for the study of allometry
itself. I am very curious about other people's take on the problem.
Cheers,
Luis
________________________________
Luis Cabo
Mercyhurst Archaeological Institute,
Department of Applied Forensic Sciences,
Mercyhurst College,
Erie, PA
E-mail: [email protected]
-------- Original Message --------
Subject: Size-correction using regression residuals
Date: Wed, 30 Sep 2009 04:28:27 -0700 (PDT)
From: Rebeca <[email protected]>
To: <[email protected]>
Dear morphometricians,
I am studying morphological variation of a fish species around the
Iberian Peninsula in order to identify possible phenotypic stocks, using
13 landmarks to analyze shape. I have samples from 7 different
locations, but as the size composition in my samples differs among
locations, I want to eliminate the variation associated with size before
I do a discriminant analysis.
For that purpose, I have been using the residuals from a multivariate
pooled-within group regression of shape (Procrustes coordinates) on size
(Centroid size) to obtain “size-free” variables that can be used as
input in the discriminant function analysis (DFA).
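For readers who want to reproduce this kind of size correction, the
following is a minimal Python/NumPy sketch, not the exact procedure or
software used here. It assumes `shape` is an n x p array of Procrustes
coordinates, `size` holds centroid size (often log-transformed in
practice), and `groups` holds the location labels. The function name and
the choice to express residuals around the grand mean are illustrative
assumptions.

```python
import numpy as np

def pooled_within_group_residuals(shape, size, groups):
    """Residuals of shape on size using a pooled within-group slope.

    The slope is estimated after centering shape and size on their
    group means, so differences between group means do not influence
    it. Residuals are then computed for all specimens around the
    grand mean (an assumption of this sketch).
    """
    shape = np.asarray(shape, dtype=float)
    size = np.asarray(size, dtype=float)
    groups = np.asarray(groups)

    # Center shape and size within each group.
    shape_c = shape.copy()
    size_c = size.copy()
    for g in np.unique(groups):
        m = groups == g
        shape_c[m] -= shape[m].mean(axis=0)
        size_c[m] -= size[m].mean()

    # One least-squares slope per shape variable, pooled over groups.
    slopes = (size_c @ shape_c) / (size_c @ size_c)

    # Remove the size-predicted component from every specimen.
    residuals = (shape - shape.mean(axis=0)) - np.outer(size - size.mean(), slopes)
    return residuals, slopes
```

The returned `residuals` would then be the input to the DFA. Note that
this sketch assumes a single linear allometric slope shared by all
groups.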
However, I have noticed that when I use these residuals, the degree of
separation among my samples in the DFA seems to be a bit exaggerated
(i.e. samples that are supposed to be similar appear to be completely
separated from each other).
Because of that, I decided to do an exercise to see the effect of
size-correction on the discrimination of my samples:
First, I separated some specimens from the same location into 2 groups
of “big” and “small” specimens:
Group 1: 74 specimens with a centroid size >20 cm
Group 2: 70 specimens with a centroid size <13 cm
Then, I did a DFA with the 2 groups: first with the shape variables
without size-correction and then using the residuals that I mention above.
As expected, I found differences between the 2 groups when no
size-correction was done, due to the differences in size composition of
the samples.
After size-correction, I expected no differences between the two groups,
since all specimens (big and small) come from the same location;
however, in the DFA the degree of separation between the 2 groups
increased even more than when I used no size-correction!
I am very interested to know if someone has had a similar experience
with size-correction or if anyone can comment on what could be happening
here.
Thanks in advance,
Rebeca
Rebeca P. Rodriguez Mendoza, PhD candidate
Instituto de Investigaciones Marinas CSIC- Fisheries group
C/ Eduardo Cabello, 6
Tel. 986 23 19 30 ext. 240
Fax: 986 29 27 62
36 208 Vigo, ESPAÑA
--
Replies will be sent to the list.
For more information visit http://www.morphometrics.org