Well, Andrea, it appears that your empirical optimism trumps my theoretical 
cynicism!  I probably could have chosen better shapes (I wanted a simple 
example that seemed to comply with your 1/30th Rsq rule of thumb), used 
different sample sizes, included more “species” than a square and a rectangle, 
made sure the Euclidean distance in tangent space tracked Procrustes distances 
better, and assured homogeneity of variance, and only after all of that 
illustrated that systematic measurement error persists even when it is 
apparently subsumed by a small Rsq in ANOVA.  (But the residual Rsq might then 
have been higher which, given your proposed 1/30th of the individual Rsq, I 
assume would not have been consistent with the level of shape variation you 
felt was warranted; hence the extreme simulation.  I did try smaller disparity 
in shape and the pattern holds.)  The point was not to find an infallible 
example but to show that (1) systematic measurement error, even if apparently 
small, can still be a problem, and (2) systematic bias can align with other 
signals, something that would not be picked up in an ANOVA table.
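
How an unchanged bias can hide behind a shrinking Rsq is simple arithmetic.  
Using the sums of squares from the two ANOVA tables quoted further down in 
this thread (a quick check, in Python only for compactness; the analyses 
themselves are in R with geomorph), the residual SS barely changes while the 
total SS roughly triples, so the apparent ME share collapses:

```python
# sums of squares copied from the two quoted ANOVA tables
ss_res_1, ss_tot_1 = 0.08306, 1.63039   # squares only
ss_res_2, ss_tot_2 = 0.07140, 4.98010   # squares plus long rectangles

rsq_me_1 = ss_res_1 / ss_tot_1  # ME share of variation, first sample
rsq_me_2 = ss_res_2 / ss_tot_2  # same bias, more disparate sample
print(round(rsq_me_1, 3), round(rsq_me_2, 3))  # 0.051 0.014
```

The absolute residual variation (the bias plus random error) is essentially 
unchanged; only the denominator grew.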

I’ll offer an additional example based on real experience in my lab, as 
something I hope serves as a bit of allegory (although I am not trying to 
persuade you, Andrea; others might be interested).  I once had a cadre of 
students, and we needed to photograph and digitize thousands of fish specimens 
from museum collections.  Before embarking, I had the students digitize the 
same set of small, minnow-like fish, combined the landmark data, and looked 
for systematic biases in digitization style via GPA and then PC plots.  Sure 
enough, students’ replicates showed consistent shifts of points in the plot, 
owing to slight variations in style.  (It would not have been easy to wait 
until thousands of photographs had been taken and then randomize images to 
force the systematic bias to behave more like random error, although 
randomization like this would be ideal.  Given schedules and the need for some 
of the same people to both photograph and digitize, the work had to be 
processed in batches.)  We identified tendencies and worked with the students 
until the measurement error looked more random.  That was comforting, but 
because the individuals were all small fish from just a couple of species, the 
Rsq remained pretty high.
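
The logic of that pilot check can be caricatured very simply: systematic 
digitizer bias shows up as a non-zero mean replicate displacement at a 
landmark, while random error averages toward zero.  A toy sketch (Python for 
compactness; the digitizers, landmark counts, and the 0.3 shift are all 
invented for illustration):

```python
import random
random.seed(1)

# hypothetical pilot: 20 specimens, 4 landmarks, each digitized by A and B;
# B habitually shifts landmark 2 to the right by 0.3 (the systematic bias)
n_spec, n_lm, shift, sd = 20, 4, 0.3, 0.05

def digitize(true_lm, bias=0.0):
    # noisy (x, y) per landmark; 'bias' shifts landmark index 2 in x only
    return [(x + (bias if i == 2 else 0.0) + random.gauss(0, sd),
             y + random.gauss(0, sd)) for i, (x, y) in enumerate(true_lm)]

true_shapes = [[(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n_lm)]
               for _ in range(n_spec)]
rep_a = [digitize(s) for s in true_shapes]              # digitizer A
rep_b = [digitize(s, bias=shift) for s in true_shapes]  # digitizer B

def mean_dx(r1, r2, lm):
    # mean x-displacement between digitizers at one landmark
    return sum(b[lm][0] - a[lm][0] for a, b in zip(r1, r2)) / len(r1)

md_biased, md_clean = mean_dx(rep_a, rep_b, 2), mean_dx(rep_a, rep_b, 0)
print(md_biased)  # near 0.3: a consistent, directional shift (bias)
print(md_clean)   # near 0: random error cancels out
```

The real check was, of course, multivariate (GPA followed by PC plots), but 
the diagnostic idea is the same: look for directionality in the replicates, 
not just their magnitude.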

If, as an alternative, we had been performing a larger macroevolutionary study 
and had included fish of vastly different shapes in our systematic-bias 
experiment, maybe ME would have appeared so small that, following your 
argument, we would shrug our shoulders and move on.  But now suppose student A 
digitized several species that were quite similar to species student B 
digitized, all in the same clade.  Even if the shape variation among their 
combined species was small compared to the larger sample comprising many 
different species and clades, is this okay?  If we wanted to measure 
evolutionary rates, would it not be a problem that systematic biases affected 
only a specific clade of similarly shaped fishes?  I would argue that we could 
under- or overestimate evolutionary rates, simply because in the pilot test we 
based our evaluation on an Rsq value that obscured the systematic ME.
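
The rate worry can be made concrete with a deliberately crude one-dimensional 
sketch (all numbers invented): give half of one clade's specimens a small 
systematic offset, as if one student had digitized that batch, and use 
within-clade variance as a stand-in for a rate estimate:

```python
import random, statistics
random.seed(2)

# two clades of 1-D "shape" scores, far apart; the second batch of
# clade B carries a small systematic digitizing offset
clade_a = [random.gauss(0.0, 0.1) for _ in range(40)]
clade_b_true = [random.gauss(5.0, 0.1) for _ in range(40)]
offset = 0.2  # tiny next to the 5-unit clade separation
clade_b = clade_b_true[:20] + [x + offset for x in clade_b_true[20:]]

# crude "rate" stand-in: within-clade variance
rate_b_true = statistics.pvariance(clade_b_true)
rate_b_bias = statistics.pvariance(clade_b)

# pooled view: extra variance the offset adds, as a share of the total
total_true = statistics.pvariance(clade_a + clade_b_true)
total_bias = statistics.pvariance(clade_a + clade_b)
share = (total_bias - total_true) / total_bias

print(rate_b_bias / rate_b_true)  # clade B's "rate" roughly doubles
print(share)                      # yet the bias is a small slice of the total
```

The pooled summary says the bias is trivial; the clade-level quantity of 
interest is badly distorted.  That is the asymmetry the ANOVA table hides.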

I know you would probably be cognizant of such things and use different 
samples to align with your focus, but I think it is generally more important 
that we not lose the theoretical forest for the empirical trees.  
Acknowledging a systematic bias and ignoring it is one thing.  Not recognizing 
it is another.  Failing to consider analyses that might help one understand 
whether ME has a systematic signal (rather than simply not caring whether the 
signal is systematic or random) would be unfortunate.  So if somebody has an 
empirical data set that contains not only minnows but maybe also some puffers, 
sharks, eels, and ocean sunfish, I sure hope they would not scoff at ME just 
because the Rsq for repeated measures is small.  Sure, the diagnostic steps 
(plus others) that you performed should be done, but not doing those things in 
the example I provided does not create suspicion that the correlated shifts in 
position in the PC plot are spurious; they reflect what was simulated.

By the way, the heterogeneity in variance between squares and rectangles comes 
from scaling configurations of very different sizes to unit size, after 
simulating both with the same sd at the points.  If I had not been working 
quickly, I might have thought about that and made the templates more similar 
in size, or varied the simulation sd in proportion to object size.  I hope you 
are not implying that, by not doing so, the simulated bias is invalidated.
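
The roughly 50-fold heterogeneity Andrea measured falls out of the geometry 
alone.  If the square template had side 5 (an assumption; the template 
definition is not in the pasted script), then with the same absolute landmark 
sd, scaling both shapes to unit centroid size inflates the squares' relative 
noise variance by the squared ratio of centroid sizes.  A quick check (in 
Python, though the simulation itself is in R):

```python
import math

def centroid_size(pts):
    # square root of summed squared distances of landmarks from their centroid
    cx = sum(x for x, _ in pts) / len(pts)
    cy = sum(y for _, y in pts) / len(pts)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in pts))

square = [(0, 0), (5, 0), (0, 5), (5, 5)]       # assumed side-5 template
rectangle = [(0, 0), (50, 0), (0, 5), (50, 5)]  # template from the script

ratio = (centroid_size(rectangle) / centroid_size(square)) ** 2
print(ratio)  # 50.5: same landmark sd, ~50x the variance after unit scaling
```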

Cheers!
Mike

> On Nov 9, 2022, at 1:27 PM, alcardini <[email protected]> wrote:
> 
> Dear Mike,
> thanks for the interesting example.
> My answers:
> 
> 1) Before worrying about ME, if I had those data, I'd worry about the poor 
> tangent space approximation. 
> <image.png>
> 
> 2) I'd also worry that, for comparing groups (squares vs rectangles), all the 
> tests I use require homogeneity of variance and covariance. Variance (the 
> trace of the covariance matrix) in the squares is ca 50 times larger than in 
> the rectangles (when I re-run your script). One sees that also in the PC1-PC2 
> scatterplot, if you color the groups:
> <image.png>
> 3) I would worry much less about the bias in this specific example. If the 
> hypothesis I am testing is group differences, yes, the bias slightly inflates 
> the differences. Does that change my conclusions? I'd say no, because the 
> differences are so huge that the groups are perfectly separated regardless of 
> the bias
> <image.png>
> and in the visualization the mean of the squares looks approximately like a 
> square:
> <image.png>
> and the mean of rectangles looks like a very elongated rectangle:
> <image.png>
> as in the model that generated the data.
> 
> 
> It is a nice example. I would not be happy about the bias. But where you see 
> the glass half empty, I see it almost completely full.
> The issue of biases in ME is important. I do look forward to reading papers 
> that develop and detail methods to assess them (including how biases affect 
> the assumptions of the models, not just the specific hypothesis being tested).
> 
> For now, my main concern remains whether one has a flaw in the experimental 
> design such that a relevant source of ME goes undetected. Probably we should 
> spend more time on this and develop checklists and protocols that help in the 
> most common cases.
> Cheers
> 
> Andrea
> 
> 
> 
> On Tue, 8 Nov 2022 at 19:16, Mike Collyer <[email protected]> wrote:
>> Dear Andrea,
>> 
>> I have to argue against one of your points.
>>> 
>>> Nevertheless, I could miss a bias, but if ME has an Rsq of, say, less than 
>>> 1/30 of individual variation within species, when I test species the bias 
>>> will be negligible. This is, if I am correct, what you implied when you 
>>> wrote that "one can argue that if measurement error is very small, then 
>>> randomness and homogeneity across groups are less of an issue”.
>> 
>> If we come full-circle to Philipp’s first point (that choice of individuals 
>> can mislead one’s interpretation), I believe it is dangerous to use a value 
>> of Rsq to conclude that systematic ME (bias) is negligible.  I hope I can 
>> demonstrate this with an example (in R).
>> 
>> To set this up, I create 10 shapes based on a template that is a square.  I 
>> then add a digitizing bias by shifting two of the four landmarks (plus some 
>> random error).
>> 
>> > # Create 10 specimens from a square template
>> > # (template definition missing from the pasted transcript; a side-5
>> > # square is assumed, consistent with the ~50x variance ratio Andrea reports)
>> > 
>> > library(geomorph)
>> > mat <- matrix(c(0, 0, 5, 0, 0, 5, 5, 5), 4, 2, byrow = T)
>> > coords1 <- lapply(1:10, function(.) mat + rnorm(8, sd = 1))
>> > 
>> > # Add digitizing bias for each, shifting two landmarks a little right
>> > # plus add a little random error
>> > 
>> > coords2 <- lapply(coords1, function(x) 
>> +   x + matrix(c(0, 0, 1.5, 0, 0, 0, 1.5, 0), 4, 2, byrow = T) + 
>> +   rnorm(8, sd = 0.1))
>> > 
>> > # string together and test for ME
>> > 
>> > lmks <- simplify2array(c(coords1, coords2))
>> > GPA <- gpagen(lmks, print.progress = FALSE)
>> > ind <- factor(c(rep(1:10, 2)))
>> > summary(procD.lm(coords ~ ind, data = GPA))
>> 
>> Analysis of Variance, using Residual Randomization
>> Permutation procedure: Randomization of null model residuals 
>> Number of permutations: 1000 
>> Estimation method: Ordinary Least Squares 
>> Sums of Squares and Cross-products: Type I 
>> Effect sizes (Z) based on F distributions
>> 
>>           Df      SS       MS     Rsq    F      Z Pr(>F)   
>> ind        9 1.54733 0.171926 0.94906 20.7 5.5944  0.001 **
>> Residuals 10 0.08306 0.008306 0.05094                      
>> Total     19 1.63039                                       
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> 
>> Call: procD.lm(f1 = coords ~ ind, data = GPA)
>> 
>> 
>> 
>> If we plot PC scores, the systematic bias is obvious:
>> 
>> 
>> 
>> > # plot PC scores, with lines showing systematic ME
>> > 
>> > PCA <- gm.prcomp(GPA$coords)
>> > plot(PCA, pch = 19, asp = 1, col = rep(1:2, each = 10))
>> > 
>> > for(i in 1:10) {
>> +   points(rbind(PCA$x[i,], PCA$x[10 + i,]),
>> +          type = "l",
>> +          lty = 3)
>> + }
>> 
>> <PastedGraphic-1.tiff>
>> 
>> So one might see the bias in the plot, and the 5% ME (if we want to call it 
>> that, based on the Rsq in the ANOVA) might be too high for one’s comfort.  
>> But now let’s repeat the process on 10 more specimens, using a long 
>> rectangle template instead of a square.
>> 
>> 
>> > # Now add some more individuals to the mix, perhaps from
>> > # a much differently shaped species (long rectangle, not square)
>> > # using the same strategy
>> > 
>> > mat3 <- matrix(c(0, 0, 50, 0, 0, 5, 50, 5), 4, 2, byrow = T)
>> > coords3 <- lapply(1:10, function(.) mat3 + rnorm(8, sd = 1))
>> > coords4 <- lapply(coords3, function(x) 
>> +   x + matrix(c(0, 0, 1.5, 0, 0, 0, 1.5, 0), 4, 2, byrow = T) + 
>> +   rnorm(8, sd = 0.1))
>> > 
>> > 
>> > lmks <- simplify2array(c(coords1, coords2, coords3, coords4))
>> > GPA <- gpagen(lmks, print.progress = FALSE)
>> > ind <- factor(c(rep(1:10, 2), rep(11:20, 2)))
>> > summary(procD.lm(coords ~ ind, data = GPA))
>> 
>> Analysis of Variance, using Residual Randomization
>> Permutation procedure: Randomization of null model residuals 
>> Number of permutations: 1000 
>> Estimation method: Ordinary Least Squares 
>> Sums of Squares and Cross-products: Type I 
>> Effect sizes (Z) based on F distributions
>> 
>>           Df     SS       MS     Rsq     F      Z Pr(>F)   
>> ind       19 4.9087 0.258351 0.98567 72.39 8.8918  0.001 **
>> Residuals 20 0.0714 0.003569 0.01433                       
>> Total     39 4.9801                                        
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> 
>> Call: procD.lm(f1 = coords ~ ind, data = GPA)
>> > 
>> > 
>> > PCA <- gm.prcomp(GPA$coords)
>> > P <- plot(PCA, pch = c(rep(19, 20), rep(20, 20)), asp = 1,
>> +           col = rep(rep(1:2, each = 10), 2))
>> > 
>> > for(i in 1:10) {
>> +   points(rbind(PCA$x[i,], PCA$x[10 + i,]),
>> +          type = "l",
>> +          lty = 3)
>> + }
>> 
>> <PastedGraphic-2.tiff>
>> 
>> 
>> Note that the same 10 vectors are shown in this PC plot as in the first, 
>> but 20 more values have been added (the cluster of points to the right).  
>> The mean is no longer the mean of 20 square-like shapes but the mean of all 
>> 40 configurations, with the square-like shapes now having negative PC 
>> scores in the plot.  Square shapes and long rectangle shapes are clearly 
>> separated in this plot.  Here is a transformation grid (scaled 1x) for the 
>> approximate middle of the points on the left:
>> 
>> <PastedGraphic-3.png>
>> 
>> and the same for the cluster of points on the right:
>> 
>> <PastedGraphic-4.png>
>> 
>> But let’s pay attention to the same 20 configurations in both plots.  Now 
>> the systematic ME is clearly associated with the first PC, which also 
>> represents more of the overall shape variation, and the signal remains even 
>> though the ANOVA results suggest it is no big deal (1.4% of the variation).  
>> Worse, the bias now appears to be associated with, e.g., species 
>> differences.
>> 
>> The bias in this example did not become negligible despite the change in 
>> sample, and despite the conclusion to the contrary that might be drawn from 
>> the ANOVA results.  Again, evaluating the relative portion of variance 
>> explained (especially if based on the dispersion of points alone) is 
>> dangerous, and a comforting statistic should not be sufficient evidence to 
>> stop worrying about systematic measurement error.
>> 
>> Best,
>> Mike
>> 
>> 
> 
> 
> -- 
> E-mail address: [email protected], [email protected]
> WEBPAGE: https://sites.google.com/view/alcardini2/
> or https://tinyurl.com/andreacardini
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Morphmet" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected] 
> <mailto:[email protected]>.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/morphmet2/CAJ__j7PtwTnGWOXkn1naJ%2B6UEt8mcQ-cTXG42qtsqCP%3DUqADOQ%40mail.gmail.com
>  
> <https://groups.google.com/d/msgid/morphmet2/CAJ__j7PtwTnGWOXkn1naJ%2B6UEt8mcQ-cTXG42qtsqCP%3DUqADOQ%40mail.gmail.com?utm_medium=email&utm_source=footer>.
