I wonder whether it would help to be stricter about the use of the word
"bias". There is the statistical meaning, where the problem lies with the
statistical estimate being used. One must treat and correct for that
differently than if the problem is that the investigator is making the
measurements themselves incorrectly. With a statistic, one can investigate its
properties assuming various statistical distributions. I am not sure how to
investigate theoretically the effect of an investigator who systematically
measures something a little differently than intended, or at least differently
from other investigators working on the same or similar material. They are
effectively measuring a different variable. Suggestions for a different
word?

__________________
F. James Rohlf, Distinguished Prof. Emeritus
Dept. Anthropology and Ecology & Evolution
Stonybrook University

-------- Original message --------
From: Mike Collyer <[email protected]>
Date: 11/8/22  1:16 PM  (GMT-05:00)
To: andrea cardini <[email protected]>
Cc: [email protected]
Subject: Re: [MORPHMET2] Measurement error in geometric morphometrics

Dear Andrea,

I have to argue against one of your
points.

    Nevertheless, I could miss a bias, but if ME has an Rsq of, say, less
    than 1/30 of individual variation within species, when I test species the
    bias will be negligible. This is, if I am correct, what you implied when
    you wrote that "one can argue that if measurement error is very small,
    then randomness and homogeneity across groups are less of an issue".

If we come full circle to Philipp’s first point, that choice of individuals
can mislead one’s interpretation, I believe it is dangerous to use a value of
Rsq to conclude that systematic ME (bias) is negligible. I hope I can
demonstrate this with an example (in R).

To set this up, I create 10 shapes based on a template that is a square. I
then add a digitizing bias by shifting two of the four landmarks (plus some
random error).

> library(geomorph)  # provides gpagen, procD.lm and gm.prcomp
>
> # `mat`, the square template, is not defined in this message as archived.
> # A hypothetical stand-in (side length assumed, by analogy with mat3 below):
> # mat <- matrix(c(0, 0, 5, 0, 0, 5, 5, 5), 4, 2, byrow = T)
>
> # Create 10 specimens
>
> coords1 <- lapply(1:10,
function(.) mat + rnorm(8, sd = 1))
>
> # Add digitizing bias for each, shifting two landmarks a little right
> # plus add a little random error
>
> coords2 <- lapply(coords1, function(x)
+   x + matrix(c(0, 0, 1.5, 0, 0, 0, 1.5, 0), 4, 2, byrow = T) + rnorm(8, sd = 0.1))
>
> # string together and test for ME
>
> lmks <- simplify2array(c(coords1, coords2))
> GPA <- gpagen(lmks, print.progress = FALSE)
> ind <- factor(c(rep(1:10, 2)))
> summary(procD.lm(coords ~ ind, data = GPA))

Analysis of Variance, using Residual Randomization
Permutation procedure: Randomization of null model residuals
Number of permutations: 1000
Estimation method: Ordinary Least Squares
Sums of Squares and Cross-products: Type I
Effect sizes (Z) based on F distributions

          Df      SS       MS     Rsq    F      Z Pr(>F)
ind        9 1.54733 0.171926 0.94906 20.7 5.5944  0.001 **
Residuals 10 0.08306 0.008306 0.05094
Total     19 1.63039
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Call: procD.lm(f1 = coords ~ ind, data = GPA)

If we plot PC scores, the systematic bias is obvious:

> # plot PC scores, with lines showing systematic ME
>
> PCA <- gm.prcomp(GPA$coords)
> plot(PCA, pch = 19, asp = 1, col = rep(1:2, each = 10))
>
> for(i in 1:10) {
+   points(rbind(PCA$x[i,], PCA$x[10 + i,]),
+          type = "l",
+          lty = 3)
+ }

So one might see the bias in the plot, and the 5% ME (if we want to call it
that, based on Rsq in the ANOVA) might be too high for one’s comfort. But now
let’s repeat the process on 10 specimens, using a long rectangle as the
template instead of a square.

> # Now add some more individuals to the mix,
> # perhaps from a much differently shaped species (long rectangle, not square)
> # using the same strategy
>
> mat3 <- matrix(c(0, 0, 50, 0, 0, 5, 50, 5), 4, 2, byrow = T)
> coords3 <- lapply(1:10, function(.) mat3 + rnorm(8, sd = 1))
> coords4 <- lapply(coords3, function(x)
+   x + matrix(c(0, 0, 1.5, 0, 0, 0, 1.5, 0), 4, 2, byrow = T) + rnorm(8, sd = 0.1))
>
> lmks <- simplify2array(c(coords1, coords2, coords3, coords4))
> GPA <- gpagen(lmks, print.progress = FALSE)
> ind <- factor(c(rep(1:10, 2), rep(11:20, 2)))
> summary(procD.lm(coords ~ ind, data = GPA))

Analysis of Variance, using Residual Randomization
Permutation procedure: Randomization of null model residuals
Number of permutations: 1000
Estimation method: Ordinary Least Squares
Sums of Squares and Cross-products: Type I
Effect sizes (Z) based on F distributions

          Df     SS       MS     Rsq     F      Z Pr(>F)
ind       19 4.9087 0.258351 0.98567 72.39 8.8918  0.001 **
Residuals 20 0.0714 0.003569 0.01433
Total     39 4.9801
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Call: procD.lm(f1 = coords ~ ind, data = GPA)

> PCA <- gm.prcomp(GPA$coords)
> P <- plot(PCA, pch = c(rep(19, 20), rep(20, 20)), asp = 1,
+           col = rep(rep(1:2, each = 10), 2))
>
> for(i in 1:10) {
+   points(rbind(PCA$x[i,], PCA$x[10 + i,]),
+          type = "l",
+          lty = 3)
+ }

Note that the corresponding 10 vectors are
shown in this PC plot as in the first, but 20 more values have been added (the
cluster of points to the right). The mean is no longer the mean of 20
square-like shapes, but is the mean of 40 rectangles, with the square-like
shapes now having negative PC scores in the plot. Square shapes and long
rectangle shapes are clearly separated in this plot. Here is a transformation
grid (scaled 1x) for the approximate middle of the points on the left:

[transformation grid image not preserved]

and the same for the cluster of points on the right:

[transformation grid image not preserved]

But let’s pay attention to the same 20 configurations in both plots. Now the
systematic ME is clearly associated with the first PC, which also represents
more of the overall shape variation, and the signal remains even though the
ANOVA results suggest this is no big deal (1.4% of variation). Worse, the bias
now appears to be associated with, e.g., species differences.

The bias in this example did not become negligible in spite of changing the
sample, and in spite of a conclusion to the contrary that might be drawn from
the ANOVA results. Again, evaluating the relative portion of variance
explained (especially if based on dispersion of points alone) is dangerous,
and a comforting statistic should not be sufficient evidence to stop worrying
about a systematic measurement error.

Best,
Mike



-- 
You received this message because you are subscribed to the Google Groups 
"Morphmet" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/morphmet2/C30FAD86-E64E-4AEB-8B8C-041768B131D8%40gmail.com.

