Re: [MORPHMET2] Measurement error in geometric morphometrics

Mike Collyer Fri, 04 Nov 2022 10:59:22 -0700

I agree with Philipp’s main point: it can be dangerous to quantify measurement 
error as a value (likely a ratio) that depends on the variation among the same 
individuals on which the repeated digitizations were made, if it is not clear 
how variable those individuals are.  I was seeking examples to demonstrate that 
small measurement error can look large when individuals are not that different 
in shape, or that large measurement error can look small when they are.  I was 
not very successful before Philipp responded.  However, I did play with the 
“mosquito” data set in geomorph, which led me in a different direction.  I 
chose this data set because it contains two replicate configurations for each 
individual.


For context, here is the analysis I considered:

> library(geomorph)
> data("mosquito")
> 
> # use just one side for demonstration
> # residual SS can be considered the basis for measurement error
> 
> lmks <- mosquito$wingshape[,, which(mosquito$side == 1)]
> ind <- mosquito$ind[ which(mosquito$side == 1)]
> GPA <- gpagen(lmks, print.progress = FALSE)
> summary(procD.lm(coords ~ ind, data = GPA))

Analysis of Variance, using Residual Randomization
Permutation procedure: Randomization of null model residuals 
Number of permutations: 1000 
Estimation method: Ordinary Least Squares 
Sums of Squares and Cross-products: Type I 
Effect sizes (Z) based on F distributions

          Df       SS        MS     Rsq      F      Z Pr(>F)   
ind        9 0.069286 0.0076984 0.62764 1.8729 2.6261  0.006 **
Residuals 10 0.041105 0.0041105 0.37236                        
Total     19 0.110390                                          
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Call: procD.lm(f1 = coords ~ ind, data = GPA)
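
As a quick check on how to read the table above, the Rsq and F columns can be 
recomputed from the SS and Df columns.  This is only arithmetic on the printed 
values; the variable names are hypothetical and chosen for illustration.

```r
# Values copied from the ANOVA table above
SS <- c(ind = 0.069286, Residuals = 0.041105)
Df <- c(ind = 9, Residuals = 10)

# Rsq is each SS as a fraction of the total SS
Rsq <- SS / sum(SS)

# F is the ratio of mean squares (MS = SS / Df)
MS <- SS / Df
F.ind <- unname(MS["ind"] / MS["Residuals"])

round(Rsq, 4)    # ind about 0.6276, Residuals about 0.3724
round(F.ind, 3)  # about 1.873
```

Because F is built from mean squares rather than sums of squares, the residual 
SS can exceed the individual SS while F stays above 1 (or the reverse), 
depending on the degrees of freedom.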

It might be alarming that the residual Rsq is 0.37236, which is the portion of 
variation attributed to multiple measurements on the same individuals.  That 
might seem high.  I grew quickly tired of searching for a similar data set with 
contrasting results and decided that maybe I could just simulate measurement 
error and ask if the residual SS here was large compared to what I simulated.  
I thought about this as a process and came to the conclusion that one could 
simulate landmark wobble (like a shaky hand) by making the standard deviation 
of wobble sampled from a normal distribution proportional to a fraction of the 
centroid size.  For example, a 5% error could mean that the standard deviation 
for the distribution from which a random value is sampled (x, y, or z 
coordinate) is 0.05 * CS for that configuration.  (The shakiness scales with 
the size of the object.)
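
To make the wobble idea concrete, here is a minimal sketch in base R for a 
single configuration.  It assumes centroid size is the square root of the 
summed squared distances of the landmarks from their centroid.  The helper 
names (centroid.size, add.wobble) and the toy square configuration are 
hypothetical, for illustration only.

```r
# Centroid size of one p x k landmark matrix:
# square root of the summed squared deviations from the centroid
centroid.size <- function(lm) {
  centered <- scale(lm, center = TRUE, scale = FALSE)
  sqrt(sum(centered^2))
}

# Add independent normal noise to every coordinate,
# with sd equal to per.error * centroid size
add.wobble <- function(lm, per.error = 0.05) {
  sdev <- per.error * centroid.size(lm)
  lm + matrix(rnorm(length(lm), sd = sdev), nrow(lm), ncol(lm))
}

# Toy example: four landmarks on a unit square (hypothetical data)
sq <- matrix(c(0, 0,
               1, 0,
               1, 1,
               0, 1), ncol = 2, byrow = TRUE)
set.seed(1)
centroid.size(sq)                  # sqrt(2), about 1.414
add.wobble(sq, per.error = 0.05)   # same dimensions, each coordinate jittered
```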

I ended up making a function that could simulate a measurement error outcome.  
Here is the function, in case anyone might find it useful (I have not tested 
this, so please expect clunkiness…).  The arguments are a set of coordinates 
(assumed to be a 3d array), the number of replicates to simulate (the observed 
configuration counts as 1), and the fraction of centroid size used as the sd 
of a random sample from a normal distribution.  It performs ANOVA for the 
simulated data (following GPA).


# requires geomorph (gpagen, procD.lm)
makeME <- function(coords, reps = 2, per.error = 0.05){
  # per.error means sd = per.error * Csize
  if(reps < 2)
    stop("Must have more than 1 replicate to run this.\n")
  dims <- dim(coords)
  n <- dims[3]  # number of configurations
  p <- dims[1]  # number of landmarks
  k <- dims[2]  # number of dimensions

  nms <- dimnames(coords)[[3]]
  if(is.null(nms)) nms <- paste("spec", 1:n, sep = "")
  Coords <- lapply(1:n, function(x) as.matrix(coords[,, x]))
  nnms <- paste(rep(nms, each = reps), 1:reps, sep = ".rep")

  # copy each configuration reps times; the first copy is left unchanged
  newCoords <- rep(Coords, each = reps)
  names(newCoords) <- nnms
  initGPA <- gpagen(coords, print.progress = FALSE, max.iter = 1)
  Csize <- rep(initGPA$Csize, each = reps)
  err <- rep(c(0, rep(per.error, reps - 1)), n)
  # add normal noise to every coordinate of the simulated replicates
  for(i in seq_along(err))
    newCoords[[i]] <- newCoords[[i]] + rnorm(p * k, sd = err[i] * Csize[i])
  newCoords <- simplify2array(newCoords)

  GPA <- gpagen(newCoords, print.progress = FALSE)

  ind <- factor(rep(1:n, each = reps))
  return(summary(procD.lm(coords ~ ind, data = GPA)))
}

And as an example application, using the same data as above:

> makeME(mosquito$wingshape[,, which(mosquito$side == 1)])

Analysis of Variance, using Residual Randomization
Permutation procedure: Randomization of null model residuals 
Number of permutations: 1000 
Estimation method: Ordinary Least Squares 
Sums of Squares and Cross-products: Type I 
Effect sizes (Z) based on F distributions

          Df      SS       MS     Rsq      F      Z Pr(>F)   
ind       19 0.91707 0.048267 0.56455 1.3647 3.2186  0.002 **
Residuals 20 0.70736 0.035368 0.43545                        
Total     39 1.62442                                         
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Call: procD.lm(f1 = coords ~ ind, data = GPA)

So I might conclude that my observed digitizing has a measurement error smaller 
than what I would get by letting my digitizing vary by 5% of centroid size 
(residual Rsq of 0.37236 observed versus 0.43545 simulated), which might help 
me to feel confident.  In case I worry that this one random outcome is not 
fully representative, the following function allows me to run many simulations 
(100 as an example).


simulate.makeME <- function(coords, reps = 2, per.error = 0.05, nsims = 100) {
  # keep the residual Rsq (second row of the ANOVA table) from each simulation
  result <- sapply(1:nsims, function(j) {
    cat("sim:", j, "... ")
    res <- makeME(coords, reps, per.error)
    res$table$Rsq[2]
  })
  cat("\n\n")
  names(result) <- paste("sim", 1:nsims, sep = ".")
  result
}

> ME.sims <- simulate.makeME(mosquito$wingshape[,, which(mosquito$side == 1)],
+                            reps = 2, per.error = 0.05, nsims = 100)
> summary(ME.sims) # just Rsq
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.4264  0.4423  0.4474  0.4476  0.4533  0.4729 

So now I feel really confident that measurement error is probably not a worry: 
every simulated residual Rsq (0.4264 to 0.4729) exceeds the observed 0.37236, 
and the simulations come from a process that imposes a known level of 
measurement error.

I might also start to wonder at what level the imposed randomness starts to 
approach what I see in my empirical example.

> makeME(mosquito$wingshape[,, which(mosquito$side == 1)], per.error = 0.03)

Analysis of Variance, using Residual Randomization
Permutation procedure: Randomization of null model residuals 
Number of permutations: 1000 
Estimation method: Ordinary Least Squares 
Sums of Squares and Cross-products: Type I 
Effect sizes (Z) based on F distributions

          Df      SS       MS     Rsq      F      Z Pr(>F)   
ind       19 0.49153 0.025870 0.62935 1.7873 5.7972  0.001 **
Residuals 20 0.28948 0.014474 0.37065                        
Total     39 0.78101                                         
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Call: procD.lm(f1 = coords ~ ind, data = GPA)

These results mimic my observed empirical results pretty well.  Maybe I can 
infer from this that my digitizing could be off by as much as 3% and produce 
results like I observed?
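
One way to make this matching less ad hoc would be to scan a grid of error 
levels and keep the one whose simulated residual Rsq comes closest to the 
observed value.  The following is a rough sketch, assuming a simulator with 
the same arguments and return value as simulate.makeME above.  The function 
match.error and the toy stand-in simulator are hypothetical; the stand-in is 
there only so that the block runs without geomorph.

```r
# Find the per.error value whose median simulated residual Rsq
# is closest to the observed residual Rsq.
# sim.fun must accept (coords, reps, per.error, nsims) and return
# a numeric vector of residual Rsq values, as simulate.makeME does.
match.error <- function(sim.fun, coords, observed.Rsq,
                        grid = seq(0.01, 0.10, by = 0.01),
                        reps = 2, nsims = 100) {
  med <- sapply(grid, function(e)
    median(sim.fun(coords, reps = reps, per.error = e, nsims = nsims)))
  gap <- abs(med - observed.Rsq)
  data.frame(per.error = grid, median.Rsq = med, closest = gap == min(gap))
}

# Toy stand-in (NOT a real simulation): residual Rsq rises with per.error
toy.sim <- function(coords, reps, per.error, nsims)
  pmin(1, 10 * per.error + rnorm(nsims, sd = 0.01))

set.seed(1)
match.error(toy.sim, coords = NULL, observed.Rsq = 0.37236)

# With the functions from this message and geomorph loaded, the call would be
# match.error(simulate.makeME, lmks, observed.Rsq = 0.37236)
```

With the real simulator this can be slow, since every grid value runs nsims 
ANOVAs, each with its own permutations.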

This is a different way of approaching the problem than calculating and trying 
to make sense of a statistic that might resemble an effect size, but it feels 
more informative to me.  I am not sure that it is smart to scale the amount of 
variation with centroid size (one might have large and small individuals but 
can zoom in or out to better capture landmark locations), so the function could 
be rewritten to not include centroid size as a variable.  I scaled by centroid 
size so that the error was simulated on the digitized specimens, but it could 
instead be simulated on configurations already constrained to unit size (after 
GPA).  I am also not sure that it is smart to sample from a normal 
distribution.  Maybe sampling from a uniform distribution would better resemble 
digitizing shakiness.  I only wandered so far into the weeds with this.
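
For anyone wanting to try the uniform alternative, the draws can be given the 
same standard deviation as the normal ones so that the two error models stay 
comparable.  A uniform distribution on (-a, a) has standard deviation 
a / sqrt(3), so a = sd * sqrt(3).  A small sketch (the function name 
wobble.unif is hypothetical):

```r
# Draw n uniform errors centered on 0 with a chosen standard deviation.
# Uniform(-a, a) has sd = a / sqrt(3), so a = sdev * sqrt(3).
wobble.unif <- function(n, sdev) {
  a <- sdev * sqrt(3)
  runif(n, min = -a, max = a)
}

set.seed(1)
x <- wobble.unif(1e5, sdev = 0.05)
sd(x)      # close to 0.05
range(x)   # bounded by +/- 0.05 * sqrt(3), about 0.0866
```

In makeME, a call like this would take the place of the rnorm() call, and 
dropping the Csize multiplier from the sd would give a version without size 
scaling.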

I think this might qualify as an additional exploratory approach and agree with 
Philipp that making sense of the magnitude and directions between repeated 
measures, even if only viewed in a PC plot, is rather important.  I’m sure this 
could be improved if someone wants to play more with other data sets.
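
A minimal sketch of the PC plot idea, in base R: project all configurations, 
replicates included, onto the first two PCs and join the replicates of each 
individual with a line segment, so that the length and direction of the 
segments can be compared with the spread among individuals.  It assumes a 
matrix Y of aligned shape coordinates with one row per configuration (with 
geomorph this could come from two.d.array(GPA$coords)) and exactly two 
replicates per individual.  The function name show.replicates and the toy 
data are hypothetical.

```r
# Y: n x (p*k) matrix of aligned shape coordinates, one row per configuration
# ind: factor identifying the individual for each row (two replicates each)
show.replicates <- function(Y, ind) {
  pc <- prcomp(Y)$x[, 1:2]
  plot(pc, asp = 1, pch = 19, xlab = "PC1", ylab = "PC2")
  # join the two replicates of each individual with a line segment
  for (i in levels(ind)) {
    r <- which(ind == i)
    segments(pc[r[1], 1], pc[r[1], 2], pc[r[2], 1], pc[r[2], 2])
  }
  invisible(pc)
}

# Toy data (hypothetical): 10 individuals, 2 replicates, 6 shape variables
set.seed(1)
ind <- factor(rep(1:10, each = 2))
true.shape <- matrix(rnorm(10 * 6, sd = 0.05), 10, 6)
Y <- true.shape[as.integer(ind), ] + rnorm(20 * 6, sd = 0.01)
pc <- show.replicates(Y, ind)
```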

Cheers!
Mike

> On Nov 4, 2022, at 10:38 AM, [email protected] <[email protected]> 
> wrote:
> 
> Dear all,
> 
> I would like to challenge this view on measurement error, as summarized by 
> Andrea, a bit more generally. 
> 
> Clearly, measurement error should be "small," but I disagree that "the idea 
> is that differences among individuals (averaged replicates) in a 
> representative sample should be larger than differences between replicates of 
> the same individual". First, the between-individual variance (or mean sum of 
> squares, MSS) depends on the choice of individuals. For instance, if the 
> sample comprises different species, the MSS between individuals is much 
> larger than for a sample of a single species, and the error MSS in relation 
> to the individual MSS is much smaller in the multi-species sample. Hence, 
> whether or not the error MSS is larger than the between-individual MSS is 
> somewhat arbitrary and of secondary importance anyway. "Controlling for main 
> effects," as suggested by Andrea, is possible but it removes the actual 
> signal against which I may want to compare the error. In either case, the 
> p-value of the MANOVA is uninformative because the underlying H0 is 
> irrelevant.
> 
> In my opinion, it is more important that the error is unrelated to the signal 
> of interest ("random"), rather than that it is small in terms of some summary 
> statistic. For instance, if in a growth study the measurement error is 
> uncorrelated with the age effects, the error "averages out" (if sample size 
> is large enough) and does not bias the average growth trajectory, even if the 
> error is large. The same applies to group differences. MANOVA does not inform 
> about this independence. Moreover, it pools over all shape coordinates. For 
> instance, it does not inform us if the error is large for shape features of 
> interest (those that differ between groups or correlate with age, etc.) or 
> for shape features of less interest. 
> 
> Note also that most morphometric analyses are based on a few principal 
> components (or similar statistics) of the shape coordinates. PCs are linear 
> combinations, i.e., weighted averages, of the shape coordinates. Hence, group 
> means in a PC plot are averages over all cases AND all variables, so that 
> random error can be expected to be small. Another issue to consider: If 
> measurement error is indeed approximately isotropic, it has a similar 
> magnitude for all shape features (all directions of shape space). The 
> individual variance, however, typically is much greater for large-scale shape 
> features than for small-scale features, and the relative magnitude of 
> measurement error decreases with increasing spatial scale. PCs typically 
> capture large-scale shape variation, where the relative error is expected to 
> be smaller. The same applies to the symmetric vs. asymmetric components, the 
> latter of which has much smaller individual variance and hence greater 
> relative measurement error.
> 
> The situation is slightly different in studies that compare shape variances, 
> not means, between groups, between symmetric and asymmetric components, or 
> among spatial scales. In contrast to mean estimates, measurement error does 
> not average out for these variance estimates. It is thus important that 
> magnitude and pattern of measurement error are constant (not necessarily 
> small) across groups or components so that observed differences in variance 
> are attributable to biological factors rather than systematic differences in 
> measurement error. Measurement error is most challenging when comparing 
> entire variance-covariance matrices. But again, MANOVA is not the way to 
> assess homogeneity of measurement error across groups.
> 
> If the sample is properly randomized before measurement, it is reasonable to 
> assume that measurement error is approximately uncorrelated with the signal 
> of interest. But there can be exceptions. For instance, younger and smaller 
> individuals can be harder to measure than older and larger individuals. 
> Measurement error can thus correlate with age. I discussed this in 
> Mitteroecker P, Stansfield E (2021) A model of developmental canalization, 
> applied to human cranial form. PLOS Computational Biology 17 (2): e1008381
> 
> Clearly, one can argue that if measurement error is very small, then 
> randomness and homogeneity across groups are less of an issue. But in this 
> case the error really needs to be negligibly small, not just smaller than the 
> individual variation.
> 
> Instead of somewhat meaningless scalar summary statistics (like the F-ratio 
> or some multivariate version of it), I thus prefer an exploratory approach. 
> In the simplest case, a PCA of the data, including the replicated specimens, 
> can show the magnitude and directionality of measurement error in relation to 
> the signal of interest (e.g., group differences, growth trajectories). 
> Measurement error can also be correlated with external variables (e.g., age) 
> or compared among groups, but to my knowledge little work has been done in 
> this direction in geometric morphometrics. Alternatives are 
> errors-in-variables models and structural equation models that implement 
> estimates of measurement error in the first place. 
> 
> Best, 
> 
> Philipp M.
> 
> 
> 
> 
> 
> [email protected] <http://gmail.com/> wrote on Thursday, 3 November 2022 
> at 16:36:21 UTC+1:
>> Dear All, 
>> besides the excellent review by Carmelo, I suggest a few other papers 
>> on ME in geometric morphometrics: 
>> Arnqvist, G., Martensson, T. Measurement error in geometric 
>> morphometrics: empirical strategies to assess and reduce its impact on 
>> measures of shape. Acta Zoologica Academiae Scientiarum Hungaricae, 
>> 1998, 44: 73–96. (A bit outdated but still wonderfully accurate in how 
>> they explain different sources of ME). 
>> Klingenberg, C.P., Barluenga, M., Meyer, A. Shape Analysis of 
>> Symmetric Structures: Quantifying Variation Among Individuals and 
>> Asymmetry. Evolution, 2002, 56: 1909–1920. (From where most of us have 
>> borrowed the protocol for assessing ME). 
>> Viscosi, V., Cardini, A. Leaf Morphology, Taxonomy and Geometric 
>> Morphometrics: A Simplified Protocol for Beginners. PLoS ONE, 2011, 6: 
>> e25630. 
>> Galimberti, F., Sanvito, S., Vinesi, M.C., Cardini, A. “Nose-metrics” 
>> of wild southern elephant seal (Mirounga leonina) males using image 
>> analysis and geometric morphometrics. Journal of Zoological 
>> Systematics and Evolutionary Research, 2019, 57: 710–720. 
>> 
>> There's also another one I like, by the Viennese morphometricians (in 
>> a paper on human mandibles, or teeth, symmetric and asymmetric 
>> variation, if I remember well), but I can't find it now. 
>> 
>> 
>> In general, the idea is that differences among individuals (averaged 
>> replicates) in a representative sample should be larger than 
>> differences between replicates of the same individual (the estimate of 
>> ME). This is what is tested by 'individual' in the Procrustes ANOVA in 
>> MorphoJ. It might be important to control for main effects in the 
>> analysis. For instance, by including species and sex before individual 
>> in the hierarchical analysis, I 'statistically remove' (with some 
>> assumptions) the average effect of these factors before comparing 
>> individual variation to ME, which makes the test more conservative (NB 
>> whether this is OK or not depends on the question one is asking in 
>> her/his study). 
>> For shape data, even if the P value of individual vs residual is 
>> significant, I would not conclude that ME is negligible for sure. I'd 
>> check that the individual Rsq is much larger than the ME (residual) 
>> Rsq and also that shape distances between replicates of the same 
>> individual are smaller than distances among different individuals (if 
>> this is true, replicates should cluster 'within individual' in a UPGMA 
>> phenogram). Then, I feel a bit more confident that ME might be 
>> negligible. 
>> 
>> If ME is large, it may happen that its Rsq is larger than the 
>> individual Rsq (or, which is the same, ME SSQ > individual SSQ). For 
>> the F ratio, however, one should look at the mean SSQ, which takes df 
>> into account. From the MSSQ, one computes F. 
>> The F ratio in MorphoJ employs an isotropic model but, with large 
>> samples (relative to the number of variables), the software also 
>> provides P values using Pillai's trace, which does not depend (if I recall 
>> well!) on an isotropic model. That N is large and the sample 
>> representative is crucial if one is using a subsample in the 
>> assessment of ME to avoid replicate measurements of all individuals, 
>> which would be better but might take too long if one has hundreds or 
>> thousands of individuals. 
>> In R, I generally use adonis that employs an F test (same as in 
>> MorphoJ, for a simple design) but uses permutations instead of 
>> parametric tests. The use of permutations was also suggested as 
>> desirable in Klingenberg et al., 2002. Other packages I suspect might 
>> do something similar, although maybe using different permutational 
>> approaches. I am sure it is explained in their help files. 
>> 
>> Cheers 
>> 
>> Andrea 
>> 
>> On 03/11/2022, ying yi <[email protected] <>> wrote: 
>> > Dear all, 
>> > I used the “procD.lm” function in the geomorph package to test the 
>> > measurement error. I was surprised to find that the within-groups ANOVA 
>> > sum of squares I got was greater than the among-groups ANOVA sum of 
>> > squares. I wonder if something went wrong. What does it mean for the 
>> > “procD.lm” function to get an F value <1? 
>> > I would be very happy if someone could help me. 
>> > Yours, 
>> > Sam 
>> > 
>> > References are as follows: 
>> > 
>> 
>> 
>> -- 
>> E-mail address: [email protected] <>, [email protected] <> 
>> WEBPAGE: https://sites.google.com/view/alcardini2/ 
>> or https://tinyurl.com/andreacardini 
> 
> 
