Thank you Jim and Fred for your stimulating thoughts. Let me try to sketch a pragmatist and interventionist (in the sense of, e.g., Judea Pearl) approach to defining measurement error, and outline some practical consequences.
1) A measurement is a function of the studied object, the observer, various measurement devices and conditions, and perhaps other, unknown factors. Obviously, the concept of a true value underlying an empirical measurement is problematic because it is per se unobservable and does not even exist independently of the measurement conditions. If the concept of a true value has to be given up, then the concept of a deviation from that true value - the absolute measurement error - also ceases to be of practical use.

2) In general, we can conceptualize the causal effect of a factor as the effect that an intervention on this factor has. For instance, we can observe the causal effect of sunlight on a plant by growing the plant under different light conditions. Clearly, no plant can grow without any light, and hence the effect of the sun cannot be measured, or even thought of, as a deviation from the point of complete darkness. Similarly, in classical genetics the effect of a gene can only be described as the average difference between organisms that possess different alleles of this gene. We can only observe the effect of allele substitutions, not the "absolute" effect of a gene or allele (even though this idea is still repeatedly invoked). In the same way, we can think of the inter-observer error as the (average) difference in a measurement when changing the observer. Similarly, the error resulting from a measurement device is the effect that a change of the device would have, and so on. In the absence of a better word, let's call this the "attributable" error (or, more generally, attributable factors) because we can attribute it, empirically or at least conceptually, to one or more specific origins. We can then treat and analyze it like any other covariate. Specifically, we can - empirically or theoretically - assess whether such a factor is "random" with regard to the signal of interest or whether it correlates with the signal, and we may be able to correct for it.
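To make point 2 concrete, here is a toy numerical sketch (in Python rather than the R used later in this thread; the observer effect of 1.5, the noise levels, and the sample size are all invented for illustration). The inter-observer error is defined purely as the average difference produced by intervening on the observer, with no reference to a true value:

```python
import random

random.seed(1)

# Toy model (all numbers are assumptions for illustration):
# measurement = specimen signal + observer effect + unattributable noise.
OBSERVER_B_EFFECT = 1.5   # hypothetical systematic inter-observer difference

def measure(signal, observer):
    # The "signal" itself is never observed directly, only measurements of it.
    effect = OBSERVER_B_EFFECT if observer == "B" else 0.0
    return signal + effect + random.gauss(0, 0.3)

# 2000 specimens with some underlying signal
signals = [random.gauss(10, 2) for _ in range(2000)]

# Intervention: measure each specimen under observer A, then change only
# the observer to B.  The attributable (inter-observer) error is the
# average difference produced by that intervention.
diffs = [measure(s, "B") - measure(s, "A") for s in signals]
attributable = sum(diffs) / len(diffs)
print(round(attributable, 1))  # recovers roughly the 1.5 that was put in
```

The point of the sketch is only that the quantity being estimated is a difference between two measurement conditions, not a deviation from an unobservable true value.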
3) Influences on the measurements that we cannot attribute to any origin, not even in theory - because we have no idea what these origins are, or because they are entirely stochastic - may be called "unattributable" error. There is no way of assessing the origin and directionality of this kind of error because we have no way of intervening on the underlying factors. We can only repeat the measurements under conditions as constant as possible (same observer, same device, same environment, etc.) and record the deviations among the repeated measurements. Unattributable error may or may not relate to the signal of interest and thus may or may not average out. Either way, we cannot find out.

4) Attributable error, unattributable error, and the average effect of attributable error only partly resemble the concepts of systematic error, random error, and bias. Attributable and unattributable error are not defined ontologically as deviations from a true value but as functions of an intervention. Both may or may not be random, but we can only find out for the attributable error. This has practical implications: the best way to deal with measurement error is to record the potential origins of the attributable error (observer, device, environmental factors, etc.) and include them in the statistical analysis (e.g., the observer as a covariate in a regression analysis). No repeated measurements are necessary to deal with attributable error if the sample size is large enough to estimate the various factors in the model. Systematic effects of attributable error may be reducible by an appropriate experimental design (keeping potential factors constant or randomizing them). For studies of mean effects (regression, group mean comparison) it is impossible to estimate whether unattributable error has a systematic effect, hence repeated measurements are of no help here either. The only use of repeated measurements is to argue that unattributable error is small enough to be negligible.
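The recipe in point 4 - record the origin of the attributable error and include it as a covariate, with no repeated measurements at all - can be sketched as follows (again a toy Python illustration with invented numbers; each specimen is measured exactly once, and the observer is deliberately confounded with the group of interest, the worst case for a naive comparison):

```python
import random

random.seed(2)

# Invented setup: two groups differing by 2.0 units; observer "B" adds a
# systematic 1.5; each specimen is measured ONCE (no repeats), and the
# observer is deliberately correlated with group membership.
GROUP_EFFECT, OBSERVER_EFFECT = 2.0, 1.5

rows = []
for i in range(2000):
    group = i % 2
    observer = group if random.random() < 0.8 else 1 - group  # confounding
    y = (10 + GROUP_EFFECT * group + OBSERVER_EFFECT * observer
         + random.gauss(0, 0.5))
    rows.append((group, observer, y))

# A naive group comparison absorbs part of the observer effect ...
mean = lambda v: sum(v) / len(v)
naive = (mean([y for g, _, y in rows if g == 1])
         - mean([y for g, _, y in rows if g == 0]))

def ols(X, y):
    """Tiny OLS fit via the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    for i in range(k):
        p = A[i][i]
        A[i] = [v / p for v in A[i]]
        b[i] /= p
        for r in range(k):
            if r != i:
                f = A[r][i]
                A[r] = [v - f * w for v, w in zip(A[r], A[i])]
                b[r] -= f * b[i]
    return b

# ... whereas a regression with the observer as a covariate separates the
# group effect from the inter-observer effect, using single measurements.
X = [[1.0, g, o] for g, o, _ in rows]
y = [m for _, _, m in rows]
_, group_hat, observer_hat = ols(X, y)
print(round(naive, 1), round(group_hat, 1), round(observer_hat, 1))
```

In this sketch the naive group difference is inflated (it should come out near 2.9 rather than 2.0), while the model with the observer covariate recovers both the group effect and the inter-observer effect without any repeated measurements.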
Unattributable error can be reduced by careful measurements, stable conditions, clear definitions, etc., but it can also be reduced by attributing it to an origin. If we record as many potential sources of error (factors of variation) as possible, such as observer, measurement time, place, and conditions, and include them as covariates, we may be able to reduce the unattributable error.

In summary, my feeling is that repeated measurements are a bit overrated. Careful experimental design and variable selection are more useful than repeated measurements for producing replicable results and for relating these results to (and achieving consilience with) other studies conducted under different conditions. As noted in a previous posting, I think the situation is somewhat different in studies that compare variances among groups and in studies that ordinate or classify single individuals, where it might be necessary to estimate or represent both attributable and unattributable error based on repeated measurements.

Hope that didn't sound too strange!
Best,
Philipp

Carmelo Fruciano wrote on Sunday, 20 November 2022 at 15:09:54 UTC+1:
> Well, I did find very interesting and intellectually stimulating the
> point you raised when you wrote
> "an investigator who systematically measures something a little
> differently than intended or at least differently from other
> investigators working on the same or similar material. They are
> effectively measuring a different variable.".
> But I felt there was a somewhat "personal-philosophical" component
> attached to it, a potential "can of worms", in fact. That is partly why
> I didn't fully engage with it.
>
> Clearly, the perspective suggested by your text, and later discussed
> quite eloquently, is quite distinct from the (usual?) one that a true
> value does exist and that one tries to approximate it. The repercussions
> of taking one perspective or the other are fairly obvious, as you
> correctly point out.
> To add another example, if one takes the
> perspective that true values do exist and two observers measure
> systematically different values, then it follows that at least one of
> the two observers is biased.
>
> Personally, while I find the topic intellectually stimulating, I wonder
> what its practical ramifications are.
> I like your idea of abandoning (at least in many/most cases) the generic
> wording "bias". This is because, even following the perspective that
> true values do exist, in most cases they cannot be known. Replacing it
> with "neutral" language which merely states the facts (e.g., "systematic
> difference", as I offered) may be an operational - if philosophically
> weak - way to avoid engaging with the "philosophical" issue while
> retaining practical/descriptive value (e.g., "there is a systematic
> difference between my two observers; how - if at all - can I combine
> data obtained by them?").
>
> I hope the above, while long and involved, clarifies why I suggested
> fairly obvious/"weak"/neutral wording.
> Best,
> Carmelo
>
> On 20/11/2022 2:20 pm, [email protected] wrote:
> > Yes, but perhaps it does not go far enough to reveal the problem. I like
> > Fred's point about there being no true value - well, at least not until
> > one has a precise definition of what one is digitizing or measuring. I
> > once (in the early 1960s) thought mosquito wings were easy material to
> > work with because it was so easy to digitize the point at which veins
> > intersected or branched. But then, when higher resolution was used, such
> > points became ambiguous. I remember a colleague (I believe in
> > Connecticut; sorry, but I forget his name) who told students to visualize
> > the veins as roads and then use as a landmark the location where you
> > would imagine a traffic policeman would stand to direct traffic.
> > A rule
> > that may have helped repeatability for one person, but probably not among
> > researchers from countries where they drive on a different side of the
> > road.
> >
> > Fred, of course, made one of my points more elegantly. It is not useful
> > to talk about the statistical term "bias" if there is no single true
> > value being estimated for the organisms being studied. The terms
> > "measurement error" or "digitizing error" don't seem to really capture
> > the fundamental problem either, though they seem good relative to
> > particular digitizing or measuring procedures at not too high a
> > resolution. Reminds me of some descriptions of quantum physics! Perhaps
> > we are pushing this point too far?
> >
> > F. James Rohlf
> > Distinguished Professor, Emeritus and Research Professor
> > Depts: Anthropology and Ecology & Evolution
> > Stony Brook University
> >>
> >> On 11/20/2022 5:31:00 AM, Carmelo Fruciano <[email protected]> wrote:
> >>
> >> On 18/11/2022 5:06 pm, 'F. James Rohlf' via Morphmet wrote:
> >> > I wonder whether it would help to be more strict about the use of the
> >> > word "bias". There is the statistical meaning, where there is a problem
> >> > with the statistical estimate being used. One must treat
> >> > and correct for that differently than if the problem is that the
> >> > investigator is making the measurements themselves incorrectly.
> >> >
> >> > With a statistic one can investigate properties assuming various
> >> > statistical distributions. Not sure how to investigate theoretically
> >> > the effect of an investigator who systematically measures something a
> >> > little differently than intended, or at least differently from other
> >> > investigators working on the same or similar material. They are
> >> > effectively measuring a different variable. Suggestions for a
> >> > different word?
> >>
> >> Hi Jim and all,
> >> I've been following the discussion and the several interesting points
> >> which have been raised thus far.
> >> About wording: in my mind, "systematic differences" is probably quite
> >> "neutral" (and current) wording to describe differences between
> >> operators, devices, or other sources which produce a variation in the
> >> multivariate mean. As others have suggested, depending on the context,
> >> other, less neutral wording may also be appropriate.
> >> Best,
> >> Carmelo
> >>
> >> --
> >> ==================
> >> Carmelo Fruciano
> >> Italian National Research Council (CNR)
> >> IRBIM Messina
> >> http://www.fruciano.org/
> >> ==================
> >>
> >> > -------- Original message --------
> >> > From: Mike Collyer
> >> > Date: 11/8/22 1:16 PM (GMT-05:00)
> >> > To: andrea cardini
> >> > Cc: [email protected]
> >> > Subject: Re: [MORPHMET2] Measurement error in geometric morphometrics
> >> >
> >> > Dear Andrea,
> >> >
> >> > I have to argue against one of your points.
> >> >>
> >> >> Nevertheless, I could miss a bias, but if ME has an Rsq of, say, less
> >> >> than 1/30 of individual variation within species, when I test species
> >> >> the bias will be negligible. This is, if I am correct, what you
> >> >> implied when you wrote that "one can argue that if measurement error
> >> >> is very small, then randomness and homogeneity across groups are less
> >> >> of an issue".
> >> >
> >> > If we come full circle to Philipp's first point - that the choice of
> >> > individuals can mislead one's interpretation - I believe it is
> >> > dangerous to use a value of Rsq to conclude that systematic ME (bias)
> >> > is negligible. I hope I can demonstrate this with an example (in R).
> >> >
> >> > To set this up, I create 10 shapes based on a template that is a
> >> > square. I then add a digitizing bias by shifting two of the four
> >> > landmarks (plus some random error).
> >> >
> >> > > # Create 10 specimens
> >> > >
> >> > > coords1 <- lapply(1:10, function(.) mat + rnorm(8, sd = 1))
> >> > >
> >> > > # Add digitizing bias for each, shifting two landmarks a little right,
> >> > > # plus add a little random error
> >> > >
> >> > > coords2 <- lapply(coords1, function(x)
> >> > +   x + matrix(c(0, 0, 1.5, 0, 0, 0, 1.5, 0), 4, 2, byrow = T) +
> >> > +   rnorm(8, sd = 0.1))
> >> > >
> >> > > # string together and test for ME
> >> > >
> >> > > lmks <- simplify2array(c(coords1, coords2))
> >> > > GPA <- gpagen(lmks, print.progress = FALSE)
> >> > > ind <- factor(c(rep(1:10, 2)))
> >> > > summary(procD.lm(coords ~ ind, data = GPA))
> >> >
> >> > Analysis of Variance, using Residual Randomization
> >> > Permutation procedure: Randomization of null model residuals
> >> > Number of permutations: 1000
> >> > Estimation method: Ordinary Least Squares
> >> > Sums of Squares and Cross-products: Type I
> >> > Effect sizes (Z) based on F distributions
> >> >
> >> >           Df      SS       MS     Rsq    F      Z Pr(>F)
> >> > ind        9 1.54733 0.171926 0.94906 20.7 5.5944  0.001 **
> >> > Residuals 10 0.08306 0.008306 0.05094
> >> > Total     19 1.63039
> >> > ---
> >> > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >> >
> >> > Call: procD.lm(f1 = coords ~ ind, data = GPA)
> >> >
> >> > If we plot PC scores, the systematic bias is obvious:
> >> >
> >> > > # plot PC scores, with lines showing systematic ME
> >> > >
> >> > > PCA <- gm.prcomp(GPA$coords)
> >> > > plot(PCA, pch = 19, asp = 1, col = rep(1:2, each = 10))
> >> > >
> >> > > for(i in 1:10) {
> >> > +   points(rbind(PCA$x[i,], PCA$x[10 + i,]),
> >> > +          type = "l",
> >> > +          lty = 3)
> >> > + }
> >> >
> >> > [attachment: PastedGraphic-1.tiff (PC score plot)]
> >> >
> >> > So one might see the bias in the plot, and the 5% ME - if we want to
> >> > call it that based on the Rsq in the ANOVA - might be too high for
> >> > one's comfort.
> >> > But now let's repeat the process on 10 specimens using, instead of a
> >> > square template, a long rectangle.
> >> >
> >> > > # Now add some more individuals to the mix, perhaps from
> >> > > # a much differently shaped species (long rectangle, not square),
> >> > > # using the same strategy
> >> > >
> >> > > mat3 <- matrix(c(0, 0, 50, 0, 0, 5, 50, 5), 4, 2, byrow = T)
> >> > > coords3 <- lapply(1:10, function(.) mat3 + rnorm(8, sd = 1))
> >> > > coords4 <- lapply(coords3, function(x)
> >> > +   x + matrix(c(0, 0, 1.5, 0, 0, 0, 1.5, 0), 4, 2, byrow = T) +
> >> > +   rnorm(8, sd = 0.1))
> >> > >
> >> > > lmks <- simplify2array(c(coords1, coords2, coords3, coords4))
> >> > > GPA <- gpagen(lmks, print.progress = FALSE)
> >> > > ind <- factor(c(rep(1:10, 2), rep(11:20, 2)))
> >> > > summary(procD.lm(coords ~ ind, data = GPA))
> >> >
> >> > Analysis of Variance, using Residual Randomization
> >> > Permutation procedure: Randomization of null model residuals
> >> > Number of permutations: 1000
> >> > Estimation method: Ordinary Least Squares
> >> > Sums of Squares and Cross-products: Type I
> >> > Effect sizes (Z) based on F distributions
> >> >
> >> >           Df     SS       MS     Rsq     F      Z Pr(>F)
> >> > ind       19 4.9087 0.258351 0.98567 72.39 8.8918  0.001 **
> >> > Residuals 20 0.0714 0.003569 0.01433
> >> > Total     39 4.9801
> >> > ---
> >> > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >> >
> >> > Call: procD.lm(f1 = coords ~ ind, data = GPA)
> >> >
> >> > > PCA <- gm.prcomp(GPA$coords)
> >> > > P <- plot(PCA, pch = c(rep(19, 20), rep(20, 20)), asp = 1,
> >> > +   col = rep(rep(1:2, each = 10), 2))
> >> > >
> >> > > for(i in 1:10) {
> >> > +   points(rbind(PCA$x[i,], PCA$x[10 + i,]),
> >> > +          type = "l",
> >> > +          lty = 3)
> >> > + }
> >> >
> >> > [attachment: PastedGraphic-2.tiff (PC score plot with both species)]
> >> >
> >> > Note that the corresponding 10 vectors are shown in this PC plot as in
> >> > the first, but 20 more values have been added (the cluster of points to
> >> > the right).
> >> > The mean is no longer the mean of 20 square-like shapes,
> >> > but the mean of all 40 configurations, with the square-like shapes now
> >> > having negative PC scores in the plot. Square shapes and long rectangle
> >> > shapes are clearly separated in this plot. Here is a transformation
> >> > grid (scaled 1x) for the approximate middle of the points on the left:
> >> >
> >> > [attachment: PastedGraphic-3.png (transformation grid, left cluster)]
> >> >
> >> > and the same for the cluster of points on the right:
> >> >
> >> > [attachment: PastedGraphic-4.png (transformation grid, right cluster)]
> >> >
> >> > But let's pay attention to the same 20 configurations in both plots.
> >> > Now the systematic ME is clearly associated with the first PC, which
> >> > also represents more of the overall shape variation, and the signal
> >> > remains even though the ANOVA results suggest this is no big deal
> >> > (1.4% of the variation). Worse, the bias now appears to be associated
> >> > with, e.g., species differences.
> >> >
> >> > The bias in this example did not become negligible in spite of
> >> > changing the sample, and in spite of a conclusion to the contrary that
> >> > might be made from the ANOVA results. Again, evaluating the relative
> >> > portion of variance explained (especially if based on the dispersion
> >> > of points alone) is dangerous, and a comforting statistic should not
> >> > be sufficient evidence to not worry about a systematic measurement
> >> > error.
> >> >
> >> > Best,
> >> > Mike
> >> >
> >> > --
> >> > You received this message because you are subscribed to the Google
> >> > Groups "Morphmet" group.
> >> > To unsubscribe from this group and stop receiving emails from it, send
> >> > an email to [email protected].
> >> > To view this discussion on the web visit
> >> > https://groups.google.com/d/msgid/morphmet2/C30FAD86-E64E-4AEB-8B8C-041768B131D8%40gmail.com.
>
> --
> ==================
> Carmelo Fruciano
> Italian National Research Council (CNR)
> IRBIM Messina
> http://www.fruciano.org/
> ==================

--
You received this message because you are subscribed to the Google Groups "Morphmet" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/morphmet2/f1fe9a80-a708-4000-b12f-44de9502fcabn%40googlegroups.com.
