Related: our study about landmark reliability
ERCAN I, Ocakoglu G, Guney I, Yazici B. Adaptation of Generalizability Theory for Inter-Rater Reliability for Landmark Localization. International Journal of Tomography & Statistics, 2008 Jun, Vol. 9, No. S08: 51-58.

You can compute the G coefficient from the link below:
http://www20.uludag.edu.tr/~biostat/landmark_reliability/G_coefficient.html

Best regards,

Prof. Dr. Ilker ERCAN
http://biyoistatistik.uludag.edu.tr/ercani.htm
Uludag University Medical Faculty
Department of Biostatistics
Bursa/TÜRKİYE

________________________________
From: [email protected] <[email protected]> on behalf of alcardini <[email protected]>
Sent: Thursday, 10 November 2022 10:33
To: morphmet2 <[email protected]>
Subject: RE: [MORPHMET2] Measurement error in geometric morphometrics

This is just to be clear that 1/30 is no rule of thumb. I took it, as an example, from the data I am analysing right now.

Worrying about biases is important. That's why I worry a lot about the bias introduced by Procrustes superimposition and sliding when people subset landmarks (the so-called "within-a-configuration" methods) or, even worse, run per-landmark analyses. That is a different source of error (not ME), but a problem that is present all the time in those analyses. Yet they remain popular.

Sorry, Mike, if you felt offended by my answer.

Have a nice day.

Cheers

Andrea

On Wed, 9 Nov 2022 at 21:19, Mike Collyer <[email protected]> wrote:

> Well, Andrea, it appears that your empirical optimism trumps my theoretical cynicism! I probably could have chosen better shapes — I wanted a simple example that seemed to comply with your 1/30th Rsq rule of thumb — used different sample sizes, had more “species” than a square and a rectangle, made sure the Euclidean distance in tangent space tracked Procrustes distances better, assured homogeneity of variance, and, only after all of that, illustrated that systematic measurement error persisted even if it was apparently subsumed by a small Rsq in an ANOVA.
> (But the Residual Rsq might be higher, which I assume, after you proposed 1/30th of the individual Rsq, would not be consistent with the level of shape variation you felt was warranted. Hence the extreme simulation. I did use smaller disparity in shape and the pattern holds.) The point was not to find an infallible example but to show that (1) systematic measurement error, even if apparently small, can still be a problem, and (2) the systematic bias can align with other signals, something that would not be picked up in an ANOVA table.
>
> I’ll offer an additional example based on real experience in my lab, as something I hope is a bit of an allegory (although I sure don’t try to persuade you, Andrea — others might be interested). I once had a cadre of students and we needed to photograph and digitize thousands of fish specimens from museum collections. Before embarking, I had the students digitize the same set of small, minnow-like fish, combined the landmark data, and looked for systematic biases in digitization style via GPA and then PC plots. Sure enough, students tended to have replicated shifts of points in the plot, due to slight variations in style. (It would not have been easy to wait until we had thousands of photographs taken and then randomize images to attempt to force the systematic bias to behave more like random error, although randomization like this would be ideal. Based on schedules and the need for some of the same people to photograph and digitize, the work had to be processed more in batches.) We identified tendencies and worked with students until the measurement error looked more random. That was comforting, but because the individuals were all small fish from just a couple of species, the Rsq remained pretty high.
> If, as an alternative, we had been performing a larger macroevolutionary study and included in our systematic-bias experiment fish of vastly different shapes, maybe ME would be so small that, based on your argument, we shrug our shoulders and move on. But now if student A digitized several species that were actually similar to species student B digitized, in the same clade, even if the shape variation among their combined species was small compared to the larger sample that comprised many different species and different clades, is this okay? If we wanted, perhaps, to measure evolutionary rates, would it not be a problem that systematic biases only affected a specific clade of similarly shaped fishes? I would argue that we could potentially under- or overestimate evolutionary rates, simply because in the pilot test we based our evaluation on an Rsq value that obscured the systematic ME.
>
> I know you would probably be cognizant of such things and use different samples to align with your focus, but I think it is generally more important that we do not lose the theoretical forest for the empirical trees. Acknowledging a systematic bias and ignoring it is one thing. Not recognizing it is another. Failing to consider analyses that might help one understand whether ME has a systematic signal (rather than just not caring whether the signal is systematic or random) would be unfortunate. So if somebody has an empirical data set that not only has minnows but maybe also some puffers, sharks, eels, and ocean sunfish, I sure hope they would not scoff at ME because the Rsq for repeated measures is small. Sure, the diagnostic steps (plus others) that you performed should be done, but not doing those things in the example I provided does not create suspicion that the correlated shifts in position in the PC plot are spurious; they reflected what was simulated.
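[Editor's note: the replicated-shift screening described above (digitize the same specimens twice, superimpose, and look for consistent displacements between replicates) can be sketched numerically. The following is a minimal Python illustration with made-up numbers, not the authors' code: two hypothetical digitizers record the same specimens, and digitizer B carries a constant rightward shift on two landmarks. For simplicity the sketch only centers the configurations instead of running a full GPA, which suffices here because the simulation varies neither rotation nor scale.]

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 20, 4                                   # specimens, landmarks (2D)
template = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], float)

# Two digitizers, same specimens; B has a hypothetical "style" bias:
# a constant 1.5-unit rightward shift on landmarks 1 and 3.
specimens = template + rng.normal(0, 1.0, (n, k, 2))
bias = np.zeros((k, 2))
bias[[1, 3], 0] = 1.5
dig_A = specimens + rng.normal(0, 0.1, (n, k, 2))          # random error only
dig_B = specimens + bias + rng.normal(0, 0.1, (n, k, 2))   # bias + random error

# Center each configuration (translation only; see note above).
dig_A = dig_A - dig_A.mean(axis=1, keepdims=True)
dig_B = dig_B - dig_B.mean(axis=1, keepdims=True)

# Mean replicate displacement per landmark: random ME averages toward
# zero, while a systematic bias leaves a consistent nonzero vector.
shift = (dig_B - dig_A).mean(axis=0)
print(np.linalg.norm(shift, axis=1))
# Every landmark shows a shift of about 0.75, not just the two biased
# ones, because centering spreads the displacement across the whole
# configuration (full superimposition would redistribute it too).
```

A permutation test on these per-specimen difference vectors (shuffling digitizer labels within specimens) would turn this eyeball check into a formal one.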
> By the way, the heterogeneity in variance between squares and rectangles comes from scaling to unit size configurations that were of much different size but simulated with the same level of sd at the points. If I had not been working quickly, I might have thought about that and made the templates more similar in size, or varied the simulation sd in proportion to the object size. I hope you are not implying that, by not doing that, the simulated bias is invalidated.
>
> Cheers!
> Mike
>
> On Nov 9, 2022, at 1:27 PM, alcardini <[email protected]> wrote:
>
> Dear Mike,
> thanks for the interesting example.
> My answers:
>
> 1) Before worrying about ME, if I had those data, I'd worry about the poor tangent space approximation.
> <image.png>
>
> 2) I'd also worry that, for comparing groups (squares vs rectangles), all the tests I use require homogeneity of variance and covariance. Variance (the trace of the covariance matrix) in the squares is ca. 50 times larger than in the rectangles (when I re-run your script). One sees that also in the PC1-PC2 scatterplot, if you color the groups:
> <image.png>
>
> 3) I would worry much less about the bias in this specific example. If the hypothesis I am testing is group differences, yes, the bias slightly inflates the differences. Does that change my conclusions? I'd say no, because the differences are so huge that the groups are perfectly separated regardless of the bias
> <image.png>
> and in the visualization the mean of the squares looks like approximately a square:
> <image.png>
> and the mean of the rectangles looks like a very elongated rectangle:
> <image.png>
> as in the model that generated the data.
>
> It is a nice example. I would not be happy about the bias. But where you see the glass half empty, I see it almost completely full. The issue of biases in ME is important.
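[Editor's note: Andrea's second diagnostic, comparing total variance between groups as the trace of each group's covariance matrix, is easy to reproduce in miniature. The Python sketch below uses hypothetical templates mimicking the thread's square and 50 x 5 rectangle; it shows how scaling configurations of very different size to unit centroid size, while simulating the same landmark sd, manufactures heterogeneous group variances. Rotation alignment is omitted, so the exact ratio will differ from a full Procrustes fit (the thread reports ca. 50x; this simplified version gives roughly an order of magnitude).]

```python
import numpy as np

rng = np.random.default_rng(2)

def noisy_unit_size(template, n=50, sd=1.0):
    """n noisy digitizations, centered and scaled to unit centroid size."""
    configs = template + rng.normal(0, sd, (n,) + template.shape)
    configs = configs - configs.mean(axis=1, keepdims=True)
    cs = np.sqrt((configs ** 2).sum(axis=(1, 2)))      # centroid size
    return configs / cs[:, None, None]

square = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], float)
rect = np.array([[0, 0], [50, 0], [0, 5], [50, 5]], float)

def total_variance(group):
    # Total variance = trace of the covariance matrix of the
    # flattened (specimen x coordinate) data, Andrea's diagnostic.
    return np.trace(np.cov(group.reshape(len(group), -1), rowvar=False))

ratio = total_variance(noisy_unit_size(square)) / total_variance(noisy_unit_size(rect))
print(ratio)   # well above 1: the same absolute landmark sd is
               # relatively much larger for the small square template
```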
> I do look forward to reading papers that develop and detail methods to assess them (including how biases affect the assumptions of the models, not just the specific hypothesis being tested).
>
> For now, my main concern remains whether one has a flaw in the experimental design and a relevant source of ME goes undetected. Probably we should spend more time on this and develop checklists and protocols that help in the most common cases.
>
> Cheers
>
> Andrea
>
> On Tue, 8 Nov 2022 at 19:16, Mike Collyer <[email protected]> wrote:
>
>> Dear Andrea,
>>
>> I have to argue against one of your points.
>>
>> Nevertheless, I could miss a bias, but if ME has an Rsq of, say, less than 1/30 of individual variation within species, when I test species the bias will be negligible. This is, if I am correct, what you implied when you wrote that "one can argue that if measurement error is very small, then randomness and homogeneity across groups are less of an issue”.
>>
>> If we come full circle to Philipp’s first point — that choice of individuals can mislead one’s interpretation — I believe it is dangerous to use a value of Rsq to conclude that systematic ME (bias) is negligible. I hope I can demonstrate this with an example (in R).
>>
>> To set this up, I create 10 shapes based on a template that is a square. I then add a digitizing bias by shifting two of the four landmarks (plus some random error).
>>
>> > # Create 10 specimens
>> >
>> > coords1 <- lapply(1:10, function(.)
>> + mat + rnorm(8, sd = 1))
>> >
>> > # Add digitizing bias for each, shifting two landmarks a little right
>> > # plus add a little random error
>> >
>> > coords2 <- lapply(coords1, function(x)
>> + x + matrix(c(0, 0, 1.5, 0, 0, 0, 1.5, 0), 4, 2, byrow = T) + rnorm(8, sd = 0.1))
>> >
>> > # string together and test for ME
>> >
>> > lmks <- simplify2array(c(coords1, coords2))
>> > GPA <- gpagen(lmks, print.progress = FALSE)
>> > ind <- factor(c(rep(1:10, 2)))
>> > summary(procD.lm(coords ~ ind, data = GPA))
>>
>> Analysis of Variance, using Residual Randomization
>> Permutation procedure: Randomization of null model residuals
>> Number of permutations: 1000
>> Estimation method: Ordinary Least Squares
>> Sums of Squares and Cross-products: Type I
>> Effect sizes (Z) based on F distributions
>>
>>           Df      SS       MS     Rsq    F      Z Pr(>F)
>> ind        9 1.54733 0.171926 0.94906 20.7 5.5944  0.001 **
>> Residuals 10 0.08306 0.008306 0.05094
>> Total     19 1.63039
>> ---
>> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>> Call: procD.lm(f1 = coords ~ ind, data = GPA)
>>
>> If we plot PC scores, the systematic bias is obvious:
>>
>> > # plot PC scores, with lines showing systematic ME
>> >
>> > PCA <- gm.prcomp(GPA$coords)
>> > plot(PCA, pch = 19, asp = 1, col = rep(1:2, each = 10))
>> >
>> > for(i in 1:10) {
>> +   points(rbind(PCA$x[i,], PCA$x[10 + i,]),
>> +          type = "l",
>> +          lty = 3)
>> + }
>>
>> <PastedGraphic-1.tiff>
>>
>> So one might see the bias in the plot, and the 5% ME — if we want to call it that based on the Rsq in the ANOVA — might be too high for one’s comfort. But now let's repeat the process on 10 specimens using, instead of a square template, a long rectangle.
>>
>> > # Now add some more individuals to the mix, perhaps from
>> > # a much differently shaped species (long rectangle, not square)
>> > # using the same strategy
>> >
>> > mat3 <- matrix(c(0, 0, 50, 0, 0, 5, 50, 5), 4, 2, byrow = T)
>> > coords3 <- lapply(1:10, function(.)
>> + mat3 + rnorm(8, sd = 1))
>> > coords4 <- lapply(coords3, function(x)
>> + x + matrix(c(0, 0, 1.5, 0, 0, 0, 1.5, 0), 4, 2, byrow = T) + rnorm(8, sd = 0.1))
>> >
>> > lmks <- simplify2array(c(coords1, coords2, coords3, coords4))
>> > GPA <- gpagen(lmks, print.progress = FALSE)
>> > ind <- factor(c(rep(1:10, 2), rep(11:20, 2)))
>> > summary(procD.lm(coords ~ ind, data = GPA))
>>
>> Analysis of Variance, using Residual Randomization
>> Permutation procedure: Randomization of null model residuals
>> Number of permutations: 1000
>> Estimation method: Ordinary Least Squares
>> Sums of Squares and Cross-products: Type I
>> Effect sizes (Z) based on F distributions
>>
>>           Df     SS       MS     Rsq     F      Z Pr(>F)
>> ind       19 4.9087 0.258351 0.98567 72.39 8.8918  0.001 **
>> Residuals 20 0.0714 0.003569 0.01433
>> Total     39 4.9801
>> ---
>> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>> Call: procD.lm(f1 = coords ~ ind, data = GPA)
>>
>> > PCA <- gm.prcomp(GPA$coords)
>> > P <- plot(PCA, pch = c(rep(19, 20), rep(20, 20)), asp = 1,
>> +          col = rep(rep(1:2, each = 10), 2))
>> >
>> > for(i in 1:10) {
>> +   points(rbind(PCA$x[i,], PCA$x[10 + i,]),
>> +          type = "l",
>> +          lty = 3)
>> + }
>>
>> <PastedGraphic-2.tiff>
>>
>> Note that the corresponding 10 vectors are shown in this PC plot as in the first, but 20 more values have been added (the cluster of points to the right). The mean is no longer the mean of 20 square-like shapes, but is the mean of 40 rectangles, with the square-like shapes now having negative PC scores in the plot. Square shapes and long-rectangle shapes are clearly separated in this plot. Here is a transformation grid (scaled 1x) for the approximate middle of the points on the left:
>>
>> <PastedGraphic-3.png>
>>
>> and the same for the cluster of points on the right:
>>
>> <PastedGraphic-4.png>
>>
>> But let’s pay attention to the same 20 configurations in both plots.
>> Now the systematic ME is clearly associated with the first PC, which also represents more of the overall shape variation, and the signal remains even though the ANOVA results suggest this is no big deal (1.4% of variation). Worse, the bias now appears to be associated with, e.g., species differences.
>>
>> The bias in this example did not become negligible in spite of changing the sample, and in spite of a conclusion to the contrary that might be made from the ANOVA results. Again, evaluating the relative portion of variance explained (especially if based on the dispersion of points alone) is dangerous, and a comforting statistic should not be sufficient evidence not to worry about a systematic measurement error.
>>
>> Best,
>> Mike
>
> --
> E-mail address: [email protected], [email protected]
> WEBPAGE: https://sites.google.com/view/alcardini2/
> or https://tinyurl.com/andreacardini

--
E-mail address: [email protected], [email protected]
WEBPAGE: https://sites.google.com/view/alcardini2/
or https://tinyurl.com/andreacardini

--
You received this message because you are subscribed to the Google Groups "Morphmet" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
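[Editor's note: the central claim of Mike's example, that the same absolute measurement error yields a comfortingly tiny Rsq once very disparate shapes enter the sample, can be sketched without any morphometrics at all as a univariate variance decomposition. The Python sketch below uses entirely hypothetical numbers; each "specimen" is measured twice with an identical error sd, and only the disparity of the sample changes.]

```python
import numpy as np

rng = np.random.default_rng(3)

def residual_rsq(group_means, n=10, ind_sd=1.0, me_sd=0.3):
    """ME (residual) Rsq from a one-way 'individual' ANOVA on a single
    trait: each specimen is measured twice with the SAME error sd."""
    true = np.concatenate([m + rng.normal(0, ind_sd, n) for m in group_means])
    reps = true[:, None] + rng.normal(0, me_sd, (true.size, 2))
    ss_res = ((reps - reps.mean(axis=1, keepdims=True)) ** 2).sum()
    ss_tot = ((reps - reps.mean()) ** 2).sum()
    return ss_res / ss_tot

r_one = residual_rsq([0.0])           # one "species": modest disparity
r_many = residual_rsq([0.0, 20.0])    # add a very different "species"
print(r_one, r_many)
# r_many is far smaller than r_one, yet the measurement error sd is
# identical: the denominator grew, not the error.
```

The absolute error never changed between the two runs; only the yardstick did, which is exactly why a small ME Rsq alone cannot certify that a systematic bias is negligible.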
To view this discussion on the web visit https://groups.google.com/d/msgid/morphmet2/AM9P193MB101691B69AC8287C762C26F2D7019%40AM9P193MB1016.EURP193.PROD.OUTLOOK.COM.
