I generally agree with Philipp, who provided a much deeper discussion of the issue of ME.
In fact, a careful reading shows that:
1) I am not advising always controlling for main effects (precisely for the reasons he mentioned):
"I 'statistically remove' (with some
assumptions) the average effect of these factors before comparing
individual variation to ME, which makes the test more conservative (NB
whether this is OK or not it depends on the question one is asking in
her/his study)."
2) I did not suggest relying much on P values:
"even if the P value of individual vs residual is
significant, I would not conclude that ME is negligible for sure. I'd
check that the individual Rsq is much larger than the ME".
3) I also like exploratory approaches: phenograms but also PCAs.


I really like the idea of testing that the error is uncorrelated with the factors I am studying, which for me is the most important point Philipp made (and one I had missed). If a study is asking a very specific question (say, taxonomic differences), that might be easier than if there are multiple factors (sex, age, allometry, covariates etc.). I'd definitely like to see examples.
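As a toy illustration of this check (simulated data; "age" is just a hypothetical covariate), one can estimate per-specimen error from replicate pairs and correlate its magnitude with the factor of interest:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
age = rng.uniform(1, 10, n)                     # hypothetical covariate
true_shape = 0.5 * age + rng.normal(0, 1, n)    # signal correlated with age
# two replicate measurements per specimen; error independent of age
rep1 = true_shape + rng.normal(0, 0.3, n)
rep2 = true_shape + rng.normal(0, 0.3, n)
err_mag = np.abs(rep1 - rep2)                   # per-specimen error estimate
r = np.corrcoef(err_mag, age)[0, 1]
print(f"correlation(error magnitude, age) = {r:.3f}")
```

With error simulated independently of age, the correlation hovers near zero; if younger specimens were harder to measure (as Philipp notes can happen), it would not.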

Philipp said "If measurement error is indeed approximately isotropic, it has a similar magnitude for all shape features (all directions of shape space)". But can ME be isotropic after the superimposition? This, and the comment on ME biasing results, made me think about the spurious results of analyses of modularity/integration using the 'within a configuration' approach (my 2019 Evol. Biol. paper). That might be interpreted as an example where random noise (I could have simulated random digitization error) gets structured by the superimposition and sliding, and this drives the spurious outcome of the test. It would be nice to demonstrate, as Philipp suggested for ME in general, that this structured error is negligible relative to the 'true' covariance of the data. That was the argument of those dismissing the problem, but I've not yet seen a demonstration in the literature except, maybe, in ad hoc examples with no external validity. I might have missed it, however.
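A minimal numpy sketch of that intuition, with simulated data and a plain 2D Procrustes alignment to a fixed reference (no GPA iterations, no sliding semilandmarks): landmark noise that is perfectly isotropic before superimposition ends up with a singular, structured covariance afterwards.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 6                                    # landmarks, 2D
ref = rng.normal(0, 1, (k, 2))
ref -= ref.mean(0)
ref /= np.linalg.norm(ref)               # unit-centroid-size reference

def align(x, ref):
    """Ordinary Procrustes: translate, scale, rotate x onto ref."""
    x = x - x.mean(0)
    x = x / np.linalg.norm(x)
    u, _, vt = np.linalg.svd(x.T @ ref)  # optimal rotation = U V^T
    return x @ (u @ vt)

# purely isotropic digitizing noise around a single mean shape
n = 2000
data = np.array([align(ref + rng.normal(0, 0.02, (k, 2)), ref).ravel()
                 for _ in range(n)])
eig = np.linalg.eigvalsh(np.cov(data.T))[::-1]   # descending eigenvalues
print("largest eigenvalues:", eig[:3])
print("smallest eigenvalues:", eig[-4:])
```

Superimposition removes four degrees of freedom (two translations, rotation, scale), so four eigenvalues collapse toward zero: the post-alignment covariance of iid noise is no longer proportional to the identity, i.e., the error has been structured.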

I am much less convinced that using only the first PCs of the shape coordinates justifies treating ME as negligible: "Hence, group means in a PC plot are averages over all cases AND all variables, so that random error can be expected to be small." Even assuming that I convincingly demonstrate that removing the last PCs does not discard important information, and even if I consistently use the same set of first PCs in all analyses, I've seen more than one case where ME was the likely driver of most variation on PC1: 1) Euryon is difficult to digitize with precision on human crania, and I've seen datasets where PC1 was mostly variation in its position, despite it being just one point in a large configuration. 2) A similar situation with a couple of landmarks on the ventral side of the ramus in marmot mandibles, which I tried and then excluded; in this case, I know they have the largest absolute digitization error. 3) A large anthropological dataset where data collection was done in two chunks with a few years' gap between them: PC1 clearly separated the two chunks, which I suspect was due to a small but consistent systematic error (either the instrument or the operator digitizing the landmarks in a slightly but consistently different position).


I am glad that ME is being discussed on Morphmet. I find it crucial, and yet most GMM papers I read do not even mention it.
Thanks Philipp and all other contributors to the discussion.

Cheers

Andrea







On 04/11/2022 15:38, [email protected] wrote:


Dear all,

I'd like to challenge this view on measurement error, as summarized by
Andrea, a bit more generally.

Clearly, measurement error should be "small," but I disagree that "the idea
is that differences among individuals (averaged replicates) in a
representative sample should be larger than differences between replicates
of the same individual". First, the between-individual variance (or mean
sum of squares, MSS) depends on the choice of individuals. For instance, if
the sample comprises different species, the MSS between individuals is much
larger than for a sample of a single species, and the error MSS in relation
to the individual MSS is much smaller in the multi-species sample. Hence,
whether or not the error MSS is larger than the between-individual MSS is
somewhat arbitrary and of secondary importance anyway. "Controlling for
main effects," as suggested by Andrea, is possible but it removes the
actual signal against which I may want to compare the error. In either case,
the *p*-value of the MANOVA is uninformative because the underlying H0 is
irrelevant.
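A small simulation (hypothetical numbers, Python) of this point: with the same replicate error, the individuals-vs-error F ratio is far larger when species differences are folded into the between-individual variance.

```python
import numpy as np

rng = np.random.default_rng(2)

def f_ratio(ind_means, reps=2, err_sd=0.2):
    """One-way ANOVA F: individuals vs replicate (error) mean squares."""
    n = len(ind_means)
    y = ind_means[:, None] + rng.normal(0, err_sd, (n, reps))
    ss_ind = reps * ((y.mean(1) - y.mean()) ** 2).sum()
    ss_err = ((y - y.mean(1, keepdims=True)) ** 2).sum()
    return (ss_ind / (n - 1)) / (ss_err / (n * (reps - 1)))

n = 60
single = rng.normal(0, 0.5, n)                        # one species
multi = single + np.repeat([-3.0, 0.0, 3.0], n // 3)  # add species offsets
f_single, f_multi = f_ratio(single), f_ratio(multi)
print(f"F single-species: {f_single:.1f}, multi-species: {f_multi:.1f}")
```

The error is identical in both designs; only the composition of the sample changes, and with it the "significance" of the individual effect.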

In my opinion, it is more important that the error is unrelated to the
signal of interest ("random"), rather than that it is small in terms of
some summary statistic. For instance, if in a growth study the measurement
error is uncorrelated with the age effects, the error "averages out" (if
sample size is large enough) and does not bias the average growth
trajectory, even if the error is large. The same applies to group
differences. MANOVA does not inform about this independence. Moreover, it
pools over all shape coordinates. For instance, it does not inform us if
the error is large for shape features of interest (those that differ
between groups or correlate with age, etc.) or for shape features of less
interest.
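A quick numerical illustration (simulated, one variable): even when the error is five times larger than the group difference, the difference in group means is estimated essentially without bias if the error is independent of group.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
true_diff = 1.0
# measurement error (sd = 5) dwarfs the group difference,
# but it is independent of group membership
g1 = 0.0 + rng.normal(0, 5, n)
g2 = true_diff + rng.normal(0, 5, n)
est = g2.mean() - g1.mean()
print(f"estimated group difference: {est:.2f} (true: {true_diff})")
```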

Note also that most morphometric analyses are based on a few principal
components (or similar statistics) of the shape coordinates. PCs are linear
combinations, i.e., weighted averages, of the shape coordinates. Hence,
group means in a PC plot are averages over all cases AND all variables, so
that random error can be expected to be small. Another issue to consider: If
measurement error is indeed approximately isotropic, it has a similar
magnitude for all shape features (all directions of shape space). The
individual variance, however, typically is much greater for large-scale
shape features than for small-scale features, and the relative magnitude of
measurement error decreases with increasing spatial scale. PCs typically
capture large-scale shape variation, where the relative error is expected
to be smaller. The same applies to the symmetric vs. asymmetric components,
the latter of which has much smaller individual variance and hence greater
relative measurement error.
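The two points here, that PCs average over variables and that relative error shrinks for large-scale features, can be sketched with simulated data in which the signal lies along a single "large-scale" direction while the noise is isotropic:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 300, 20
direction = np.ones(p) / np.sqrt(p)      # a "large-scale" shape feature
signal = rng.normal(0, 2, n)             # individual variation along it
noise_sd = 0.5
x = np.outer(signal, direction) + rng.normal(0, noise_sd, (n, p))
x = x - x.mean(0)

_, s, vt = np.linalg.svd(x, full_matrices=False)
pc1 = vt[0]
alignment = abs(pc1 @ direction)         # does PC1 recover the signal axis?
rel_err_pc1 = noise_sd**2 / (s[0]**2 / (n - 1))  # error share of PC1 variance
rel_err_var = noise_sd**2 / x.var(0).mean()      # error share per variable
print(f"|PC1 . signal direction| = {alignment:.3f}")
print(f"error fraction: PC1 = {rel_err_pc1:.3f}, "
      f"average single variable = {rel_err_var:.3f}")
```

PC1 aligns with the large-scale signal, and the isotropic noise contributes a much smaller fraction of PC1's variance than of any single variable's variance.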

The situation is slightly different in studies that compare shape
variances, not means, between groups, between symmetric and asymmetric
components, or among spatial scales. In contrast to mean estimates,
measurement error does not average out for these variance estimates. It is
thus important that magnitude and pattern of measurement error are constant
(not necessarily small) across groups or components so that observed
differences in variance are attributable to biological factors rather than
systematic differences in measurement error. Measurement error is most
challenging when comparing entire variance-covariance matrices. But again,
MANOVA is not the way to assess homogeneity of measurement error across
groups.
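A minimal simulated example of why error does not average out of variances: two groups with identical true variance but different error magnitudes show clearly different observed variances (var_observed is roughly var_true + var_error).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20000
true_var = 1.0
# same biological variance, different measurement-error sd per group
g1 = rng.normal(0, np.sqrt(true_var), n) + rng.normal(0, 0.2, n)  # small ME
g2 = rng.normal(0, np.sqrt(true_var), n) + rng.normal(0, 1.0, n)  # large ME
print(f"observed variances: {g1.var():.2f} vs {g2.var():.2f} "
      f"(true variance is {true_var} in both)")
```

A naive comparison would conclude the second group is far more variable, although the extra variance is entirely measurement error.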

If the sample is properly randomized before measurement, it is reasonable
to assume that measurement error is approximately uncorrelated with the
signal of interest. But there can be exceptions. For instance, younger and
smaller individuals can be harder to measure than older and larger
individuals. Measurement error can thus correlate with age. I discussed
this in Mitteroecker P, Stansfield E (2021) A model of developmental
canalization, applied to human cranial form. PLOS Computational Biology 17
(2): e1008381

Clearly, one can argue that if measurement error is very small, then
randomness and homogeneity across groups are less of an issue. But in this
case the error really needs to be negligibly small, not just smaller than
the individual variation.

Instead of somewhat meaningless scalar summary statistics (like the *F*-ratio
or some multivariate version of it), I thus prefer an exploratory approach.
In the simplest case, a PCA of the data, including the replicated
specimens, can show the magnitude and directionality of measurement error
in relation to the signal of interest (e.g., group differences, growth
trajectories). Measurement error can also be correlated with external
variables (e.g., age) or compared among groups, but to my knowledge little
work has been done in this direction in geometric morphometrics.
Alternatives are errors-in-variables models and structural equation models
that implement estimates of measurement error in the first place.
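In the spirit of that exploratory approach, here is a simulated sketch: a PCA of the data with both replicates included, comparing the replicate-to-replicate scatter with the group separation along PC1.

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 40, 12
group = np.repeat([0, 1], n // 2)
true_x = rng.normal(0, 0.5, (n, p)) + np.outer(group, np.ones(p))  # group shift
reps = np.stack([true_x + rng.normal(0, 0.2, (n, p))               # 2 replicates
                 for _ in range(2)])

flat = reps.reshape(-1, p)
flat = flat - flat.mean(0)
_, _, vt = np.linalg.svd(flat, full_matrices=False)
scores = (flat @ vt[:2].T).reshape(2, n, 2)  # PC1-PC2, replicates included
# replicate scatter (ME) vs group separation along PC1
me_spread = np.abs(scores[0, :, 0] - scores[1, :, 0]).mean()
group_gap = np.abs(scores[:, group == 1, 0].mean()
                   - scores[:, group == 0, 0].mean())
print(f"mean replicate distance on PC1: {me_spread:.2f}, "
      f"group gap on PC1: {group_gap:.2f}")
```

In a plot of these scores, replicate pairs would sit close together relative to the group separation, which is exactly the kind of visual check of ME against the signal of interest described above.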

Best,

Philipp M.




[email protected] schrieb am Donnerstag, 3. November 2022 um 16:36:21 UTC+1:

Dear All,
beside the excellent review by Carmelo, I suggest a few other papers
on ME in geometric morphometrics:
Arnqvist, G., Martensson, T. Measurement error in geometric
morphometrics: empirical strategies to assess and reduce its impact on
measures of shape. Acta Zoologica Academiae Scientiarum Hungaricae,
1998, 44: 73–96. (A bit outdated but still wonderfully accurate in how
they explain different sources of ME).
Klingenberg, C.P., Barluenga, M., Meyer, A. Shape Analysis of
Symmetric Structures: Quantifying Variation Among Individuals and
Asymmetry. Evolution, 2002, 56: 1909–1920. (From where most of us have
borrowed the protocol for assessing ME).
Viscosi, V., Cardini, A. Leaf Morphology, Taxonomy and Geometric
Morphometrics: A Simplified Protocol for Beginners. PLoS ONE, 2011, 6:
e25630.
Galimberti, F., Sanvito, S., Vinesi, M.C., Cardini, A. “Nose-metrics”
of wild southern elephant seal (Mirounga leonina) males using image
analysis and geometric morphometrics. Journal of Zoological
Systematics and Evolutionary Research, 2019, 57: 710–720.

There's also another one I like, by the Viennese morphometricians (in
a paper on human mandibles, or teeth, symmetric and asymmetric
variation, if I remember well), but I can't find it now.


In general, the idea is that differences among individuals (averaged
replicates) in a representative sample should be larger than
differences between replicates of the same individual (the estimate of
ME). This is what is tested by 'individual' in the Procrustes ANOVA in
MorphoJ. It might be important to control for main effects in the
analysis. For instance, by including species and sex before individual
in the hierarchical analysis, I 'statistically remove' (with some
assumptions) the average effect of these factors before comparing
individual variation to ME, which makes the test more conservative (NB
whether this is OK or not it depends on the question one is asking in
her/his study).
For shape data, even if the P value of individual vs residual is
significant, I would not conclude that ME is negligible for sure. I'd
check that the individual Rsq is much larger than the ME (residual)
Rsq and also that shape distances between replicates of the same
individual are smaller than distances among different individuals (if
this is true, replicates should cluster 'within individual' in a UPGMA
phenogram). Then, I feel a bit more confident that ME might be
negligible.
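A sketch of this kind of decomposition (simulated data; sums of squares pooled over variables under an isotropic model, in the style of a Procrustes ANOVA, not MorphoJ's actual code):

```python
import numpy as np

rng = np.random.default_rng(6)
m, r, p = 30, 2, 10                      # individuals, replicates, variables
ind = rng.normal(0, 1, (m, p))           # true individual "shapes"
y = ind[:, None, :] + rng.normal(0, 0.3, (m, r, p))  # add digitizing error

grand = y.mean((0, 1))
ind_mean = y.mean(1)
ss_ind = r * ((ind_mean - grand) ** 2).sum()         # pooled over variables
ss_err = ((y - ind_mean[:, None, :]) ** 2).sum()
r2_ind = ss_ind / (ss_ind + ss_err)
f = (ss_ind / (m - 1)) / (ss_err / (m * (r - 1)))    # isotropic-model F
print(f"individual R^2 = {r2_ind:.2f}, error R^2 = {1 - r2_ind:.2f}, "
      f"F = {f:.1f}")
```

Here individual variation clearly dominates the replicate error (individual Rsq much larger than ME Rsq), which is the pattern described above as reassuring.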

If ME is large, it may happen that its Rsq is larger than the
individual Rsq (or, which is the same, ME SSQ > individual SSQ). For
the F ratio, however, one should look at the mean SSQ, which takes df
into account. From the MSSQ, one computes F.
The F ratio in MorphoJ employs an isotropic model but, with large
samples (relative to the number of variables), the software also
provides P values using Pillai's trace, which (if I recall well!)
does not depend on an isotropic model. A large N and a representative
sample are crucial if one is using a subsample in the
assessment of ME to avoid replicate measurements of all individuals,
which would be better but might take too long if one has hundreds or
thousands of individuals.
In R, I generally use adonis, which employs an F test (same as in
MorphoJ, for a simple design) but uses permutations instead of
parametric tests. The use of permutations was also suggested as
desirable in Klingenberg et al., 2002. Other packages, I suspect, might
do something similar, although maybe using different permutational
approaches. I am sure it is explained in their help files.
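A minimal permutation version of such an F test (simulated data; observations shuffled across individuals, which is only one of several possible permutation schemes):

```python
import numpy as np

rng = np.random.default_rng(7)
m, r = 25, 2
y = rng.normal(0, 1, m)[:, None] + rng.normal(0, 0.3, (m, r))  # ind + ME

def f_stat(y):
    """One-way ANOVA F for the individual effect (replicates as error)."""
    ind_mean = y.mean(1, keepdims=True)
    ss_ind = y.shape[1] * ((ind_mean - y.mean()) ** 2).sum()
    ss_err = ((y - ind_mean) ** 2).sum()
    return (ss_ind / (len(y) - 1)) / (ss_err / (y.size - len(y)))

f_obs = f_stat(y)
flat = y.ravel()
perm_f = np.array([f_stat(rng.permutation(flat).reshape(m, r))
                   for _ in range(999)])
p = (1 + (perm_f >= f_obs).sum()) / 1000   # permutation p-value
print(f"observed F = {f_obs:.1f}, permutation p = {p:.3f}")
```

The null distribution is built by recomputing F on data with individual labels destroyed, so no parametric (or isotropic) assumption is needed for the P value.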

Cheers

Andrea

On 03/11/2022, ying yi <[email protected]> wrote:
Dear all,
I used the “procD.lm” function in the geomorph package to test the
measurement error. I was surprised to find that the within-groups ANOVA sum
of squares I got was greater than the among-groups ANOVA sum of squares. I
wonder if something went wrong. What does it mean for the “procD.lm” function
to get an F value <1?
I would be very happy if someone could help me.
Yours,
Sam

References are as follows:

--
You received this message because you are subscribed to the Google Groups
"Morphmet" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To view this discussion on the web visit

https://groups.google.com/d/msgid/morphmet2/06065841-c42e-4a58-a5d3-a96eb3c5787dn%40googlegroups.com
.



--
E-mail address: [email protected], [email protected]
WEBPAGE: https://sites.google.com/view/alcardini2/
or https://tinyurl.com/andreacardini



--
Dr. Andrea Cardini
Researcher, Dipartimento di Scienze Chimiche e Geologiche, Università di Modena e Reggio Emilia, Via Campi, 103 - 41125 Modena - Italy
tel. 0039 059 4223140

Adjunct Associate Professor, Centre for Forensic Anthropology, The University of Western Australia, 35 Stirling Highway, Crawley WA 6009, Australia

E-mail address: [email protected], [email protected]
WEBPAGE: https://sites.google.com/view/alcardini2/
or https://tinyurl.com/andreacardini
