----- Forwarded message from Douglas Theobald <dtheob...@brandeis.edu> -----

Date: Sun, 1 Dec 2013 10:06:20 -0500
From: Douglas Theobald <dtheob...@brandeis.edu>
Reply-To: Douglas Theobald <dtheob...@brandeis.edu>
Subject: Re: missing structures
To: morphmet@morphometrics.org

The "missing data" problem is much misunderstood, and the technical
senses of "missing data" and "missing at random" do not correspond to
everyday, intuitive usage.

In fact, Patrick's problem can be validly discussed as a "missing
data" problem, in the statistical sense of the term. For example, the
Expectation-Maximization (EM) algorithm is often characterized as
dealing with "missing data", and it can deal elegantly with data that
exist but were not measured, which is the more intuitive,
non-technical sense of "missing data". However, the EM algorithm can
also deal with data that are unobserved because they do not exist
(this is the "qualitative" difference Philipp mentions). In this case
you can view the EM algorithm as a mathematical trick that gets the
right answer by pretending that the data are missing in the
common-sense way.
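
To make the EM idea concrete, here is a minimal sketch, assuming the data are rows from a multivariate normal distribution and that unobserved coordinates are coded as NaN. The function name `em_mvn_missing`, the NaN coding, and the fixed iteration count are illustrative assumptions, not anything taken from this thread or from a particular morphometrics package.

```python
import numpy as np

def em_mvn_missing(X, n_iter=50):
    """EM estimates of the mean and covariance of a multivariate normal
    when some entries of X are unobserved (coded as NaN).

    Illustrative sketch only: assumes normality and ignorable missingness.
    Returns (mu, sigma, filled), where `filled` holds the conditional
    expectations of the missing entries from the final E-step.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    missing = np.isnan(X)

    # Start from the per-column statistics of the observed values.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))

    for _ in range(n_iter):
        filled = X.copy()
        cond_cov_sum = np.zeros((p, p))

        # E-step: expected missing values given the observed ones,
        # plus the conditional covariance of the missing block.
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed in this row: fall back to the marginal.
                filled[i] = mu
                cond_cov_sum += sigma
                continue
            s_oo = sigma[np.ix_(o, o)]
            s_mo = sigma[np.ix_(m, o)]
            # coef = s_mo @ inv(s_oo), computed via a linear solve.
            coef = np.linalg.solve(s_oo, s_mo.T).T
            filled[i, m] = mu[m] + coef @ (X[i, o] - mu[o])
            cond_cov_sum[np.ix_(m, m)] += sigma[np.ix_(m, m)] - coef @ s_mo.T

        # M-step: maximum-likelihood updates from the expected
        # sufficient statistics.
        mu = filled.mean(axis=0)
        centered = filled - mu
        sigma = (centered.T @ centered + cond_cov_sum) / n

    return mu, sigma, filled
```

Note that the algorithm never asks why an entry is NaN. A coordinate that was not measured and a landmark that never developed are handled identically, which is the sense in which EM "pretends" the data are missing in the everyday way.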

For the EM algorithm (and many other statistical data imputation
methods) to be valid, the data must be "missing at random" (MAR), as
Philipp says. But the technical definition of MAR does not correspond
to the intuitive sense of "random". Missing morphological data are
often MAR -- for instance, Patrick's "missing data", in which certain
landmarks are unobserved because they never develop in the first
place, are MAR and hence can be validly treated by the EM algorithm.
MAR requires only that the probability of a data point being absent
depend only on observed data. Clearly, in Patrick's case we can
determine whether a developmental landmark will be absent based only
on observed data (e.g., if we know the organism the data come from).

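
In symbols, the MAR condition described above is usually written as follows. The notation is an assumption introduced here for illustration and does not appear in the thread: X_obs and X_mis are the observed and unobserved parts of the data, R is the indicator of which entries are absent, and phi denotes the parameters of the missingness mechanism.

```latex
% Missing at random (MAR): given the observed data, missingness
% does not depend on the unobserved values.
\[
  P(R \mid X_{\mathrm{obs}}, X_{\mathrm{mis}}, \phi)
    = P(R \mid X_{\mathrm{obs}}, \phi)
  \quad \text{for all } X_{\mathrm{mis}} .
\]
% For comparison, missing completely at random (MCAR) is the stronger
% condition that matches the everyday sense of "random":
\[
  P(R \mid X_{\mathrm{obs}}, X_{\mathrm{mis}}, \phi) = P(R \mid \phi) .
\]
```

Under this reading, if the taxon of each specimen is recorded, it is part of X_obs, and a landmark whose absence is fully determined by taxon satisfies the first condition even though its absence is anything but random in the everyday sense.
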
> On Sun, Dec 1, 2013 at 3:33 AM, <morphmet_modera...@morphometrics.org> wrote:
>
> ----- Forwarded message from Philipp Mitteröcker <mitte...@univie.ac.at> -----
>
> Date: Thu, 28 Nov 2013 11:26:34 -0500
> From: Philipp Mitteröcker <mitte...@univie.ac.at>
> Reply-To: Philipp Mitteröcker <mitte...@univie.ac.at>
> Subject: Re: missing structures
>
> The problem raised by Patrick is not really a missing data problem.
> Missing data, in the technical sense, are structures or properties
> that do exist in the specimens but could not have been measured.
> Hence it can make some sense to estimate them. But when structures
> simply do not exist in some specimens, what does it mean to estimate
> them?
>
> In other words, if a structure is present in one group and absent in
> another group, these groups differ not only quantitatively but also
> qualitatively. Estimating the values, or letting landmarks overlap,
> means that a qualitative difference is -- arbitrarily -- substituted
> by a quantitative one. Many statistical results will be affected by
> this arbitrariness.
>
> Note also that missing data approaches usually require the data to
> be missing at random, which is presumably not the case in the
> problem at hand.
>
> Best,
>
> Philipp
>
> On 28.11.2013 at 10:55, morphmet_modera...@morphometrics.org wrote:
>
> >
> > ----- Forwarded message from sebastien couette <sebastien.coue...@u-bourgogne.fr> -----
> >
> > Date: Mon, 25 Nov 2013 05:06:20 -0500
> > From: sebastien couette <sebastien.coue...@u-bourgogne.fr>
> > Reply-To: sebastien couette <sebastien.coue...@u-bourgogne.fr>
> > Subject: Re: missing structures
> > To: morphmet@morphometrics.org
> >
> > Dear Patrick,
> >
> > I published a paper on missing data in 2010:
> >
> > Sébastien Couette, Jess White (2010) 3D geometric morphometrics and
> > missing-data. Can extant taxa give clues for the analysis of fossil
> > primates? Comptes Rendus Palevol 9(6):423-433.
> > DOI:10.1016/j.crpv.2010.07.002
> >
> > I can send you a copy.
> >
> > There is also a good paper on this topic in Systematic Biology:
> >
> > Brown, C.M., Arbour, J.H., Jackson, D.A. (2012). Testing the effect
> > of missing data estimation and distribution in morphometric
> > multivariate data analyses. Systematic Biology, 61(6), 941-954.
> >
> > Feel free to contact me if you have any questions.
> >
> > Sébastien
> >
> > --
> > -----------------------------------------
> > Dr. Sébastien Couette
> >
> > EPHE&UMR CNRS 6282 Biogéosciences
> > Université de Bourgogne
> > 6 Bld Gabriel
> > 21000 Dijon
> >
> > Tel.: 33. (0)3.80.39.64.48
> > Fax : 33. (0)3.80.39.63.87
> >
> > Head of the "Biodiversité et Gestion de l'Environnement" specialization of the "Biologie Santé Ecologie" Master's program, EPHE
> >
> > EPHE Master's program, "Biodiversité et Gestion de l'Environnement" specialization
> >
> > ----- End forwarded message -----
> >
> >
>
> ___________________________________
>
> Dr. Philipp Mitteroecker
>
> Department of Theoretical Biology
> University of Vienna
> Althanstrasse 14
> A-1090 Vienna, Austria
>
> Tel: +43 1 4277 56705
> Fax: +43 1 4277 9544
> email: philipp.mitteroec...@univie.ac.at
>
> ----- End forwarded message -----
>
>
----- End forwarded message -----