----- Forwarded message from Douglas Theobald <dtheob...@brandeis.edu> -----

Date: Sun, 1 Dec 2013 10:06:20 -0500
From: Douglas Theobald <dtheob...@brandeis.edu>
Reply-To: Douglas Theobald <dtheob...@brandeis.edu>
Subject: Re: missing structures
To: morphmet@morphometrics.org

The "missing data" problem is much misunderstood, and the technical
senses of "missing data" and "missing at random" do not correspond to
everyday, intuitive usage.

In fact, Patrick's problem can be validly discussed as a "missing
data" problem, in the statistical sense of the term.  For example, the
Expectation-Maximization algorithm is often characterized as dealing
with "missing data", and the EM algorithm can deal elegantly with data
that exist but were not measured, which is the more intuitive,
non-technical sense of "missing data".  However, the EM algorithm can
also deal with data that are unobserved because they do not exist
(this is the "qualitative" difference Philipp mentions).  In this case
you can view the EM algorithm as a mathematical trick that gets the
right answer by pretending that the data are missing in the
common-sense way.
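
To make the first, intuitive case concrete, here is a minimal sketch
of EM for a multivariate normal sample in which some values were
simply not recorded.  Nothing in it comes from this thread: the
function name em_mvn, the NumPy implementation, and the simulated data
are assumptions chosen for illustration, and the fixed iteration count
stands in for a proper convergence check.

```python
import numpy as np


def em_mvn(X, n_iter=50):
    """EM estimates of the mean and covariance of a multivariate normal.

    X is an (n, p) array in which unobserved entries are NaN.
    Hypothetical helper written for illustration only.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    miss = np.isnan(X)
    # Starting values: per-variable means and variances of observed entries.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        filled = np.where(miss, 0.0, X)
        extra = np.zeros((p, p))  # conditional covariance of imputed entries
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed in this row: fall back on current estimates.
                filled[i] = mu
                extra += sigma
                continue
            s_oo = sigma[np.ix_(o, o)]
            s_om = sigma[np.ix_(o, m)]
            # E-step: regress the missing block on the observed block.
            B = np.linalg.solve(s_oo, s_om).T
            filled[i, m] = mu[m] + B @ (X[i, o] - mu[o])
            extra[np.ix_(m, m)] += sigma[np.ix_(m, m)] - B @ s_om
        # M-step: update the mean and covariance from the completed data.
        mu = filled.mean(axis=0)
        d = filled - mu
        sigma = (d.T @ d + extra) / n
    return mu, sigma


# Simulated example (hypothetical data): three correlated variables,
# with about 30% of the third variable deleted at random.
rng = np.random.default_rng(0)
true_mu = np.array([0.0, 1.0, 2.0])
true_sigma = np.array([[1.0, 0.6, 0.5],
                       [0.6, 1.0, 0.4],
                       [0.5, 0.4, 1.0]])
full = rng.multivariate_normal(true_mu, true_sigma, size=500)
X = full.copy()
X[rng.random(500) < 0.3, 2] = np.nan

mu_hat, sigma_hat = em_mvn(X)
print(mu_hat)
print(sigma_hat)
```

With no missing entries the loop reduces to the ordinary sample mean
and the maximum-likelihood (divide-by-n) covariance, which is a handy
sanity check on the sketch.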

For the EM algorithm (and many other statistical data imputation
methods) to be valid, the data must be "missing at random" (MAR), as
Philipp says.  But the technical definition of MAR does not correspond
to the intuitive sense of "random".  Missing morphological data are
often MAR -- for instance, Patrick's "missing data", in which certain
landmarks are unobserved because they never develop in the first
place, are MAR and hence can be validly treated by the EM algorithm.
MAR requires only that the probability of a data point being absent
depends on observed data alone.  Clearly, in Patrick's case we can
determine whether a developmental landmark will be absent based only
on observed data (e.g., if we know the organism the data are from).
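
For reference, the usual formal statement of MAR can be written as
follows.  The notation is an assumption introduced here, not taken
from the thread: R is the array of missingness indicators, Y_obs and
Y_mis are the observed and unobserved parts of the data, and phi
collects any parameters of the missingness mechanism.

```latex
\[
  \Pr(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi)
  = \Pr(R \mid Y_{\mathrm{obs}}, \phi)
  \quad \text{for all } Y_{\mathrm{mis}}.
\]
```

Read this way, if the absence of a landmark is fully determined by the
taxon, and the taxon is recorded, then R is a function of Y_obs alone
and the condition is satisfied.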




> On Sun, Dec 1, 2013 at 3:33 AM, <morphmet_modera...@morphometrics.org> wrote:
> ----- Forwarded message from Philipp Mitteröcker <mitte...@univie.ac.at> -----
>
> Date: Thu, 28 Nov 2013 11:26:34 -0500
> From: Philipp Mitteröcker <mitte...@univie.ac.at>
> Reply-To: Philipp Mitteröcker <mitte...@univie.ac.at>
> Subject: Re: missing structures
>
> The problem raised by Patrick is not really a missing data problem.
> Missing data, in the technical sense, are structures or properties
> that do exist in the specimens but could not be measured. Hence it
> can make some sense to estimate them. But when structures simply do
> not exist in some specimens, what does it mean to estimate them?
>
> In other words, if a structure is present in one group and absent in
> another group, these groups differ not only quantitatively but also
> qualitatively. Estimating the values, or letting landmarks overlap,
> means that a qualitative difference is -- arbitrarily -- replaced
> by a quantitative one. Many statistical results will be affected by
> this arbitrariness.
>
> Note also that missing data approaches usually require the data to
> be missing at random, which is presumably not the case in the
> problem at hand.
>
> Best,
> Philipp
>
> On 28.11.2013 at 10:55, morphmet_modera...@morphometrics.org wrote:
> >
> > ----- Forwarded message from sebastien couette <sebastien.coue...@u-bourgogne.fr> -----
> >
> >     Date: Mon, 25 Nov 2013 05:06:20 -0500
> >      From: sebastien couette <sebastien.coue...@u-bourgogne.fr>
> >      Reply-To: sebastien couette <sebastien.coue...@u-bourgogne.fr>
> >      Subject: Re: missing structures
> >
> > Dear Patrick,
> >
> > I published a paper on missing data in 2010:
> >
> > Sébastien Couette, Jess White (2010). 3D geometric morphometrics and
> > missing-data. Can extant taxa give clues for the analysis of fossil
> > primates? Comptes Rendus Palevol 9(6):423-433.
> > DOI:10.1016/j.crpv.2010.07.002
> >
> > I can send you a copy.
> >
> > There is also a good paper on this topic in Systematic Biology:
> >
> > Brown, C.M., Arbour, J.H., Jackson, D.A. (2012). Testing the effect
> > of missing data estimation and distribution in morphometric
> > multivariate data analyses. Systematic Biology 61(6):941-954.
> >
> > Feel free to contact me if you have any questions.
> >
> > Sébastien
> >
> > --
> > -----------------------------------------
> > Dr. Sébastien Couette
> >
> > EPHE&UMR CNRS 6282 Biogéosciences
> > Université de Bourgogne
> > 6 Bld Gabriel
> > 21000 Dijon
> >
> > Tel.: 33. (0)3.80.39.64.48
> > Fax : 33. (0)3.80.39.63.87
> >
> >
> > Head of the "Biodiversité et Gestion de l'Environnement" specialization of the EPHE Master's program "Biologie Santé Ecologie"
> >
> > EPHE Master's program, "Biodiversité et Gestion de l'Environnement" specialization
> >
> > ----- End forwarded message -----
> >
> >
> ___________________________________
> Dr. Philipp Mitteroecker
> Department of Theoretical Biology
> University of Vienna
> Althanstrasse 14
> A-1090 Vienna, Austria
> Tel: +43 1 4277 56705
> Fax: +43 1 4277 9544
> ----- End forwarded message -----



----- End forwarded message -----


