Guillaume Louvel asked:

>
> So my first question is: can we directly apply the Mahalanobis distance
> to measure a kind of "phylogeny-corrected" distance between 2 vectors of
> trait values for a list of species? Since we assume a brownian motion,
> we know these vectors should be drawn from a multivariate normal
> distribution with known covariance matrix. Therefore the Mahalanobis
> distance seems perfectly appropriate to me, is it the case?

It is appropriate.  In fact this is in effect what
regressing contrasts in trait Y on contrasts in
trait X is doing.  One can alternatively use a
multivariate regression approach, which is what
Alan Grafen (1989) did, and the results are
the same either way (in the simplest cases).

Note that although the contrasts can be
treated as independent observations, that is
not true for the tip species values  -- the
Grafen "Phylogenetic Regression" does not
treat the tip values as independent, and for
the same reason pairwise distances between
tips are not independent.
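
As a minimal sketch of the equivalence mentioned above, assuming a
made-up 3-species tree ((A:1,B:1):1,C:2), a Brownian motion rate of 1,
and invented trait values, the squared Mahalanobis distance of a trait
vector from its GLS-estimated root value can be compared with the sum
of squared standardized contrasts. The tree, the numbers, and the
helper name `mahalanobis_sq` are illustrative assumptions, not anything
taken from the papers cited.

```python
import numpy as np

# Hypothetical tree ((A:1,B:1):1,C:2) with unit Brownian-motion rate.
# C[i, j] is the branch length shared by species i and j since the root.
C = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 2.0]])

def mahalanobis_sq(d, cov):
    """Squared Mahalanobis length of the vector d under covariance cov."""
    return float(d @ np.linalg.solve(cov, d))

x = np.array([1.0, 3.0, 0.0])  # invented trait values for A, B, C

# Generalized least squares estimate of the root value.
ones = np.ones(3)
w = np.linalg.solve(C, ones)
root = (w @ x) / (w @ ones)
q = mahalanobis_sq(x - root * ones, C)

# Standardized contrasts, written out by hand for this small tree.
c1 = (x[0] - x[1]) / np.sqrt(1.0 + 1.0)      # A versus B
x_ab = (x[0] + x[1]) / 2.0                   # node value (equal branch lengths)
v_ab = 1.0 + (1.0 * 1.0) / (1.0 + 1.0)       # stem branch, lengthened
c2 = (x_ab - x[2]) / np.sqrt(v_ab + 2.0)     # node AB versus C
ssc = c1**2 + c2**2

print(q, ssc)  # the two values agree (22/7 for these numbers)
```

The same helper applied to the difference of two trait vectors,
`mahalanobis_sq(x - y, C)`, gives a squared distance of the kind asked
about, here under the assumption of unit rate and ignoring any
difference in root values between the two traits.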


>
> I don't want to do a statistical test per se, I am rather interested in
> ranking many traits according to their distance to a pattern of reference.

I am unclear about what that means.

>
> My second subsidiary question is: can I apply this Mahalanobis distance
> if my traits are binary (e.g. presence-absence of some sequence in the
> genomes). In that case I know that my trait is not multivariate normal,
> but considering that I have millions of traits, shouldn't I expect the
> whole set to have some normal characteristics?

Basically no.  Although people have approximated
binary traits by Gaussian variables (I think Paul
Harvey and Mark Pagel did in their 1991 book),
it is much more appropriate to use a threshold
model.  See my 2012 paper in American Naturalist
or the earlier 2005 sketch of the method in
Proc. Royal Society of London series B.
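
For illustration only, the idea behind the threshold model is that the
observed 0/1 state records whether an unobserved continuous "liability",
itself evolving by Brownian motion, exceeds a threshold. A minimal
forward simulation, assuming the same made-up 3-species covariance
matrix as a stand-in for a real tree, a threshold of 0, and a root
liability of 0 (all assumptions of this sketch), might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical Brownian-motion covariance for tree ((A:1,B:1):1,C:2).
C = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 2.0]])

n_traits = 5       # number of independent binary traits to simulate
threshold = 0.0    # assumed threshold on the liability scale

# Each row is one trait's liabilities at the tips A, B, C.
liability = rng.multivariate_normal(mean=np.zeros(3), cov=C, size=n_traits)

# The observed state is 1 where the liability exceeds the threshold.
binary = (liability > threshold).astype(int)

print(binary)
```

This only simulates data under the model. Inferring correlations
between liabilities from observed binary data is the harder direction.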

A good paper agonizing about all this is:

Maddison WP & FitzJohn RG. 2015. The unsolved challenge to
phylogenetic correlation tests for categorical characters. Systematic
Biology 64: 127–136.

though I'd say that the problem is not as "unsolved" as they think.



> Finally, if none of the approach above is justified, is there a
> multivariate phylogenetic method for discrete/binary traits? Some kind
> of adapted phylogenetic PCA ?

See above.  It does require MCMC, and cannot
simply be done with distances.

J.F.
----
Joe Felsenstein         j...@gs.washington.edu
 Department of Genome Sciences and Department of Biology,
 University of Washington, Box 355065, Seattle, WA 98195-5065 USA

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
