Guillaume Louvel asked:

> So my first question is: can we directly apply the Mahalanobis distance
> to measure a kind of "phylogeny-corrected" distance between 2 vectors of
> trait values for a list of species? Since we assume a brownian motion,
> we know these vectors should be drawn from a multivariate normal
> distribution with known covariance matrix. Therefore the Mahalanobis
> distance seems perfectly appropriate to me, is it the case?

It is appropriate. In fact this is, in effect, what regressing contrasts
in trait Y on contrasts in trait X is doing. One can alternatively use a
multivariate regression approach, which is what Alan Grafen (1989) did,
and the results are the same either way (in the simplest cases).

Note that although the contrasts can be treated as independent
observations, that is not true for the tip species values -- the Grafen
"Phylogenetic Regression" does not treat the tip values as independent,
and for the same reason pairwise distances between tips are not
independent.

> I don't want to do a statistical test per se, I am rather interested in
> ranking many traits according to their distance to a pattern of reference.

I am unclear about what that means.

> My second subsidiary question is: can I apply this Mahalanobis distance
> if my traits are binary (e.g. presence-absence of some sequence in the
> genomes). In that case I know that my trait is not multivariate normal,
> but considering that I have millions of traits, shouldn't I expect the
> whole set to have some normal characteristics?

Basically no. Although people have approximated binary traits by
Gaussian variables (I think Paul Harvey and Mark Pagel did in their 1991
book), it is much more appropriate to use a threshold model. See my 2012
paper in American Naturalist or the earlier 2005 sketch of the method in
Proc. Royal Society of London series B.

A good paper agonizing about all this is:

Maddison WP & FitzJohn RG. 2015. The unsolved challenge to phylogenetic
correlation tests for categorical characters. Systematic Biology 64:
127–136

though I'd say that the problem is not as "unsolved" as they think.

> Finally, if none of the approach above is justified, is there a
> multivariate phylogenetic method for discrete/binary traits? Some kind
> of adapted phylogenetic PCA ?

See above. It does require MCMC, and cannot simply be done with
distances.

J.F.
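
For the continuous-trait case in the first question, here is a minimal Python sketch of a Mahalanobis distance between two trait vectors under Brownian motion. It is an illustration, not anyone's published method. The assumptions are mine: the covariance matrix C holds the shared root-to-ancestor path length for each pair of tips, the Brownian rate is 1, and the tree, the trait values and the function names (cholesky, forward_solve, phylo_mahalanobis) are all hypothetical. Dividing out the Cholesky factor of C turns the correlated tip differences into uncorrelated values, which is roughly the job that contrasts do.

```python
import math

def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive-definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def forward_solve(L, b):
    """Solve L z = b for lower-triangular L by forward substitution."""
    z = []
    for i, row in enumerate(L):
        z.append((b[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z

def phylo_mahalanobis(x, y, C):
    """sqrt((x - y)^T C^{-1} (x - y)), computed without forming C^{-1}."""
    d = [xi - yi for xi, yi in zip(x, y)]
    z = forward_solve(cholesky(C), d)  # whitened differences
    return math.sqrt(sum(v * v for v in z))

# Hypothetical tree ((A:1,B:1):1,C:2): A and B share one unit of history,
# and every tip is two units from the root.
C = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 2.0]]
x = [1.0, 1.2, 3.0]  # made-up values of trait X for tips A, B, C
y = [0.5, 0.9, 2.0]  # made-up values of trait Y for the same tips

print(phylo_mahalanobis(x, y, C))
```

With C set to the identity matrix this reduces to the ordinary Euclidean distance. None of this applies to the binary traits discussed above.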
----
Joe Felsenstein         j...@gs.washington.edu
Department of Genome Sciences and Department of Biology,
University of Washington, Box 355065, Seattle, WA 98195-5065 USA

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/