Dear all,

I read somewhere (but can't find the source again) that fitting a response variable Y to a linear combination of explanatory variables X by least squares, given a covariance matrix V (that is, doing a Phylogenetic Generalized Least Squares), is equivalent to minimizing the Mahalanobis distance between Y and the predicted values. This makes sense to me.
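To make that equivalence concrete, here is a minimal numerical sketch in Python with NumPy. It assumes the usual GLS estimator beta = (X' V^-1 X)^-1 X' V^-1 Y and checks that perturbing beta never lowers the squared Mahalanobis distance between Y and X beta. The matrix V here is a random positive-definite matrix standing in for a phylogenetic covariance, and all names (`sq_mahalanobis`, `beta_gls`, and so on) are my own, not from any package.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # number of species (toy value)

# Hypothetical covariance matrix V: random, symmetric, positive definite.
# In a real analysis it would be derived from the tree under Brownian motion.
A = rng.normal(size=(n, n))
V = A @ A.T + n * np.eye(n)
V_inv = np.linalg.inv(V)

# Toy design matrix (intercept + one predictor) and toy response.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.normal(size=n)

def sq_mahalanobis(u, v, V_inv):
    """Squared Mahalanobis distance (u - v)' V^-1 (u - v)."""
    d = u - v
    return float(d @ V_inv @ d)

# GLS estimate: solve (X' V^-1 X) beta = X' V^-1 y.
beta_gls = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
d_gls = sq_mahalanobis(y, X @ beta_gls, V_inv)

# Squared distances obtained after randomly perturbing the GLS coefficients.
d_perturbed = [
    sq_mahalanobis(y, X @ (beta_gls + rng.normal(scale=0.1, size=2)), V_inv)
    for _ in range(100)
]

# Gradient of the squared distance with respect to beta (up to a factor -2);
# it should vanish at the GLS estimate.
gradient = X.T @ V_inv @ (y - X @ beta_gls)
```

Since the squared distance is a convex quadratic in beta, a zero gradient at `beta_gls` is what identifies it as the minimizer; the perturbation check is only a sanity test.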
So my first question is: can we directly apply the Mahalanobis distance to measure a kind of "phylogeny-corrected" distance between two vectors of trait values for a list of species? Since we assume Brownian motion, these vectors should be drawn from a multivariate normal distribution with a known covariance matrix. The Mahalanobis distance therefore seems perfectly appropriate to me. Is that the case? I don't want to do a statistical test per se; I am rather interested in ranking many traits according to their distance to a reference pattern.

My second, subsidiary question is: can I apply this Mahalanobis distance if my traits are binary (e.g. presence/absence of some sequence in the genomes)? In that case I know that my trait is not multivariate normal, but considering that I have millions of traits, shouldn't I expect the whole set to have some normal characteristics? I know that there is Pagel's 1994 method for binary traits, but it seemed to me that a distance-based method would be faster and would allow me to rank my candidates.

Finally, if neither of the approaches above is justified, is there a multivariate phylogenetic method for discrete/binary traits? Some kind of adapted phylogenetic PCA?

Thanks a lot for your help,
Guillaume
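P.S. To make the first question concrete, this is roughly the computation I have in mind, as a Python/NumPy sketch. It assumes V is the species-by-species covariance matrix expected under Brownian motion and that each trait is a row of values in the same species order. The function name `rank_by_mahalanobis` and the toy values are mine, purely for illustration. The sketch only computes the distances; it says nothing about whether they are justified for binary traits, which is exactly my second question.

```python
import numpy as np

def rank_by_mahalanobis(traits, reference, V):
    """Rank traits by Mahalanobis distance to a reference pattern.

    traits:    array of shape (n_traits, n_species)
    reference: array of shape (n_species,)
    V:         positive-definite covariance matrix, shape (n_species, n_species)
    Returns (order, distances), where order lists trait indices, closest first.
    """
    L = np.linalg.cholesky(V)           # V = L L'
    diffs = (traits - reference).T      # shape (n_species, n_traits)
    z = np.linalg.solve(L, diffs)       # whitened differences, L^-1 (t - ref)
    distances = np.sqrt(np.sum(z ** 2, axis=0))
    order = np.argsort(distances)
    return order, distances

# Toy covariance for 4 species forming two pairs of close relatives.
V = np.array([
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
])

reference = np.array([1.0, 1.0, 0.0, 0.0])
traits = np.array([
    [1.0, 1.0, 0.0, 0.0],   # identical to the reference
    [1.0, 0.0, 0.0, 0.0],   # differs in one species
    [0.0, 0.0, 1.0, 1.0],   # opposite pattern
])

order, distances = rank_by_mahalanobis(traits, reference, V)
```

Factoring V once with a Cholesky decomposition and whitening all traits in a single solve avoids forming V^-1 explicitly. With millions of traits the matrix would presumably have to be processed in chunks.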
_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/