Dear all,

I read somewhere (but can't find the source again) that regressing a
response variable Y on a linear combination of explanatory variables X
by least squares, given a covariance matrix V (that is, doing a
Phylogenetic Generalized Least Squares), is equivalent to minimizing the
Mahalanobis distance between Y and the predicted values, which seems to
make sense to me.
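
To make that equivalence concrete, here is a minimal sketch in Python
(NumPy). The covariance matrix V and the data are made up for
illustration, and the function names are my own. It computes the usual
GLS estimate and the squared Mahalanobis distance between Y and the
fitted values; under these assumptions, any other coefficient vector
should give a distance at least as large.

```python
import numpy as np

def mahalanobis_sq(y, mu, V):
    """Squared Mahalanobis distance (y - mu)' V^{-1} (y - mu)."""
    r = y - mu
    return float(r @ np.linalg.solve(V, r))

def gls_fit(X, y, V):
    """GLS estimate: beta = (X' V^{-1} X)^{-1} X' V^{-1} y."""
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    return np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)

# Toy example: 4 "species" with an invented covariance matrix V
# (two pairs of closely related species), not real phylogenetic data.
V = np.array([[1.0, 0.5, 0.0, 0.0],
              [0.5, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.7],
              [0.0, 0.0, 0.7, 1.0]])
X = np.column_stack([np.ones(4), np.array([0.1, 0.4, 1.2, 1.5])])
y = np.array([0.3, 0.2, 1.1, 1.9])

beta_hat = gls_fit(X, y, V)
# Squared Mahalanobis distance between y and the GLS fitted values;
# this is the quantity that GLS minimizes over beta.
d_min = mahalanobis_sq(y, X @ beta_hat, V)
```

Whitening X and Y with the Cholesky factor of V and running ordinary
least squares should give the same coefficients, which is another way of
seeing the same identity.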

So my first question is: can we directly apply the Mahalanobis distance
to measure a kind of "phylogeny-corrected" distance between 2 vectors of
trait values for a list of species? Since we assume Brownian motion,
we know these vectors should be drawn from a multivariate normal
distribution with known covariance matrix. The Mahalanobis distance
therefore seems perfectly appropriate to me. Is that the case?

I don't want to do a statistical test per se; I am rather interested in
ranking many traits according to their distance to a reference pattern.
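
As a sketch of the computation I have in mind (not a claim that it is
statistically justified), the ranking could look like the following in
Python (NumPy). The function names, the toy covariance matrix V and the
trait values are all hypothetical. The differences to the reference are
whitened once with the Cholesky factor of V, so that many traits can be
handled in a single pass.

```python
import numpy as np

def mahalanobis(u, v, V):
    """Mahalanobis distance between vectors u and v given covariance V."""
    d = u - v
    return float(np.sqrt(d @ np.linalg.solve(V, d)))

def rank_traits(traits, reference, V):
    """Rank traits by Mahalanobis distance to a reference pattern.

    traits:    array of shape (n_traits, n_species)
    reference: array of shape (n_species,)
    V:         (n_species, n_species) covariance matrix, assumed known
    Returns (order, dist), with order[0] the index of the closest trait.
    """
    L = np.linalg.cholesky(V)          # V = L L'
    D = (traits - reference).T         # (n_species, n_traits)
    Z = np.linalg.solve(L, D)          # whitened differences L^{-1} d
    dist = np.sqrt((Z ** 2).sum(axis=0))
    return np.argsort(dist), dist

# Toy example: 3 species, invented covariance and trait values.
V = np.array([[1.0, 0.6, 0.1],
              [0.6, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
reference = np.array([1.0, 1.2, 0.1])
traits = np.array([[0.2, 1.9, 0.8],    # far from the reference
                   [1.0, 1.2, 0.1],    # identical to the reference
                   [0.9, 1.0, 0.3]])   # close to the reference

order, dist = rank_traits(traits, reference, V)
```

The cost is one Cholesky factorization of V plus one triangular-style
solve per trait, which is why a distance-based ranking looks attractive
to me for a very large number of traits.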

My second, subsidiary question is: can I apply this Mahalanobis distance
if my traits are binary (e.g. presence-absence of some sequence in the
genomes)? In that case I know that my trait is not multivariate normal,
but considering that I have millions of traits, shouldn't I expect the
whole set to have some normal characteristics?

I know that there is Pagel's 1994 method for binary traits; however,
it seemed to me that a distance-based method would be faster and would
allow me to rank my candidates.

Finally, if none of the approaches above is justified, is there a
multivariate phylogenetic method for discrete/binary traits? Some kind
of adapted phylogenetic PCA?

Thanks a lot for your help,

Guillaume


_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
