Dear Andreas,

There is no closed-form distance formula for the HKY or GTR models. For GTR, Rodríguez et al. (1990) developed a procedure to calculate a distance (it is also described in Yang's 2006 book). An example is given below with the woodmouse data:

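library(ape)  # provides woodmouse, base.freq() and Ftab()

## matrix logarithm via eigendecomposition
matlog <- function(x) {
    tmp <- eigen(x)  # use the argument 'x', not a global 'X'
    V <- tmp$vectors
    U <- diag(log(tmp$values))
    V %*% U %*% solve(V)
}

## trace of a square matrix
tr <- function(x) sum(diag(x))

data(woodmouse)

PI <- diag(base.freq(woodmouse[1:2, ]))  # diagonal matrix of base frequencies
Ft <- Ftab(woodmouse[1:2, ])             # 4 x 4 table of nucleotide pair counts
F <- Ft/sum(Ft)                          # counts -> proportions
X <- solve(PI) %*% F
-tr(PI %*% matlog(X))                    # distance for sequences 1 and 2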

You have to call this code for each pair of sequences (or wrap it in a function).
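
A rough sketch of such a wrapper follows. The names gtr_pair() and gtr_dist() are made up for illustration and are not part of ape. Two things here are my own assumptions rather than part of the code above: F is averaged with its transpose so that the eigenvalues are real under a reversible model (drop that line to reproduce the code above exactly), and NA is returned when an eigenvalue is complex or not positive, since the logarithm is then undefined.

```r
library(ape)

## matrix logarithm via eigendecomposition; NULL if it is undefined
matlog <- function(x) {
    e <- eigen(x)
    if (is.complex(e$values) || any(e$values <= 0)) return(NULL)
    V <- e$vectors
    V %*% diag(log(e$values)) %*% solve(V)
}

## hypothetical helper: distance for one pair (a 2-row DNAbin matrix)
gtr_pair <- function(x) {
    p <- base.freq(x)
    if (any(p == 0)) return(NA_real_)  # PI would not be invertible
    Ft <- Ftab(x)
    Ft <- (Ft + t(Ft))/2               # assumption: symmetrise the counts
    F <- Ft/sum(Ft)
    L <- matlog(diag(1/p) %*% F)
    if (is.null(L)) return(NA_real_)
    -sum(diag(diag(p) %*% L))
}

## hypothetical helper: all pairwise distances as a "dist" object
gtr_dist <- function(x) {
    x <- as.matrix(x)
    n <- nrow(x)
    d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
    for (i in seq_len(n - 1)) {
        for (j in (i + 1):n) {
            d[i, j] <- d[j, i] <- gtr_pair(x[c(i, j), ])
        }
    }
    as.dist(d)
}

data(woodmouse)
d <- gtr_dist(woodmouse[1:3, ])
```

The result can be passed to anything that accepts a "dist" object, e.g. hclust(d). For many large files the double loop will be slow, so treat this as a starting point only.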

For HKY, Yang mentions a procedure by Rzhetsky & Nei (1994, J Mol Evol).

Best,

Emmanuel

Rodríguez F., Oliver J. L., Marín A. & Medina J. R. 1990. The general stochastic model of nucleotide substitution. Journal of Theoretical Biology 142: 485–501.


On 01/02/2017 at 23:52, kolte...@rub.de wrote:
Dear Phylothusiasts,
I need to compare multiple substitution models side-by-side (species
clustering stuff by distances only). Unfortunately, I am not aware of an
implementation of HKY and GTR distance computations using R. Maybe there
is some Github code or something else that I have been missing?
I do not want to build a phylogenetic tree. I cannot use PAUP (>500 big
FASTA files).
Any ideas are greatly appreciated.
Best wishes,
Andreas

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at
http://www.mail-archive.com/r-sig-phylo@r-project.org/





