Dear Andreas,
There is no closed-form distance formula for the HKY or GTR models. For
GTR, Rodríguez et al. (1990) developed a procedure to calculate a
distance (also described in Yang's 2006 book). An example is given below
with the woodmouse data:
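library(ape)

## matrix logarithm via eigendecomposition
matlog <- function(x) {
  tmp <- eigen(x)  # use the argument 'x', not the global 'X'
  V <- tmp$vectors
  U <- diag(log(tmp$values))
  V %*% U %*% solve(V)
}
## trace of a square matrix
tr <- function(x) sum(diag(x))

data(woodmouse)
PI <- diag(base.freq(woodmouse[1:2, ]))  # base frequencies on the diagonal
Ft <- Ftab(woodmouse[1:2, ])             # 4 x 4 table of site-pattern counts
F <- Ft/sum(Ft)                          # relative frequencies
X <- solve(PI) %*% F
-tr(PI %*% matlog(X))                    # GTR distance for this pair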
You have to call this code for each pair of sequences (or wrap it in a
function).
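
The wrapping mentioned above could look like the sketch below. The names
gtr_pair() and gtr_dist() are made up for this illustration and are not
functions from ape; only base.freq(), Ftab() and the woodmouse data come
from the package. The sketch also averages the count table with its
transpose, an assumption added here so that the eigenvalues stay real;
drop that line to reproduce the computation above exactly. It will fail
if a base is absent from a pair (PI becomes singular).

```r
library(ape)

## matrix logarithm via eigendecomposition (same idea as above)
matlog <- function(x) {
  e <- eigen(x)
  e$vectors %*% diag(log(e$values)) %*% solve(e$vectors)
}

## hypothetical helper: GTR distance for a two-row DNAbin matrix
gtr_pair <- function(pair) {
  PI <- diag(base.freq(pair))
  Ft <- Ftab(pair)
  Ft <- (Ft + t(Ft)) / 2        # assumption: symmetrise the count table
  Fh <- Ft / sum(Ft)            # relative frequencies
  X <- solve(PI) %*% Fh
  -sum(diag(PI %*% matlog(X)))  # minus the trace of PI %*% log(X)
}

## hypothetical helper: all pairwise distances, returned as a 'dist' object
gtr_dist <- function(x) {
  n <- nrow(x)
  d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      d[i, j] <- d[j, i] <- gtr_pair(x[c(i, j), ])
    }
  }
  as.dist(d)
}

data(woodmouse)
d <- gtr_dist(woodmouse)
```

The result is an ordinary 'dist' object, so it can be passed to hclust()
or compared with the output of dist.dna() for other models.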
For HKY, Yang mentions a procedure by Rzhetsky & Nei (1994, J Mol Evol).
Best,
Emmanuel
Rodríguez F., Oliver J. L., Marín A. & Medina J. R. 1990. The general
stochastic model of nucleotide substitution. Journal of Theoretical
Biology 142: 485–501.
On 01/02/2017 at 23:52, kolte...@rub.de wrote:
Dear Phylothusiasts,
I need to compare multiple substitution models side-by-side (species
clustering stuff by distances only). Unfortunately, I am not aware of an
implementation of HKY and GTR distance computations using R. Maybe there
is some Github code or something else that I have been missing?
I do not want to build a phylogenetic tree, and I cannot use PAUP (more
than 500 large FASTA files).
Any ideas are greatly appreciated.
Best wishes,
Andreas
_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at
http://www.mail-archive.com/r-sig-phylo@r-project.org/