Dear Andreas,

There is no distance formula for the HKY or GTR models. For GTR, Rodríguez et al. (1990) developed a procedure to calculate a distance (it is also described in Yang's 2006 book). An example is given below with the woodmouse data:

## matrix logarithm via eigendecomposition
matlog <- function(x) {
    tmp <- eigen(x)
    V <- tmp$vectors
    U <- diag(log(tmp$values))
    V %*% U %*% solve(V)
}

## trace of a matrix
tr <- function(x) sum(diag(x))


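library(ape)
data(woodmouse)

## distance between the first two sequences
PI <- diag(base.freq(woodmouse[1:2, ]))   # diagonal matrix of base frequencies
Ft <- Ftab(woodmouse[1:2, ])              # 4 x 4 table of base-pair counts
F <- Ft/sum(Ft)                           # counts converted to proportions
X <- solve(PI) %*% F
-tr(PI %*% matlog(X))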

You have to call this code for each pair of sequences (or wrap it in a function).
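
A minimal sketch of such a wrapper could look like the code below. The names gtr_pair and gtr_dist are made up for illustration (they are not ape functions), and the input is assumed to be an aligned DNAbin matrix with at least two sequences. The call to Re() is an addition to the code above: eigen() may return complex values when X is not symmetric, and only the real part is kept here.

```r
library(ape)

## matrix logarithm via eigendecomposition
matlog <- function(x) {
    tmp <- eigen(x)
    V <- tmp$vectors
    V %*% diag(log(tmp$values)) %*% solve(V)
}

## trace of a matrix
tr <- function(x) sum(diag(x))

## distance for one pair of sequences (hypothetical helper);
## 'pair' is a DNAbin matrix with two rows
gtr_pair <- function(pair) {
    PI <- diag(base.freq(pair))
    Ft <- Ftab(pair)
    F <- Ft/sum(Ft)
    X <- solve(PI) %*% F
    ## Re() is an assumption: drop a possible imaginary part
    Re(-tr(PI %*% matlog(X)))
}

## all pairwise distances (hypothetical wrapper), returned as a "dist" object
gtr_dist <- function(x) {
    n <- nrow(x)
    d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
    for (i in seq_len(n - 1)) {
        for (j in (i + 1):n) {
            d[i, j] <- d[j, i] <- gtr_pair(x[c(i, j), ])
        }
    }
    as.dist(d)
}

data(woodmouse)
d <- gtr_dist(woodmouse[1:4, ])
```

The result is an object of class "dist", so it can be passed directly to clustering functions such as hclust(). The double loop is simple rather than fast; with many long alignments it may be worth running the files in parallel.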

For HKY, Yang mentions a procedure by Rzhetsky & Nei (1994, J Mol Evol).



Rodríguez F., Oliver J. L., Marín A. & Medina J. R. 1990. The general stochastic model of nucleotide substitution. Journal of Theoretical Biology 142: 485–501.

On 01/02/2017 at 23:52, wrote:
Dear Phylothusiasts,
I need to compare multiple substitution models side-by-side (species
clustering stuff by distances only). Unfortunately, I am not aware of an
implementation of HKY and GTR distance computations using R. Maybe there
is some Github code or something else that I have been missing?
I do not want to build a phylogenetic tree. I can not use PAUP (>500 big
fasta files).
Any ideas are greatly appreciated.
Best wishes,

R-sig-phylo mailing list -
Searchable archive at