Hi Martin,

Rates are generally relative quantities in phylogenetics. You cannot estimate 
the absolute substitution rate(s) together with the branch lengths of the tree, 
but rates can be estimated relative to each other. That's why one rate is fixed 
to 1 in the GTR model.
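
To illustrate why rate and branch length cannot be separated, here is a minimal 
sketch using the closed-form JC69 transition probability. The function name 
jc69_p_same and the values of mu and t are only illustrative assumptions; mu is 
taken to be the overall substitution rate per site.

```r
# JC69: probability that a site shows the same state at both ends of a branch,
# given an overall substitution rate mu and a branch duration t.
jc69_p_same <- function(mu, t) 1/4 + 3/4 * exp(-4/3 * mu * t)

# Doubling the rate while halving the branch length changes nothing,
# because only the product mu * t enters the formula.
p1 <- jc69_p_same(mu = 1, t = 0.2)
p2 <- jc69_p_same(mu = 2, t = 0.1)
all.equal(p1, p2)
```

Since the likelihood is built from such probabilities, the data can only inform 
the product of rate and time, not each of them separately.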

In the JC69 model the (single) rate cannot be estimated, unless you fix the 
branch lengths, which can be done in phangorn::optim.pml with the options 
optRate=TRUE, optEdge=FALSE (if you set both options to TRUE, you'll get a 
warning). If you do this, the output will include an element '...$rate' with 
the estimated substitution rate.
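
A possible way to try this, as a sketch only: it assumes the Laurasiatherian 
data shipped with phangorn and uses an NJ tree on ML distances as the tree 
whose branch lengths are kept fixed. The object names (dm, tree, fit, fit_rate) 
are arbitrary.

```r
library(phangorn)

data(Laurasiatherian)

# A tree with branch lengths that will be held fixed (NJ on JC69 ML distances).
dm   <- dist.ml(Laurasiatherian, model = "JC69")
tree <- NJ(dm)

# With its default settings pml() uses equal base frequencies and equal
# exchangeabilities, which corresponds to JC69.
fit <- pml(tree, data = Laurasiatherian)

# Optimise the rate only; the branch lengths are not touched.
fit_rate <- optim.pml(fit, optRate = TRUE, optEdge = FALSE,
                      control = pml.control(trace = 0))

fit_rate$rate  # estimated rate, relative to the fixed branch lengths
```

The estimate is only meaningful relative to the branch lengths supplied, so a 
different starting tree would give a different rate.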

The rate matrix can have elements larger than 1; it is its row sums that are 
equal to 0, once each diagonal element is set to minus the sum of the other 
elements in its row. Generally in phylogenetics, there is no time information, 
so the branch lengths are interpreted as expected numbers of substitutions per 
site along each branch. You can still compute the matrix exponential (e^(Q*t)), 
but the resulting probabilities cannot be interpreted easily.
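
For teaching purposes, the steps from the printed matrix to e^(Q*t) could be 
sketched in base R as follows. This is not phangorn's own code. It assumes the 
usual GTR parameterisation q_ij = r_ij * pi_j, equal base frequencies (the 
fitted frequencies are not given in the message quoted below), and a scaling 
of Q to one expected substitution per unit of branch length. The helper names 
make_Q and trans_prob are made up for this example.

```r
# Exchangeabilities as printed in the quoted message
# (symmetric, zero diagonal, g<->t fixed to 1).
nuc <- c("a", "c", "g", "t")
R <- matrix(c( 0.0000000,  3.0009884, 11.8735854,  2.6088310,
               3.0009884,  0.0000000,  0.5162325, 21.7718125,
              11.8735854,  0.5162325,  0.0000000,  1.0000000,
               2.6088310, 21.7718125,  1.0000000,  0.0000000),
            nrow = 4, byrow = TRUE, dimnames = list(nuc, nuc))

# ASSUMPTION: equal base frequencies, for illustration only.
bf <- c(a = 0.25, c = 0.25, g = 0.25, t = 0.25)

# Build Q with q_ij = r_ij * pi_j, rows summing to 0, scaled so that the
# expected number of substitutions per unit of branch length is 1.
make_Q <- function(R, bf) {
  Q <- R * rep(bf, each = nrow(R))  # multiply column j by bf[j]
  diag(Q) <- -rowSums(Q)            # diagonal of R is 0, so rows now sum to 0
  Q / sum(-diag(Q) * bf)
}

# P(len) = exp(Q * len). Q is time-reversible, hence similar to a symmetric
# matrix, so a symmetric eigendecomposition can be used.
trans_prob <- function(Q, bf, len) {
  s <- sqrt(bf)
  S <- Q * outer(s, 1 / s)          # S_ij = s_i * Q_ij / s_j
  S <- (S + t(S)) / 2               # remove rounding asymmetry
  e <- eigen(S, symmetric = TRUE)
  E <- e$vectors %*% diag(exp(e$values * len)) %*% t(e$vectors)
  E * outer(1 / s, s)               # back-transform: P_ij = E_ij * s_j / s_i
}

Q <- make_Q(R, bf)
P <- trans_prob(Q, bf, len = 0.1)   # branch of 0.1 expected substitutions per site
round(P, 4)
rowSums(P)                          # each row should sum to 1 (up to rounding)
```

With this scaling, the "t" in e^(Q*t) is a branch length in expected 
substitutions per site rather than a time.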

Best,

Emmanuel

----- On 26 Nov 21, at 18:04, Martin Fikáček mfika...@gmail.com wrote:

> Hi everybody,
> 
> I am now trying to explain the principles of phylogenetics to the students
> using R and ran into a very simple problem that I cannot solve. Probably a
> very simple and basic thing, sorry for a stupid question:
> 
> When checking the details of models selected for my data by modelTest() in
> phangorn, the rate matrix always includes numbers around 1 or even much
> higher (for example, this is the matrix for the Laurasiatherian data with the
> GTR+I+G model):
> 
> Rate matrix:
>          a          c          g         t
> a  0.000000  3.0009884 11.8735854  2.608831
> c  3.000988  0.0000000  0.5162325 21.771813
> g 11.873585  0.5162325  0.0000000  1.000000
> t  2.608831 21.7718125  1.0000000  0.000000
> 
> For some simple models it gives just 0 or 1 as for example this for JC:
> 
> Rate matrix:
>   a c g t
> a 0 1 1 1
> c 1 0 1 1
> g 1 1 0 1
> t 1 1 1 0
> 
> I would normally expect the rate matrix to have values lower than 1, and to
> sum up to 0. Then it would make sense to use it also for calculating the
> probability matrix using e^(Q*t). I wanted to illustrate the meaning of the
> rate matrix estimated for real data to the students in this way, which is
> why I realized that the output by phangorn is different and I fail to find
> out why.
> 
> Thanks for any hint!
> 
> Martin
> 
> --
> *Martin Fikáček (費卡契) MSc. PhD.*
> *Department of Biological Sciences*
> *National Sun Yat-sen University*
> *No. 70, Lienhai Rd., Kaohsiung 80424, Taiwan*
> *E-mail: *mfika...@gmail.com, mfika...@mail.nsysu.edu.tw
> *Phone: *(+886) 75252000 # 3622
> *Website: *www.cercyon.eu
> 
> _______________________________________________
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
