Re: [R-sig-phylo] Problem with "ace" in package "ape"
Hi Miguel, You are not using ace() correctly. If 'model' is a matrix it must be an "index" matrix, not a rate matrix. This is explained in the help page. In your case, you could leave the default model="ER". Note that this seems similar to fitting a JC69 model, so you may consider using phangorn as an alternative. Best, Emmanuel -Original Message- From: Miguel Lacerda Sender: r-sig-phylo-boun...@r-project.org Date: Fri, 14 Sep 2012 17:26:10 To: Subject: [R-sig-phylo] Problem with "ace" in package "ape" Hi there, I would like to extract the conditional likelihoods of a phylogenetic model of discrete character evolution at each of the ancestral nodes in a tree. I have attempted to do this with the function "ace" in the R package "ape". However, I am getting the following error message: library(phytools) set.seed(100) mytree <- rtree(50) Q <- matrix(c(-3,1,1,1, 1,-3,1,1, 1,1,-3,1 ,1,1,1,-3),4,4)/100 rownames(Q) <- colnames(Q) <- c("A","C","G","T") mymap <- sim.history(mytree, Q) tip.states <- mymap$states ace(factor(tip.states, levels=c("A","C","G","T")), mytree, type="discrete", model=Q) Error in nlminb(rep(ip, length.out = np), function(p) dev(p), lower = rep(0, : 'd' must be a nonempty numeric vector The error seems to occur whenever the number of substitution events is small. If I do not divide the Q matrix by 100 in the example above (so that more character changes are simulated along the phylogeny and hence more diversity is observed at the tips), ace works fine. If all the tip states are identical, ace crashes with the error message above. Is there any other function available to extract the conditional likelihoods at each ancestral node? Thanks, Miguel [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Ouch v.2.8-2 on the same tree gives different results (maybe)
Dear Elena, If I understand you correctly, you are trying to fit a unique regime to each branch of the tree? This is not advisable, as there are more branches than there are more branches than there are terminal taxa (data points). So you are trying to fit many more parameters than you have data points. In a hansen model, there are parameters for the strength of selection (alpha) the magnitude of drift (sigma) and each optima (4 for a 3 taxon tree). That is 6 parameters for three data points. It is not at all surprising that you get non convergence or different results, as the problem is not identifiable. We recommend using the minimum number of selective regimes to describe the major features of the data. Generally there are only a few regimes for many species, with many branches of the tree in the same regime. Also you should avoid tinkering with the inner workings of S4 objects. You should use them via their accessors as given in the help pages. Please try the examples at the bottom of the included datasets: require(ouch) data(bimac) ?bimac data(anolis.ssd) ?anolis.ssd Marguerite On Sep 14, 2012, at 5:51 AM, Elena Grassi wrote: > Hello, > > I'm a newbie in phylogenetic analysis so excuse me in advance for > incorrect terms and so on (or if my question is too silly). > I've recently started to work on a project for which I need to use > Ornstein-Uhlenbeck models and I decided to use R and ouch since > the documentation was great. > For the first experiments I have been working on some toy trees but > I'm getting puzzling results: the same tree, simply > defined with different orders of the string which I use to represent > it in the Newick format, when analyzed with the Hansen > functions gives different results. > I have tried different combinations and in some cases a process > converges while the other one does not, in other > cases what change are the chosen optimal sigma (and thus the > loglikelihoods and so on). This happens > when I use a regime with different levels for every node in the tree > but not when I use a single regime. > > I'm pasting here an example of this phenomenon. It could be a stupid > error on my side but I'm not able to spot it right now. > > Thanks, > Elena Grassi > " >> library("ouch") >> library("ape") >> t1 <- read.tree(text="(E:10,(D:5,C:5):5);") >> t2 <- read.tree(text="((C:5,D:5):5,E:10);") >> out1 <- ape2ouch(t1) >> out2 <- ape2ouch(t2) >> regime1 <- out@nodes >> regime1 <- out1@nodes >> regime2 <- rep("global", length(outree@nodes)) >> regime2 <- rep("global", length(out2@nodes)) >> expr1 <- c(NA, NA, 3, 5, 10) >> expr2 <- c(NA, NA, 10, 5, 3) >> models_data1 <- data.frame(expr1, regime1, regime2, row.names=out1@nodes) >> models_data2 <- data.frame(expr2, regime1, regime2, row.names=out2@nodes) >> m1 <- hansen(models_data1["expr1"], out1, models_data1["regime1"], 1, 1) >> m2 <- hansen(models_data2["expr2"], out2, models_data2["regime1"], 1, 1) > unsuccessful convergence, code 10, see documentation for `optim' > Warning message: > In hansen(models_data2["expr2"], out2, models_data2["regime1"], : > unsuccessful convergence >> m1 > > call: > hansen(data = models_data1["expr1"], tree = out1, regimes = > models_data1["regime1"], >sqrt.alpha = 1, sigma = 1) > nodes ancestors times labels regime1 expr1 > 1 1 0.0 1NA > 2 2 1 0.5 2NA > 3 3 2 1.0 C 3 3 > 4 4 2 1.0 D 4 5 > 5 5 1 1.0 E 510 > > alpha: > [,1] > [1,] 3.734194 > > sigma squared: > [,1] > [1,] 9.233622e-30 > > theta: > $expr1 > 1 2 3 4 5 > 0.5049171 1.3917440 3.3191037 5.6847686 10.2324134 > > loglik deviance aic aic.c sic dof > 100.5415 -201.0831 -187.0831 -209.4831 -193.39287. > >> m22 <- hansen(models_data2["expr2"], out2, models_data2["regime2"], 1, 1) >> m11 <- hansen(models_data1["expr1"], out1, models_data1["regime2"], 1, 1) >> m11@loglik > [1] -7.47607 >> m22@loglik > [1] -7.47607 > >> sessionInfo() > R version 2.15.1 (2012-06-22) > Platform: i486-pc-linux-gnu (32-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > LC_TIME=en_US.UTF-8 > [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 > LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=C LC_NAME=C > LC_ADDRESS=C > [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 > LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] ape_3.0-5 ouch_2.8-2subplex_1.1-3 > > loaded via a namespace (and not attached): > [1] gee_4.13-18 grid_2.15.1 lattice_0.20-10 nlme_3.1-104 > tools_2.15.1 > " > > -- > www.biocut.unito.it > > ___ > R-sig-phylo mailing list > R-sig-phylo@r-project.org > https://stat.ethz.
[R-sig-phylo] Problem with "ace" in package "ape"
Hi there, I would like to extract the conditional likelihoods of a phylogenetic model of discrete character evolution at each of the ancestral nodes in a tree. I have attempted to do this with the function "ace" in the R package "ape". However, I am getting the following error message: library(phytools) set.seed(100) mytree <- rtree(50) Q <- matrix(c(-3,1,1,1, 1,-3,1,1, 1,1,-3,1 ,1,1,1,-3),4,4)/100 rownames(Q) <- colnames(Q) <- c("A","C","G","T") mymap <- sim.history(mytree, Q) tip.states <- mymap$states ace(factor(tip.states, levels=c("A","C","G","T")), mytree, type="discrete", model=Q) Error in nlminb(rep(ip, length.out = np), function(p) dev(p), lower = rep(0, : 'd' must be a nonempty numeric vector The error seems to occur whenever the number of substitution events is small. If I do not divide the Q matrix by 100 in the example above (so that more character changes are simulated along the phylogeny and hence more diversity is observed at the tips), ace works fine. If all the tip states are identical, ace crashes with the error message above. Is there any other function available to extract the conditional likelihoods at each ancestral node? Thanks, Miguel [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
[R-sig-phylo] Ouch v.2.8-2 on the same tree gives different results (maybe)
Hello, I'm a newbie in phylogenetic analysis so excuse me in advance for incorrect terms and so on (or if my question is too silly). I've recently started to work on a project for which I need to use Ornstein-Uhlenbeck models and I decided to use R and ouch since the documentation was great. For the first experiments I have been working on some toy trees but I'm getting puzzling results: the same tree, simply defined with different orders of the string which I use to represent it in the Newick format, when analyzed with the Hansen functions gives different results. I have tried different combinations and in some cases a process converges while the other one does not, in other cases what change are the chosen optimal sigma (and thus the loglikelihoods and so on). This happens when I use a regime with different levels for every node in the tree but not when I use a single regime. I'm pasting here an example of this phenomenon. It could be a stupid error on my side but I'm not able to spot it right now. Thanks, Elena Grassi " > library("ouch") > library("ape") > t1 <- read.tree(text="(E:10,(D:5,C:5):5);") > t2 <- read.tree(text="((C:5,D:5):5,E:10);") > out1 <- ape2ouch(t1) > out2 <- ape2ouch(t2) > regime1 <- out@nodes > regime1 <- out1@nodes > regime2 <- rep("global", length(outree@nodes)) > regime2 <- rep("global", length(out2@nodes)) > expr1 <- c(NA, NA, 3, 5, 10) > expr2 <- c(NA, NA, 10, 5, 3) > models_data1 <- data.frame(expr1, regime1, regime2, row.names=out1@nodes) > models_data2 <- data.frame(expr2, regime1, regime2, row.names=out2@nodes) > m1 <- hansen(models_data1["expr1"], out1, models_data1["regime1"], 1, 1) > m2 <- hansen(models_data2["expr2"], out2, models_data2["regime1"], 1, 1) unsuccessful convergence, code 10, see documentation for `optim' Warning message: In hansen(models_data2["expr2"], out2, models_data2["regime1"], : unsuccessful convergence > m1 call: hansen(data = models_data1["expr1"], tree = out1, regimes = models_data1["regime1"], sqrt.alpha = 1, sigma = 1) nodes ancestors times labels regime1 expr1 1 1 0.0 1NA 2 2 1 0.5 2NA 3 3 2 1.0 C 3 3 4 4 2 1.0 D 4 5 5 5 1 1.0 E 510 alpha: [,1] [1,] 3.734194 sigma squared: [,1] [1,] 9.233622e-30 theta: $expr1 1 2 3 4 5 0.5049171 1.3917440 3.3191037 5.6847686 10.2324134 loglik deviance aic aic.c sic dof 100.5415 -201.0831 -187.0831 -209.4831 -193.39287. > m22 <- hansen(models_data2["expr2"], out2, models_data2["regime2"], 1, 1) > m11 <- hansen(models_data1["expr1"], out1, models_data1["regime2"], 1, 1) > m11@loglik [1] -7.47607 > m22@loglik [1] -7.47607 > sessionInfo() R version 2.15.1 (2012-06-22) Platform: i486-pc-linux-gnu (32-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] ape_3.0-5 ouch_2.8-2subplex_1.1-3 loaded via a namespace (and not attached): [1] gee_4.13-18 grid_2.15.1 lattice_0.20-10 nlme_3.1-104 tools_2.15.1 " -- www.biocut.unito.it ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Problem with ace/anc.ML
Hi Zachary. This is probably because (when alpha is large) during optimization of sig^2, the values in the matrix are so close to the limits of numerical precision that the among species VCV matrix becomes singular. Here's another option that should give the same estimates and may work even when alpha is quite big. It takes advantage of the fact that the root node state computed during Felsenstein's PIC algorithm is also the MLE for that node. For fully bifurcating phylogeny, tree, & data in x: fit<-fitContinuous(tree,x,model="OU")$Trait1 ou<-vector(); N<-length(tree$tip); M<-tree$Nnode outree<-ouTree(tree,fit$alpha) for(i in 1:M+N){ ou[i-N]<-ace(x,multi2di(root(tree,node=i)),method="pic")$ace[1] names(ou)[i-N]<-i } Try that. For low alpha or the BM model (alpha=0) this should give you the same estimates (to a fairly high degree of numerical precision - remember, the MLEs are obtained numerically, so are inexact) of ace(...,"ML"), but when likelihood can't be optimized due to numerical precision, this may still work. Note that, for a reason not totally clear to me, multi2di(root(...)) & root(...,resolve.root=TRUE) don't produce the same rooted phylogenies. The former, not the latter, should be used in this case. - Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://phytools.blogspot.com On 9/13/2012 4:55 PM, Zachary Chejanovski wrote: Hi, I am attempting to estimate the ancestral states in a continuous character under a single-optimum OU model. Following your latest post, I used geiger::fitContinuous(model="OU") and transformed the branch lengths using ouTree(…) Unfortunately on putting these into ace I get an error message: Error in nlm(function(p) dev.BM(p), p = c(1, rep(mean(x), nb.node)), hessian = TRUE) : non-finite value supplied by 'nlm' And when I try using anc.ML I get this error message: Error in solve.default(C) : system is computationally singular: reciprocal condition number = 1.75196e-148 It works with my original data but for some reason the transformation of the branch lengths is giving odd values. The alpha that fitContinuous gives me is 3.831327. Here are the first few of the original branch lengths: 1] 10.82 26.39 18.01 18.01 14.40 And here are those values transformed using ouTree: [1] 2.468450e-149 1.638125e-61 1.408962e-01 1.408962e-01 2.057990e-101 Thanks in advance, Zac [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo