Re: [R-sig-phylo] Question on ace ML reconstruction of discrete binary character
Dear Liam,

Many thanks for your message and the clarification - that was indeed not clear to me from the ace help page! Would ancRECON in package corHMM also be OK by any chance for my purposes? I see that it also allows one to specify the prior for the root, and I am dealing with binary characters with symmetric transition rates...

Cheers, and thanks again for the advice!
Tom

-----Original Message-----
From: Liam J. Revell [mailto:liam.rev...@umb.edu]
Sent: 30 July 2013 05:31
To: Tom Wenseleers
Cc: r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] Question on ace ML reconstruction of discrete binary character

Hi Tom.

This was the subject of discussion recently on this list. ace does not do marginal ancestral state reconstruction (which is probably what you want) - it computes the conditional scaled likelihoods of the subtree. These are the same as the marginal reconstructions only at the root node. If your transition matrix is symmetric, then you can get the marginal reconstructions by rerooting at all the internal nodes. This is in the phytools function rerootingMethod (http://www.phytools.org/static.help/rerootingMethod.html). If you want to use a more complicated model, you will have to use another package - such as diversitree.

An alternative is to use stochastic mapping and then compute the posterior frequencies from the sample of stochastic maps. This makes it easy to put an explicit prior on the root and to integrate over uncertainty in the transition matrix. This is implemented in phytools also (http://www.phytools.org/static.help/make.simmap.html).

All the best, Liam

Liam J. Revell, Assistant Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 7/29/2013 5:45 PM, Tom Wenseleers wrote:

Dear all,

@Arne: yes, I think it has something to do with the priors for the root.
I'm not sure what prior ace uses - I think equal, which in my case would not be so appropriate given that nearly all species have the trait. Would anyone know by any chance whether in ape it is possible to have ace use a prior for the root which would reflect the frequency at the tips, and if so, how one could specify this?

Cheers,
Tom

From: Arne Mooers [mailto:amoo...@sfu.ca]
Sent: 29 July 2013 20:10
To: Tom Wenseleers
Subject: Question on ace ML reconstruction of discrete binary character

Hoi Tom,

What is the default prior on the root in ace? Different approaches use different priors (= observed frequency at tips, equal, equal to tested ratio of q's, etc.). That has had a big effect on reconstructions I have done in the past.

Cheers,
Arne Mooers

Begin forwarded message:

From: Tom Wenseleers tom.wensele...@bio.kuleuven.be
Date: 29 July, 2013 9:00:28 AM PDT
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] Question on ace ML reconstruction of discrete binary character

Dear all,

I just did some ancestral state reconstructions of binary characters (screenshot attached) using ace (using an equal-rates discrete character reconstruction). Everything seems to make sense to me, except the two basal nodes, where I end up with quite low likelihoods for my red character being 1 (cf. the pie charts), even though I get higher likelihoods at practically all of the shallower nodes in the tree. Any ideas why one can get a result like this, and what I could potentially do about it, since it doesn't seem quite right to me?

Cheers,
Tom

Prof. Tom Wenseleers
Lab. of Socioecology and Social Evolution
Dept. of Biology
Zoological Institute
K.U.Leuven
Naamsestraat 59, box 2466
B-3000 Leuven
Belgium
+32 (0)16 32 39 64 / +32 (0)472 40 45 96
tom.wensele...@bio.kuleuven.be
http://bio.kuleuven.be/ento/wenseleers/twenseleers.htm

Dr. Arne Mooers
Biology, Simon Fraser University
University Drive., Burnaby BC V5A 1S6 Canada
amoo...@sfu.ca
+1 778 782 3979
skype: arnemooers
http://www.sfu.ca/~amooers
http://www.sfu.ca/fabstar
http://www.scientists-4-species.org
hesp.irmacs.sfu.ca
7billionandyou.org
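To make Liam's distinction between conditional scaled likelihoods and marginal reconstructions concrete, here is a minimal base-R sketch. It is not the code of ace or phytools; the three-tip tree ((t1,t2)n,t3), its branch lengths, the tip states and the rate q are all made up for illustration, and a symmetric two-state model with a flat root prior is assumed. The marginal probabilities are obtained by brute-force summation over the states of the two internal nodes.

```r
# Transition probabilities for a symmetric two-state model with rate q
# (standard closed form); rows = start state, columns = end state.
P <- function(t, q) {
  same <- 0.5 + 0.5 * exp(-2 * q * t)
  matrix(c(same, 1 - same, 1 - same, same), nrow = 2)
}

q  <- 1                      # hypothetical transition rate
Pa <- P(0.5, q)              # branches root->n, n->t1, n->t2 (length 0.5)
Pb <- P(1.0, q)              # branch root->t3 (length 1.0)
t1 <- 2; t2 <- 2; t3 <- 1    # tip states as indices: 1 = state 0, 2 = state 1

# Conditional scaled likelihood at n: only the subtree (t1, t2) is used.
cond_n <- Pa[, t1] * Pa[, t2]
cond_n <- cond_n / sum(cond_n)

# Joint probability of the data and each (root state r, node state n),
# with a flat prior of 0.5 on the root state.
joint <- matrix(0, 2, 2)
for (r in 1:2) for (n in 1:2) {
  joint[r, n] <- 0.5 * Pa[r, n] * Pb[r, t3] * Pa[n, t1] * Pa[n, t2]
}
marg_n    <- colSums(joint) / sum(joint)   # marginal at n (also uses t3)
marg_root <- rowSums(joint) / sum(joint)   # marginal at the root

# Conditional scaled likelihood at the root via the pruning recursion.
cond_root <- as.vector(Pa %*% (Pa[, t1] * Pa[, t2])) * Pb[, t3]
cond_root <- cond_root / sum(cond_root)

print(rbind(cond_n, marg_n, cond_root, marg_root))
```

In this toy case the two quantities agree at the root but differ at n, because the marginal at n also takes the tip outside its subtree (t3) into account.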
Re: [R-sig-phylo] Question on ace ML reconstruction of discrete binary character
Dear all,

Many thanks for all your advice so far. I have now moved to using rayDISC in package corHMM to obtain marginal maximum likelihood ancestral state reconstructions, using the method of Maddison et al (2007) and FitzJohn et al (2009) to fix the prior probabilities at the root (setting it to the observed frequency at the tips doesn't change much). The code I have is

    library(ape)
    library(corHMM)

    # Read the tree and the trait table (species names in column 1, trait in column 2)
    tree = read.tree("http://www.kuleuven.be/bio/ento/temp/tree.tre")
    data = read.csv(file = "http://www.kuleuven.be/bio/ento/temp/data.csv")
    rownames(data) = data[, 1]

    # Equal-rates model, marginal reconstruction, Maddison/FitzJohn root weighting
    ASR = rayDISC(tree, data, ntraits = 1, charnum = 1, model = "ER",
                  node.states = "marginal", root.p = "maddfitz")

    # Plot the tree with pie charts at the nodes and observed states at the tips
    plot(tree, cex = 0.6, show.tip.label = TRUE, ljoin = 2, lend = 2, label.offset = 0.02)
    nodelabels(pie = ASR$states, piecol = c("white", "red"), cex = 0.45)
    tiplabels(pch = 22, bg = ifelse(data[tree$tip.label, ][, 2], "red", "white"),
              col = "black", adj = c(0.51, 0.5), cex = 0.6)

I still get unusually low marginal ML values for the trait being 1 at the basal nodes though (ca. 0.7, which is very low considering that 89% of my species have the trait). Would anyone be able to offer advice on why the reconstructed root ML value could be so much lower than the observed frequency of the trait at the tips, and what could be a solution to obtaining a more realistic ML reconstruction? (I also tried diversitree and phangorn, but they all give similar results.)

Cheers,
Tom

-----Original Message-----
From: Jack Viljoen [mailto:javilj...@gmail.com]
Sent: 30 July 2013 10:23
To: Tom Wenseleers
Subject: Re: [R-sig-phylo] Question on ace ML reconstruction of discrete binary character

Hello, Tom.

I was just wondering if the higher uncertainty at the basal nodes isn't expected, particularly given the long branches descended from them? Since this is an ML estimate and not a Bayesian one, surely the concept of priors does not apply?
My understanding is that ace() actually only estimates the root node and that other methods are required to properly estimate the states at the other nodes. I'm basing this on these posts from Liam Revell earlier this year:

http://blog.phytools.org/2013/03/conditional-scaled-likelihoods-in-ace.html
http://blog.phytools.org/2013/03/a-little-more-on-ancestral-state.html

I hope those links shed some light on the matter, or that someone who knows about this stuff has responded to you off-list as well.

Good luck,
Jack

--
Message: 1
Date: Mon, 29 Jul 2013 16:00:28 +
From: Tom Wenseleers tom.wensele...@bio.kuleuven.be
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] Question on ace ML reconstruction of discrete binary character

A non-text attachment was scrubbed...
Name: ace ML reconstruction.jpg
Type: image/jpeg
Size: 196149 bytes
Desc: ace ML reconstruction.jpg
URL: https://stat.ethz.ch/pipermail/r-sig-phylo/attachments/20130729/609e4f89/attachment-0001.jpg
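Since the thread keeps returning to the choice of root prior (equal, tip frequencies, or the Maddison/FitzJohn weighting that Tom selects with root.p), a small base-R sketch may help show what each choice does to the same root likelihoods. This is not corHMM's code. The two conditional likelihoods are made-up numbers, and the "maddfitz" line reflects my reading of FitzJohn et al. (2009), where each root state is weighted by its own likelihood; treat that as an assumption.

```r
# Hypothetical conditional likelihoods of the data given root state 0 or 1
# (illustrative numbers, not taken from Tom's tree).
L <- c(state0 = 0.002, state1 = 0.005)

# Posterior probability of each root state for a given root prior.
root_posterior <- function(L, prior) {
  w <- L * prior
  w / sum(w)
}

res <- rbind(
  flat     = root_posterior(L, c(0.5, 0.5)),    # equal prior
  tipfreq  = root_posterior(L, c(0.11, 0.89)),  # 89% of tips in state 1
  maddfitz = root_posterior(L, L / sum(L))      # prior proportional to likelihood
)
print(round(res, 3))
```

How much the prior matters in a real analysis depends on how different the two likelihoods are, so these numbers say nothing about the size of the effect in Tom's data.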
Re: [R-sig-phylo] Question on ace ML reconstruction of discrete binary character
Hi Tom.

There is no reason to expect that the marginal ancestral state reconstructions at the root node (empirical Bayesian posterior probabilities) should match your tip frequencies or prior probabilities. Imagine the following scenario: you have one diverse clade comprising 50% of extant taxa that diverged recently from a common ancestor and all share state B, whereas state A is found in all the other tips of the tree, some of which are in clades originating near the root. We would not expect posterior probabilities at the root node to match the empirical frequencies of our states at the tips (50:50). In fact, we might expect that our reconstructed state at the root of the tree would be strongly A.

In your specific case, state 0 is found in three clades that originate nearer to the root, whereas more nested clades are exclusively in state 1. This is why - in spite of its relative rarity across the tips of the tree - there is still some reasonable (PP~0.3) posterior probability under the model that the root is in state 0. This is not an error that needs to be corrected - it is just what your data, model, and tree tell us about the ancestral node of the phylogeny.

All the best, Liam

Liam J. Revell, Assistant Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 7/30/2013 6:10 AM, Tom Wenseleers wrote:

Dear all,

Many thanks for all your advice so far. I have now moved to using rayDISC in package corHMM to obtain marginal maximum likelihood ancestral state reconstructions, using the method of Maddison et al (2007) and FitzJohn et al (2009) to fix the prior probabilities at the root (setting it to the observed frequency at the tips doesn't change much).
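Liam's 50:50 scenario can be checked on a toy tree with a few lines of base R. The sketch below is only an illustration under stated assumptions: the eight-tip tree, its branch lengths and the fixed rate q = 0.5 are hypothetical, the model is symmetric with two states, the root prior is flat, and the rate is not estimated by ML as it would be in a real analysis. The helper names tip, node and cond_lik are mine. The recursion is the usual pruning algorithm.

```r
# Transition probabilities for a symmetric two-state model with rate q.
P <- function(t, q) {
  same <- 0.5 + 0.5 * exp(-2 * q * t)
  matrix(c(same, 1 - same, 1 - same, same), nrow = 2)
}

# Hypothetical tree encoding: nested lists with the branch length above each node.
tip  <- function(state, len) list(state = state, len = len)   # state: 1 = A, 2 = B
node <- function(len, ...) list(len = len, children = list(...))

# Pruning recursion: likelihood of the tips below nd for each state at nd.
cond_lik <- function(nd, q) {
  if (!is.null(nd$state)) {            # tip: indicator vector for its state
    v <- c(0, 0)
    v[nd$state] <- 1
    return(v)
  }
  out <- c(1, 1)
  for (child in nd$children) {
    out <- out * as.vector(P(child$len, q) %*% cond_lik(child, q))
  }
  out
}

# Four B tips in one recent clade; four A tips branching off near the root.
cladeB <- node(0.3, tip(2, 0.1), tip(2, 0.1), tip(2, 0.1), tip(2, 0.1))
n3     <- node(0.2, tip(1, 0.4), cladeB)
n2     <- node(0.2, tip(1, 0.6), n3)
n1     <- node(0.2, tip(1, 0.8), n2)
root   <- node(0,   tip(1, 1.0), n1)

L    <- cond_lik(root, q = 0.5)
post <- L / sum(L)                     # flat root prior
names(post) <- c("A", "B")
print(round(post, 3))
```

With these made-up values the root comes out favouring A even though the tips are split 50:50, which is the pattern Liam describes.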
Re: [R-sig-phylo] Question on ace ML reconstruction of discrete binary character
Oops. Sorry, the citation is Schluter, Price, Mooers, Ludwig 1997. Likelihood of ancestor states in adaptive radiation. Evolution 51:1699-1711. This issue has been known for a long time.

On Jul 30, 2013, at 6:51 AM, Marguerite Butler mbutler...@gmail.com wrote:

Hi Tom,

One thing to keep in mind is the information content of the data relative to what you are trying to infer. Basically, you have data only at the tips, but are trying to infer the state of the root deep in the tree. There is therefore very little information being brought to bear on this problem. In this case, whatever answer you get will very strongly reflect the assumptions of the model that you apply and the structure of the tree. Put another way, if you were to construct error bars around this character state estimate, you would see that they are huge (See Moers et al. 1997 in Evolution).

It sounds like you are expecting a linear parsimony reconstruction. Why not just use that? Your character does not change very much on the tree. This is basically what your ML answer is telling you anyway: more than 50% chance of red at the base.

Marguerite

On Jul 30, 2013, at 4:38 AM, Liam J. Revell liam.rev...@umb.edu wrote:

Hi Tom.

There is no reason to expect that the marginal ancestral state reconstructions at the root node (empirical Bayesian posterior probabilities) should match your tip frequencies or prior probabilities. Imagine the following scenario: you have one diverse clade comprising 50% of extant taxa that diverged recently from a common ancestor and all share state B, whereas state A is found in all the other tips of the tree, some of which are in clades originating near the root. We would not expect posterior probabilities at the root node to match the empirical frequencies of our states at the tips (50:50). In fact, we might expect that our reconstructed state at the root of the tree would be strongly A.
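Marguerite's point about how little information the tips carry about a deep root can be illustrated with the closed-form transition probability of a symmetric two-state model. The sketch below assumes the simplest possible case: a single tip observed in state 1, a flat prior on the ancestor, and a hypothetical rate q = 1. Under those assumptions the posterior probability that the ancestor was also in state 1 is just the probability of ending in the same state after time t.

```r
# Probability of being in the same state after time t under a symmetric
# two-state Markov model with rate q (standard closed form).
p_same <- function(t, q) 0.5 + 0.5 * exp(-2 * q * t)

q      <- 1                        # hypothetical transition rate
depths <- c(0.1, 0.5, 1, 2, 5)     # hypothetical ancestor-to-tip path lengths

# Posterior that the ancestor shares the tip's state, flat prior, one tip.
post <- p_same(depths, q)
print(data.frame(depth = depths, posterior = round(post, 3)))
```

The posterior slides towards 0.5 (no information) as the path from the ancestor to the tip gets longer, which is why reconstructions at deep nodes lean so heavily on the model and the tree shape.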