Hi all, here is a small function which it is very similar to David's 2nd script, but it is a bit more compact: library(phangorn) # requires the newest pangorn version >= 1.99-8 addConfidences.phylo <- function(to, from){ conf = attr(addConfidences(as.splits(to), from), "confidences") nTips = length(to$tip.label) to$node.label = conf[-c(1:nTips)] to } treeA <- addConfidences.phylo(treeA, treeB)
I will fix also my midpoint rooting function. And some comments on possible problems / features of the 2nd script: The function assumes that the ordering of the tip labels is identical with the effect "t3" "t4" is different to "t4" "t3" A lapply(splitsA, sort) and lapply(splitsB, sort) should do the trick. Now the feature: let assume there a 5 taxa "t1" "t2" "t3" "t4" "t5" A clade containing "t1" "t2" will be different from "t3" "t4" "t5". If you want just to reroot a tree, you probably do not want to distinguish clades but only splits. Regards, Klaus On Wed, Oct 8, 2014 at 10:51 AM, David Bapst <dwba...@gmail.com> wrote: > Hello all, > > Unable to get phyloch's read.mrbayes() to read my .con.tre files (for > reasons unknown), I've gone with Graham's suggestion to use the simple > output format from sumt. However, as I have a large number of analyses > which took considerable computational time (and I am stubborn to learn > how to run these on a cluster) I decided the more economical thing to > do was to write an R script that took the existing nexus files and > wrote a nexus file with MrBayes blocks that would output 'simple' > format consensus trees. I realize such string manipulation might be > easier in some other language, but I know R best. > > Here it is, in case anyone else ever needs such a thing: > > _________________________________________________ > > #change to dir for MB files > setwd("C:/dave/workspace/mrbayes") > > #get files > files<-list.files() > #get tree files > treeFiles<-files[grep(x=files,pattern="run..t")] > #get the nexus files > inputs<-sort(unique(sub(".run..t","",treeFiles))) > inputs<-sort(unique(sub(".tree.","",inputs))) > > #now start constructing sumt lines > bayesBlock<-"#NEXUS" > for(i in 1:length(inputs)){ > > origNex<-scan(inputs[i],what="character",sep="\n",blank.lines.skip=FALSE) > cropNex<-origNex[2:grep(origNex,pattern="mcmcp")] > #remove log line > cropNex<-cropNex[-grep(cropNex,pattern="log start")] > sumtLine<-paste0("sumt filename=",inputs[i]," outputname=", > paste0(inputs[i],".simple")," conformat=simple;") > bayesBlock<-c(bayesBlock,cropNex,sumtLine,"","end;","") > } > > write(bayesBlock,file="simple_sumt.nex") > > _______________________________________________ > > Some caveats: this script assumes you already have mrbayes blocks in > your nexus files (all of which are stored with all output in the same > directory), including a line where you set MCMC parameters (mcmcp) > just prior to calling the MCMC itself and after any other > modifications to the dataset, model or prior settings. > > If you execute the nexus file created, it should run sumt and output > new output files with 'simple' in the file name. Analyses where there > are multiple trees (such as partitioned analyses with unlinked branch > lengths) will have multiple consensus trees estimated for them, as > expected if we had originally done this immediately after running > mcmc. Also note I crop out any lines that create new log files, > because I didn't want to write over my old log files. > > The resulting simple-format consensus tree files can be read into ape > with read.nexus just fine. For whatever reason, these nexus files > contain two trees, otherwise identical other than that one contains > the posterior probabilities as its node-labels. > > ....However! There is a big however here. > > If we re-root the tree in ape (and there are many reasons why we might > want to do so), the posterior probabilities, which are stored as node > labels, no longer make sense as those node labels are not properties > technically of the nodes but of the edges preceding those nodes. > Change the directionality of the graph by rerooting and those > posterior probabilities end up in the wrong place. (TreeFig apparently > figures this out on its own, automatically, if your reroot in > TreeFig.) > > I haven't found any option in an R package that slides node-labels > around as if they were support values when the tree is rerooted (and > if there is one, please let me know!) so I had to write my own thing > for doing that to the newly created 'simple' consensus tree files... > > So here's another script. > > _____________________________________________ > > #change dir > setwd("C:/dave/workspace/mrbayes") > > library(ape) > > files<-list.files() > #get consensus nexus files (ugh regexp) > conFiles<-files[grep(x=files,pattern=".con.tre")] > #get only 'simple' files > conFiles<-conFiles[grep(x=conFiles,pattern="simple")] > conName<-sub(".nex.simple","",conFiles) > conName<-sub(".con.tre","",conName) > > trees<-list() > for(i in 1:length(conFiles)){ > treesRead<-read.nexus(conFiles[i]) > treeA<-treesRead[[2]] > treeB<-treesRead[[1]] > #now we're going to reroot and let's use fake species names hahah > > treeA<-root(treeA,resolve.root=TRUE,outgroup=c("OutOfTheGroup","OutThere", > "outgroupy","groupout")) > #okay basically need to figure out > #which taxa are included by a node > #and match between treeB and tree A > splitsB<-lapply(prop.part(treeB),function(x) > if(any(x==1)){treeB$tip.label[x] > }else{treeB$tip.label[-x]}) > splitsA<-lapply(prop.part(treeA),function(x) > if(any(x==1)){treeA$tip.label[x] > }else{treeA$tip.label[-x]}) > #plot(treeB,show.node.label=TRUE) > treeA$node.label<-treeB$node.label[match(splitsA,splitsB)] > trees[[i]]<-treeA > } > class(trees)<-"multiPhylo" > > #plot(trees) > > pdf("consensus_trees_10-07-14.pdf",height=8,width=8) > for(i in 1:length(trees)){ > plot(trees[[i]],main=conName[i],show.node.label=TRUE) > add.scale.bar() > } > dev.off() > > _____________________________________________________ > > The last bit just plots it in a pdf, which isn't important at all but > I included it for the heck of it. The important stuff is the innards > of the first for loop, which reads in the 'simple' consensus tree > nexus file, reroots it and then rearranges the posterior probabilities > to match the same splits across both trees. I've checked this against > the read out logged by MrBayes and the result of rerooted trees in > FigTree, and the resulting posterior probabilities match those shown > across all three programs. > > Anyway, I hope this helps the next person trying to figure out how to > plot posterior probabilites from MrBayes in R! > > Cheers, > -Dave > > > > On Fri, Oct 3, 2014 at 10:29 AM, David Bapst <dwba...@gmail.com> wrote: > > Just to clarify, off-list discussion with Graham reveals (after some > > confusion on my part) that if the simple format option is used in sumt > > in MrBayes, then ape's read.nexus will natively read the posterior > > probability values as node-labels. Which is interesting and I had not > > come across that information previously. > > > > Graham's comment is not in reference to the functioning of > read.annotated.nexus. > > > > Cheers, > > -Dave > > > > > > On Fri, Oct 3, 2014 at 8:49 AM, Slater, Graham <slat...@si.edu> wrote: > >> Dave, > >> > >> I've had the same trouble with shuffling. However, all of this can be > >> avoided if you specify the simple format for your .con file in the > mrBayes > >> block. > >> > >> sumt conformat = simple; > >> > >> The resulting tree will correctly display posterior probabilities in a > >> phyloformat tree. > >> > >> Graham > >> > >> > >> > >> ------------------------------------------------------------ > >> Graham Slater > >> Peter Buck Post-Doctoral Fellow > >> Department of Paleobiology > >> National Museum of Natural History > >> The Smithsonian Institution [NHB, MRC 121] > >> P.O. Box 37012 > >> > >> > >> (202) 633-1316 > >> slat...@si.edu > >> www.fourdimensionalbiology.com > >> > >> > >> > >> > >> > >> On Oct 3, 2014, at 10:42 AM, David Bapst <dwba...@gmail.com> wrote: > >> > >> Hello all, > >> > >> Recently, I wanted to display posterior probabilities on a 50% > >> compatibility tree from a MrBayes run, created with the 'sumt' > >> command. I looked around for ways to do this and found this email > >> thread from last year: > >> > >> https://stat.ethz.ch/pipermail/r-sig-phylo/2013-June/002825.html > >> > >> ...which suggests using read.annotated.nexus() in package epibase, > >> which has been renamed OutbreakTools more recently. > >> > >> Unfortunately, it appears read.annotated.nexus in OutbreakTools > >> (v0.1-11, in R 3.1.1) improperly creates the new edge matrix, > >> resulting in an apparent shuffling of tip labels. As this function was > >> discussed on R-Sig-Phylo, I am sending this bug report to both the > >> list and to the package maintainer (Thibaut J.). > >> > >> I don't have any BEAST output files, so I cannot test if this occurs > >> with files that are not created by MrBayes. > >> > >> To show this phenomen, I have created an example .con.tre file using > >> an example matrix of standard characters for several elephants that I > >> stole off of Joe F.'s website. You can find the .con.tre file here on > >> gdrive: > >> > >> > https://drive.google.com/file/d/0B_xvEcEvKno_LTFhSVJrdHMtNFU/view?usp=sharing > >> > >> And now in R, a comparison with a tree: > >> > >> library(OutbreakTools) > >> library(ape) > >> > >> tree1<-read.nexus("mat.nex.con.tre") > >> tree2<-read.annotated.nexus("mat.nex.con.tre") > >> > >> layout(1:2) > >> plot(tree1,no.margin=TRUE) > >> plot(tree2,no.margin=TRUE) > >> > >> identical(tree1$tip.label,tree2$tip.label) > >> identical(tree1$edge.length,tree2$edge.length) > >> identical(tree1$edge,tree2$edge) > >> > >> Inspection of the tree file with FigTree suggests that the ape > >> function is producing the correct tree and the OutbreakTools function > >> is not. We can see the why with last three lines in the console: > >> > >> identical(tree1$tip.label,tree2$tip.label) > >> > >> [1] TRUE > >> > >> identical(tree1$edge.length,tree2$edge.length) > >> > >> [1] TRUE > >> > >> identical(tree1$edge,tree2$edge) > >> > >> [1] FALSE > >> > >> Closer inspection suggests that the issue appears to be that some > >> terminal edges are inappropriately shuffled, thus shuffling the > >> terminal tip labels on those edges, while leaving the tip.label order > >> and the edge lengths the same. Also, larger trees seem to produce > >> larger inconsistencies in the tree produced. > >> > >> So... now for the list, are there any other methods on CRAN for > >> obtaining the posterior probabilities from a .con.tre file? Otherwise, > >> next I guess I'll be trying phyloch. > >> > >> Cheers, > >> -Dave > >> > >> -- > >> David W. Bapst, PhD > >> Adjunct Asst. Professor, Geology and Geol. Eng. > >> South Dakota School of Mines and Technology > >> 501 E. St. Joseph > >> Rapid City, SD 57701 > >> > >> http://webpages.sdsmt.edu/~dbapst/ > >> http://cran.r-project.org/web/packages/paleotree/index.html > >> > >> _______________________________________________ > >> R-sig-phylo mailing list - R-sig-phylo@r-project.org > >> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo > >> Searchable archive at > http://www.mail-archive.com/r-sig-phylo@r-project.org/ > >> > >> > > > > > > > > -- > > David W. Bapst, PhD > > Adjunct Asst. Professor, Geology and Geol. Eng. > > South Dakota School of Mines and Technology > > 501 E. St. Joseph > > Rapid City, SD 57701 > > > > http://webpages.sdsmt.edu/~dbapst/ > > http://cran.r-project.org/web/packages/paleotree/index.html > > > > -- > David W. Bapst, PhD > Adjunct Asst. Professor, Geology and Geol. Eng. > South Dakota School of Mines and Technology > 501 E. St. Joseph > Rapid City, SD 57701 > > http://webpages.sdsmt.edu/~dbapst/ > http://cran.r-project.org/web/packages/paleotree/index.html > > _______________________________________________ > R-sig-phylo mailing list - R-sig-phylo@r-project.org > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo > Searchable archive at > http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- Klaus Schliep Postdoctoral Fellow Revell Lab, University of Massachusetts Boston [[alternative HTML version deleted]] _______________________________________________ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/