Re: [R-sig-phylo] Extended Majority Consensus Topology (Allcompat Summary) in R? And some observations on ape's consensus() function
Hi David,

phangorn has the function consensusNet, which builds consensus networks (objects of class "networx"); these are more appropriate than ape::consensus if p < 0.5. There is a nice plotting function using RGL for "networx" objects. Of course, you won't get a fully bifurcating tree if there are incompatible splits in the tree sample. phangorn also has the function lento() to assess how many incompatible splits there are, so you can test whether a specific consensus tree is likely to have reticulations.

Let me also recall the other tools available in ape for analyzing splits, which could help you sort them out:

prop.part(TR) returns the list of all splits observed in the list of trees TR and their (absolute) frequencies.

bitsplits is an alternative to prop.part which can be faster depending on the situation (this one works only with unrooted trees).

prop.clades(tr, TR) returns the frequencies of the splits present in the tree tr observed in TR (e.g., to get support values if TR is a list of bootstrap trees). It has an option to say whether the trees should be considered rooted or not.

countBipartitions(tr, TR) is similar to the previous one but works with the "bitsplits" class.

Cheers,

Emmanuel

On 28/10/2018 at 00:37, David Bapst wrote:

Hi all,

I was wondering whether anyone is familiar with R code that can estimate an extended majority consensus tree (referred to as an 'allcompat' tree by the sumt command in MrBayes). This is a fully bifurcating summary of a tree posterior, where each clade is maximally resolved by the split that is most abundant in the considered post-burn-in posterior (i.e., the split that has the plurality, if not the majority: a higher posterior probability than any competing, conflicting split recovered within the posterior). So, I guess one could also call these plurality consensus trees.
The ape function `consensus` seemed promising at first, as it takes a `p` argument which at 1 returns the strict consensus (the default), and at 0.5 returns the majority rule consensus (effectively the same as the 'halfcompat' option in MrBayes). So I wondered what happens if I set `p` below 0.5: you could imagine that the extended majority consensus is basically a similar threshold algorithm, but accepting splits of any frequency of occurrence in the tree set, so effectively p~0.

Unfortunately, that is not how it works out, as `consensus` simply assembles all splits with a frequency above the `p` value, but doesn't discard conflicting splits. This means you can theoretically get more resolved consensus trees below 0.5, but in practical terms your ability to recover reasonable tree objects lasts only until `p` drops to the point where you begin to accept conflicting splits.

Here's some code based on a phangorn example, where I use `consensus` to get a more resolved tree as I move to lower `p` values. You can see I get a reasonable (if you think the extended majority rule is reasonable), more resolved tree at `p = 0.45`, but at `p = 0.2` conflicting splits are accepted, such that the output no longer has a rational tree structure.

```
library(ape)
library(phangorn)
data(Laurasiatherian)
set.seed(42)
bs <- bootstrap.phyDat(Laurasiatherian,
                       FUN = function(x) upgma(dist.hamming(x)),
                       bs = 100)

tA <- consensus(bs, p = 1)
tB <- consensus(bs, p = 0.5)
tC <- consensus(bs, p = 0.45)
tD <- consensus(bs, p = 0.2)

layout(matrix(1:4, 2, 2))
plot(tA); plot(tB); plot(tC); plot(tD)
# Error in plot.phylo(tD) :
#   tree badly conformed; cannot plot. Check the edge matrix.
str(tD)
# List of 3
#  $ edge     : int [1:109, 1:2] 48 49 50 51 52 53 54 55 56 57 ...
#  $ tip.label: chr [1:47] "GraySeal" "Vole" "Wallaroo" "Loris" ...
#  $ Nnode    : int 63
#  - attr(*, "class")= chr "phylo"
# Warning messages:
# 1: display list redraw incomplete
# 2: display list redraw incomplete
# 3: display list redraw incomplete
# 4: display list redraw incomplete
```

I'd be interested to know if anyone knows of an alternative way to do this in R, or if perhaps I need to figure out how to modify `consensus` to reject conflicting splits.

Cheers,
-Dave B

PS: Yes, I know there are real issues with such exhaustive consensus trees, in particular that they will likely agglomerate a combination of splits that exists on no tree recovered within the posterior, but I have my reasons!

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
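The ape helpers listed in the reply above (prop.part and prop.clades) can be tried on a toy tree set. This is a minimal sketch, assuming ape is installed; the three Newick trees and the object names `TR`, `tab` and `support` are invented for illustration and are not from the thread.

```r
library(ape)

# Three small rooted trees on the same five tips (toy data).
TR <- read.tree(text = c(
  "(((a,b),c),(d,e));",
  "(((a,b),c),(d,e));",
  "(((a,c),b),(d,e));"
))

# prop.part() lists every clade seen in TR; the absolute counts and the
# tip labels are stored as the "number" and "labels" attributes.
pp   <- prop.part(TR)
labs <- attr(pp, "labels")
tab  <- data.frame(
  clade = vapply(unclass(pp),
                 function(ix) paste(sort(labs[ix]), collapse = ","),
                 character(1)),
  count = attr(pp, "number")
)
print(tab)

# prop.clades() counts, for each internal node of a reference tree,
# how many trees in TR contain the same clade.
support <- prop.clades(TR[[1]], TR, rooted = TRUE)
print(support)
```

The table from prop.part is the raw material for any threshold or greedy consensus: it tells you which clades compete and how often each occurs.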
Re: [R-sig-phylo] Extended Majority Consensus Topology (Allcompat Summary) in R? And some observations on ape's consensus() function
David Bapst asked:

> I was wondering whether anyone is familiar with R code that can estimate an
> extended majority consensus tree (referred to as an 'allcompat' tree by the
> sumt command in MrBayes). This is a fully bifurcating summary of a tree
> posterior, where each clade is maximally resolved by the split that is most
> abundant in the considered post-burn-in posterior (i.e., the split that
> has the plurality, if not the majority: a higher posterior probability
> than any competing, conflicting split recovered within the posterior).
> So, I guess one could also call these plurality consensus trees.

You can also get the Extended Majority Rule consensus tree from the Rconsense function in Liam Revell's package Rphylip, which calls programs from my PHYLIP package. Consense in PHYLIP outputs, in addition to the consensus tree itself, a list of partitions found in the input trees and the frequency of each. Rconsense may be able to do that too.

Speaking as the one who defined the EMR method, I need to give a warning. EMR makes a tree by ordering the partitions (splits) by their frequency. To make Margush and McMorris's Majority Rule consensus tree one simply goes down this list and takes all the partitions that occur more than 50% of the time. EMR continues further, in order to resolve the tree fully. It keeps accepting partitions as long as they don't conflict with anything already accepted.

But this means that two partitions could be found (say) 45% of the time each. They could both be compatible with the partitions in the majority-rule tree, but in conflict with each other. Which one gets included then depends only on which one is encountered first as one goes down the list of most-frequent partitions. And that will just be a matter of things like the order in which the trees containing them occur among the input trees. That is one of the limitations of the EMR method.
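The greedy procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not PHYLIP's Consense code: trees are taken as rooted, so each partition is written as a clade (a vector of tip labels), and the helper names `clades_compatible` and `greedy_emr` as well as the toy frequencies are hypothetical.

```r
# Two clades on rooted trees are compatible when they are disjoint
# or one is contained in the other.
clades_compatible <- function(a, b) {
  shared <- length(intersect(a, b))
  shared == 0 || shared == length(a) || shared == length(b)
}

# Greedy extended-majority-rule selection (hypothetical helper).
# clades: list of character vectors; freq: matching frequencies in [0, 1].
greedy_emr <- function(clades, freq) {
  ord <- order(freq, decreasing = TRUE)  # ties stay in input order
  accepted <- list()
  for (i in ord) {
    ok <- all(vapply(accepted, clades_compatible, logical(1),
                     b = clades[[i]]))
    if (ok) accepted[[length(accepted) + 1]] <- clades[[i]]
  }
  accepted
}

# Toy input: (a,b) and (b,c) each occur 45% of the time and conflict.
clades <- list(c("a", "b"), c("a", "b", "c"), c("b", "c"), c("d", "e"))
freq   <- c(0.45, 0.80, 0.45, 0.95)

res_ab_first <- greedy_emr(clades, freq)
res_bc_first <- greedy_emr(clades[c(3, 2, 1, 4)], freq[c(3, 2, 1, 4)])
print(res_ab_first)
print(res_bc_first)
```

The two calls differ only in the order of the tied 45% clades, yet one result contains (a,b) and the other (b,c), which is the order dependence the warning refers to.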
Note also that EMR is subtly different from finding the largest set of splits (partitions) that are all mutually compatible. That will often be the same, but not always. The latter is called the Nelson Consensus Tree.

Joe

Joe Felsenstein j...@gs.washington.edu
Department of Genome Sciences and Department of Biology,
University of Washington, Box 355065, Seattle, WA 98195-5065 USA
[R-sig-phylo] Extended Majority Consensus Topology (Allcompat Summary) in R? And some observations on ape's consensus() function
Hi all,

I was wondering whether anyone is familiar with R code that can estimate an extended majority consensus tree (referred to as an 'allcompat' tree by the sumt command in MrBayes). This is a fully bifurcating summary of a tree posterior, where each clade is maximally resolved by the split that is most abundant in the considered post-burn-in posterior (i.e., the split that has the plurality, if not the majority: a higher posterior probability than any competing, conflicting split recovered within the posterior). So, I guess one could also call these plurality consensus trees.

The ape function `consensus` seemed promising at first, as it takes a `p` argument which at 1 returns the strict consensus (the default), and at 0.5 returns the majority rule consensus (effectively the same as the 'halfcompat' option in MrBayes). So I wondered what happens if I set `p` below 0.5: you could imagine that the extended majority consensus is basically a similar threshold algorithm, but accepting splits of any frequency of occurrence in the tree set, so effectively p~0.

Unfortunately, that is not how it works out, as `consensus` simply assembles all splits with a frequency above the `p` value, but doesn't discard conflicting splits. This means you can theoretically get more resolved consensus trees below 0.5, but in practical terms your ability to recover reasonable tree objects lasts only until `p` drops to the point where you begin to accept conflicting splits.

Here's some code based on a phangorn example, where I use `consensus` to get a more resolved tree as I move to lower `p` values. You can see I get a reasonable (if you think the extended majority rule is reasonable), more resolved tree at `p = 0.45`, but at `p = 0.2` conflicting splits are accepted, such that the output no longer has a rational tree structure.
```
> library(ape)
> library(phangorn)
> data(Laurasiatherian)
> set.seed(42)
> bs <- bootstrap.phyDat(Laurasiatherian, FUN = function(x)upgma(dist.hamming(x)),
+     bs=100)
>
> tA <- consensus(bs,p=1)
> tB <- consensus(bs, p=0.5)
> tC <- consensus(bs, p=0.45)
> tD <- consensus(bs, p=0.2)
>
> layout(matrix(1:4,2,2))
> plot(tA);plot(tB);plot(tC);plot(tD)
Error in plot.phylo(tD) :
  tree badly conformed; cannot plot. Check the edge matrix.
> str(tD)
List of 3
 $ edge     : int [1:109, 1:2] 48 49 50 51 52 53 54 55 56 57 ...
 $ tip.label: chr [1:47] "GraySeal" "Vole" "Wallaroo" "Loris" ...
 $ Nnode    : int 63
 - attr(*, "class")= chr "phylo"
Warning messages:
1: display list redraw incomplete
2: display list redraw incomplete
3: display list redraw incomplete
4: display list redraw incomplete
```

I'd be interested to know if anyone knows of an alternative way to do this in R, or if perhaps I need to figure out how to modify `consensus` to reject conflicting splits.

Cheers,
-Dave B

PS: Yes, I know there are real issues with such exhaustive consensus trees, in particular that they will likely agglomerate a combination of splits that exists on no tree recovered within the posterior, but I have my reasons!

--
David W. Bapst, PhD
Asst Research Professor, Geology & Geophysics, Texas A & M University
Postdoc, Ecology & Evolutionary Biology, Univ of Tenn Knoxville
https://github.com/dwbapst/paleotree
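The threshold behaviour described in the message above has a simple explanation: two splits that each occur in more than half of the trees must co-occur in at least one tree, so they cannot conflict, whereas below 0.5 nothing guarantees this. The following base-R sketch illustrates that on a toy tree set. It is an assumption-laden illustration rather than ape's `consensus` code: trees are rooted and reduced to their non-trivial clades, a clade is kept when its frequency is strictly greater than `p`, and `conflicts_at` is a hypothetical helper.

```r
# Toy "posterior": each tree is given by its non-trivial clades.
trees <- list(
  list(c("a", "b"), c("a", "b", "c")),
  list(c("a", "b"), c("a", "b", "c")),
  list(c("b", "c"), c("a", "b", "c")),
  list(c("b", "c"), c("a", "b", "c")),
  list(c("a", "c"), c("a", "b", "c"))
)

# Frequency of every clade across the tree set.
key  <- function(cl) paste(sort(cl), collapse = ",")
keys <- unlist(lapply(trees, function(tr) vapply(tr, key, character(1))))
freq <- table(keys) / length(trees)

# Number of conflicting pairs among clades with frequency above p.
conflicts_at <- function(p) {
  kept <- names(freq)[as.vector(freq) > p]
  sets <- strsplit(kept, ",")
  n <- 0
  for (i in seq_along(sets)) {
    for (j in seq_len(i - 1)) {
      shared <- length(intersect(sets[[i]], sets[[j]]))
      # Conflict: the clades overlap but neither contains the other.
      if (shared > 0 && shared < length(sets[[i]]) &&
          shared < length(sets[[j]])) {
        n <- n + 1
      }
    }
  }
  n
}

print(freq)
print(sapply(c(0.5, 0.3, 0.1), conflicts_at))
```

A conflict-aware consensus would need to drop one member of each conflicting pair before assembling the tree, which a plain frequency threshold does not do.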