Re: [R-sig-phylo] Extended Majority Consensus Topology (Allcompat Summary) in R? And some observations on ape's consensus() function

2018-10-27 Thread Emmanuel Paradis

Hi David,

phangorn has the function consensusNet that builds consensus networks 
(objects of class "networx"), which is more appropriate than 
ape::consensus if p < 0.5. There is a nice plotting function using RGL 
for "networx" objects. Of course, you won't get a fully bifurcating tree 
if there are incompatible splits in the tree sample. phangorn has also 
the function lento() to assess how many incompatible splits there are so 
you can test whether a specific consensus tree is likely to have 
reticulations.
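
As a minimal sketch of the two phangorn functions mentioned above, reusing the bootstrap sample from the original post (the threshold of 0.2 is only an illustrative choice, and the 2D plot is used here so that rgl is not required):

```r
library(ape)
library(phangorn)

data(Laurasiatherian)
set.seed(42)
# Bootstrap trees, built the same way as in the original post
bs <- bootstrap.phyDat(Laurasiatherian,
                       FUN = function(x) upgma(dist.hamming(x)),
                       bs = 100)

# Consensus network from the splits whose frequency passes prob = 0.2
# (0.2 is an arbitrary value chosen for illustration)
cnet <- consensusNet(bs, prob = 0.2)
plot(cnet, type = "2D")   # type = "3D" would use the rgl package

# Lento plot: support and conflict for the splits found in the sample
lento(as.splits(bs))
```

If the lento plot shows substantial conflict among well-supported splits, a network is probably a more honest summary than any single tree.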


Let me also point to the other tools available in ape for analyzing 
splits, which could help you sort them out:


prop.part(TR) returns the list of all splits observed in the list of 
trees TR and their (absolute) frequencies.


bitsplits is an alternative to prop.part which can be faster depending 
on the situation (this one works only with unrooted trees).


prop.clades(tr, TR) returns the frequencies of the splits present in the 
tree tr observed in TR (e.g., to get support values if TR is a list of 
bootstrap trees). It has an option whether the trees should be 
considered rooted or not.


countBipartitions(tr, TR) is similar to the previous function but works 
with the "bitsplits" class.
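
A short sketch of these four functions used together. The random trees from rmtree() are only a stand-in for a real tree sample, and the object names are arbitrary:

```r
library(ape)

set.seed(1)
TR <- rmtree(20, 6, rooted = FALSE)   # 20 random unrooted trees on 6 tips
tr <- TR[[1]]                         # one tree whose splits we want to score

# All splits seen in TR; absolute frequencies are in the "number" attribute
pp <- prop.part(TR)
attr(pp, "number")

# The same kind of summary as a "bitsplits" object (unrooted trees only)
bsp <- bitsplits(TR)
bsp$freq

# How often each split of tr occurs in TR, treating the trees as unrooted
prop.clades(tr, TR, rooted = FALSE)

# The corresponding counts based on the "bitsplits" representation
countBipartitions(tr, TR)
```

Dividing any of these counts by length(TR) gives the relative frequencies that a consensus threshold would be compared against.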


Cheers,

Emmanuel

On 28/10/2018 at 00:37, David Bapst wrote:

Hi all,

I was interested if anyone was familiar with R code that can estimate an
extended majority consensus tree (referred to as an 'allcompat' tree by the
sumt command in MrBayes)? This is a fully bifurcating summary of a tree
posterior, where each clade is maximally resolved by the split that is most
abundant in the considered post-burn-in posterior (i.e., that split which
has the plurality, if not the majority - a higher posterior probability
than any competing, conflicting split recovered within the posterior.
So, I guess one could also call these plurality consensus trees...).

The ape function `consensus` seemed promising at first, as it takes a `p`
argument which at 1 returns the strict consensus (the default), and at 0.5
returns the majority rule consensus (effectively the same as the
'halfcompat' option in MrBayes). So, I thought, I wonder what happens if I
set `p` below 0.5 - you could imagine that the extended majority consensus
is basically a similar threshold algorithm, but accepting solutions
(splits) of any frequency of occurrence in the tree set, so effectively
p~0.

Unfortunately, that is not how that works out, as `consensus` simply
assembles all splits with a frequency above the `p` value, but doesn't
discard conflicting splits. This means you can theoretically get more
resolved consensus trees below 0.5, but in practical terms your ability to
recover reasonable tree objects lasts until the frequency drops to the
point that you begin to accept conflicting splits.

Here's some code based on a phangorn example where I use consensus to
get a more resolved tree as I delve into lower `p` values - you can see I
get a reasonable (if you think the extended majority rule is reasonable),
more resolved tree at `p = 0.45`, but at `p = 0.2` conflicting splits are
accepted, such that the output no longer has a valid tree structure.

```
> library(ape)
> library(phangorn)
> data(Laurasiatherian)
> set.seed(42)
> bs <- bootstrap.phyDat(Laurasiatherian,
+     FUN = function(x) upgma(dist.hamming(x)),
+     bs = 100)
>
> tA <- consensus(bs, p = 1)
> tB <- consensus(bs, p = 0.5)
> tC <- consensus(bs, p = 0.45)
> tD <- consensus(bs, p = 0.2)
>
> layout(matrix(1:4, 2, 2))
> plot(tA); plot(tB); plot(tC); plot(tD)
Error in plot.phylo(tD) :
  tree badly conformed; cannot plot. Check the edge matrix.
> str(tD)
List of 3
 $ edge : int [1:109, 1:2] 48 49 50 51 52 53 54 55 56 57 ...
 $ tip.label: chr [1:47] "GraySeal" "Vole" "Wallaroo" "Loris" ...
 $ Nnode: int 63
 - attr(*, "class")= chr "phylo"
Warning messages:
1: display list redraw incomplete
2: display list redraw incomplete
3: display list redraw incomplete
4: display list redraw incomplete
```
I'd be interested to know if anyone knows of an alternative way to do this
in R, or if perhaps I need to figure out how to modify `consensus` to
reject conflicting splits.

Cheers,
-Dave B

PS: Yes, I know there are real issues with such exhaustive consensus trees,
particularly that they will likely agglomerate a combination of splits that
exists on no tree recovered within the posterior, but I have my reasons!



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Extended Majority Consensus Topology (Allcompat Summary) in R? And some observations on ape's consensus() function

2018-10-27 Thread Joe Felsenstein
David Bapst asked:

>
> I was interested if anyone was familiar with R code that can estimate an
> extended majority consensus tree (referred to as an 'allcompat' tree by the
> sumt command in MrBayes)? This is a fully bifurcating summary of a tree
> posterior, where each clade is maximally resolved by the split that is most
> abundant in the considered post-burn-in posterior (i.e., that split which
> has the plurality, if not the majority - a higher posterior probability
> than any competing, conflicting split recovered within the posterior.
> So, I guess one could also call these plurality consensus trees...).

You can also get the Extended Majority Rule consensus tree from the
Rconsense function in Liam Revell's package Rphylip, which calls
programs from my PHYLIP package.  Consense in PHYLIP does output, in
addition to the consensus tree itself, a list of partitions found in
the input trees, and the frequency of each.  Rconsense may be able to
do that too.

Speaking as the one who defined the EMR method, I need to give a
warning:  EMR makes a tree by ordering the partitions (splits) in
order of their frequency.  To make Margush and McMorris's Majority
Rule consensus tree one simply goes down this list and takes all the
partitions that occur more than 50% of the time.  EMR continues
further, in order to resolve the tree fully.  It keeps accepting
partitions as long as they don't conflict with anything already
accepted.

But this means that two partitions could be found (say) 45% of the
time each.   They could both be compatible with the partitions in the
majority-rule tree, but in conflict with each other.  Which one gets
included then depends only on which one is encountered first as one
goes down the list of most-frequent partitions.  And that will just be
a matter of things like the order in which the tree containing them
occurs among the input trees.  That is one of the limitations of the
EMR method.
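
The procedure and the tie problem described above can be sketched in a few lines of base R.  This is only an illustration, not the code in Consense: the function names are made up, and for simplicity it treats the partitions as rooted clades (sets of tip labels), which are compatible when they are nested or disjoint.

```r
# Two clades are compatible if they are nested or disjoint
compatible <- function(a, b) {
  shared <- length(intersect(a, b))
  shared == 0 || shared == length(a) || shared == length(b)
}

# Greedy sketch: walk down the clades in order of decreasing frequency and
# keep each one that does not conflict with anything already accepted
greedy_consensus <- function(clades, freq) {
  ord <- order(freq, decreasing = TRUE)   # ties keep their input order
  accepted <- list()
  for (i in ord) {
    ok <- all(vapply(accepted, compatible, logical(1), b = clades[[i]]))
    if (ok) accepted[[length(accepted) + 1]] <- clades[[i]]
  }
  accepted
}

# Toy input: one majority clade and two conflicting clades at 45% each
clades <- list(c("A", "B", "C"), c("A", "B"), c("B", "C"))
freq   <- c(0.90, 0.45, 0.45)

# Same clades and frequencies, but the two tied clades listed in swapped order
res1 <- greedy_consensus(clades, freq)
res2 <- greedy_consensus(clades[c(1, 3, 2)], freq[c(1, 3, 2)])
```

In this toy case res1 keeps the clade (A, B) while res2 keeps (B, C), although the frequencies are identical: the outcome depends only on which of the tied clades comes first in the list.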

Note also that EMR is subtly different from finding the largest set of
splits (partitions) that are all mutually compatible.  That will often
be the same, but not always.  The latter is called the Nelson
Consensus Tree.

Joe

Joe Felsenstein j...@gs.washington.edu
 Department of Genome Sciences and Department of Biology,
 University of Washington, Box 355065, Seattle, WA 98195-5065 USA



[R-sig-phylo] Extended Majority Consensus Topology (Allcompat Summary) in R? And some observations on ape's consensus() function

2018-10-27 Thread David Bapst
Hi all,

I was interested if anyone was familiar with R code that can estimate an
extended majority consensus tree (referred to as an 'allcompat' tree by the
sumt command in MrBayes)? This is a fully bifurcating summary of a tree
posterior, where each clade is maximally resolved by the split that is most
abundant in the considered post-burn-in posterior (i.e., that split which
has the plurality, if not the majority - a higher posterior probability
than any competing, conflicting split recovered within the posterior.
So, I guess one could also call these plurality consensus trees...).

The ape function `consensus` seemed promising at first, as it takes a `p`
argument which at 1 returns the strict consensus (the default), and at 0.5
returns the majority rule consensus (effectively the same as the
'halfcompat' option in MrBayes). So, I thought, I wonder what happens if I
set `p` below 0.5 - you could imagine that the extended majority consensus
is basically a similar threshold algorithm, but accepting solutions
(splits) of any frequency of occurrence in the tree set, so effectively
p~0.

Unfortunately, that is not how that works out, as `consensus` simply
assembles all splits with a frequency above the `p` value, but doesn't
discard conflicting splits. This means you can theoretically get more
resolved consensus trees below 0.5, but in practical terms your ability to
recover reasonable tree objects lasts until the frequency drops to the
point that you begin to accept conflicting splits.
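
A rough way to see where that point lies is to check whether the clades at or above a given frequency contain a conflicting pair. This is only a sketch under some assumptions: the helper names are hypothetical, the trees are treated as rooted (so two clades conflict when they overlap without being nested), the `>= p` cutoff is my own choice rather than a statement about what `consensus` does internally, and random trees stand in for a real posterior sample.

```r
library(ape)

# TRUE if two clades (vectors of tip indices) overlap without being nested
conflicts <- function(a, b) {
  shared <- length(intersect(a, b))
  shared > 0 && shared < min(length(a), length(b))
}

# Does the set of clades with relative frequency >= p contain a conflict?
has_conflict <- function(trees, p) {
  pp <- prop.part(trees)
  keep <- unclass(pp)[attr(pp, "number") / length(trees) >= p]
  n <- length(keep)
  if (n < 2) return(FALSE)
  pairs <- combn(n, 2)   # all pairs of retained clades, one pair per column
  any(apply(pairs, 2, function(ij) conflicts(keep[[ij[1]]], keep[[ij[2]]])))
}

set.seed(1)
TR <- rmtree(50, 6)      # 50 random rooted trees on 6 tips (stand-in data)

has_conflict(TR, 0.51)   # above one half, clades cannot conflict
has_conflict(TR, 0)      # keeping every observed clade
```

Scanning `p` downward from 0.5 with `has_conflict` shows the lowest threshold at which the retained clades still fit on a single tree.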

Here's some code based on a phangorn example where I use consensus to
get a more resolved tree as I delve into lower `p` values - you can see I
get a reasonable (if you think the extended majority rule is reasonable),
more resolved tree at `p = 0.45`, but at `p = 0.2` conflicting splits are
accepted, such that the output no longer has a valid tree structure.

```
> library(ape)
> library(phangorn)
> data(Laurasiatherian)
> set.seed(42)
> bs <- bootstrap.phyDat(Laurasiatherian,
+     FUN = function(x) upgma(dist.hamming(x)),
+     bs = 100)
>
> tA <- consensus(bs,p=1)
> tB <- consensus(bs, p=0.5)
> tC <- consensus(bs, p=0.45)
> tD <- consensus(bs, p=0.2)
>
> layout(matrix(1:4,2,2))
> plot(tA);plot(tB);plot(tC);plot(tD)
Error in plot.phylo(tD) :
  tree badly conformed; cannot plot. Check the edge matrix.
> str(tD)
List of 3
 $ edge : int [1:109, 1:2] 48 49 50 51 52 53 54 55 56 57 ...
 $ tip.label: chr [1:47] "GraySeal" "Vole" "Wallaroo" "Loris" ...
 $ Nnode: int 63
 - attr(*, "class")= chr "phylo"
Warning messages:
1: display list redraw incomplete
2: display list redraw incomplete
3: display list redraw incomplete
4: display list redraw incomplete
```
I'd be interested to know if anyone knows of an alternative way to do this
in R, or if perhaps I need to figure out how to modify `consensus` to
reject conflicting splits.

Cheers,
-Dave B

PS: Yes, I know there are real issues with such exhaustive consensus trees,
particularly that they will likely agglomerate a combination of splits that
exists on no tree recovered within the posterior, but I have my reasons!

-- 
David W. Bapst, PhD
Asst Research Professor, Geology & Geophysics, Texas A & M University
Postdoc, Ecology & Evolutionary Biology, Univ of Tenn Knoxville
https://github.com/dwbapst/paleotree
