Re: [R-sig-phylo] seemingly conflicting output from BAMM

Krzysztof Kozak Tue, 19 Aug 2014 05:15:14 -0700

Dear Dan and Jonathan,

Thank you for the helpful comments. This does make a lot of sense - I
suppose I have spent too long trying to find a single shift!


Best,

__
Krzysztof "Chris" Kozak
PhD Candidate, Department of Zoology
University of Cambridge, CB2 3EJ
http://heliconius.zoo.cam.ac.uk/people/krzysztof-kozak/


On Fri, Aug 15, 2014 at 2:08 AM, Dan Rabosky <[email protected]> wrote:

>
> Hi Chris-
>
> Just to add to what Jonathan wrote.
>
> This is a good question. You have two basic issues that are being
> confounded: (1) how much evidence is there for a rate shift overall, versus
> (2) how much evidence do you have bearing on the locations of specific
> shifts. In your case, you have limited evidence for rate variation: Bayes
> factors of 3-5 versus a model with 0 shifts. That's rather weak evidence
> for rate variation, but it's (in my opinion) at least worth considering
> further. You can also see that this is not especially strong evidence from
> considering the posterior distribution of shifts, which gives a posterior
> probability of 0.18 to a model with 0 shifts (as an aside, Bayes factors
> are always going to be more reliable for these types of comparisons because
> they explicitly take into consideration whatever prior distribution you've
> specified on the number of shifts).
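
A minimal sketch of the point above in base R, using only the rounded numbers from the original post quoted further down (the summary(edata) posterior and the first column of the computeBayesFactors output). It relies on the general identity that a Bayes factor is the posterior odds divided by the prior odds; the variable names are just illustrative, and the "implied prior odds" are back-calculated from rounded values, so treat them as approximate.

```r
# Posterior probabilities for the number of shifts, copied from the
# summary(edata) output in the original post.
post <- c("0" = 0.1800, "1" = 0.4300, "2" = 0.2800, "3" = 0.0840,
          "4" = 0.0240, "5" = 0.0025, "7" = 0.0005)

# Posterior probability of at least one shift somewhere on the tree.
p_any_shift <- 1 - post[["0"]]

# Posterior odds of each model against the zero-shift model.
post_odds <- post / post[["0"]]

# Bayes factors against the zero-shift model, copied from the first
# column of the computeBayesFactors output in the original post.
bf_vs_0 <- c("0" = 1.0000000, "1" = 3.4958818, "2" = 4.3978396,
             "3" = 3.1975924, "4" = 2.6033098, "5" = 0.9162216,
             "7" = 2.7732786)

# BF = posterior odds / prior odds, so the prior odds implied by the
# two sets of numbers are (approximately, given the rounding):
implied_prior_odds <- post_odds / bf_vs_0

print(p_any_shift)
print(round(cbind(post_odds, bf_vs_0, implied_prior_odds), 3))
```

The implied prior odds for every model with shifts come out below 1, i.e. the prior itself favoured zero shifts, which is why the Bayes factors are larger than the raw posterior ratios.
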
>
> In your case, you have weak overall evidence for a shift somewhere in your
> data. However, you have even less evidence that a shift occurs at any
> particular location. This is what you see in your credible shift set: each
> shift configuration has a shift in a different location. The overall "best"
> shift configuration, i.e., the one with the maximum a posteriori (MAP)
> probability, does not have any "core" shifts. This is a bit confusing, but
> is explained at length here: http://bamm-project.org/rateshifts.html
>
> Basically, we only consider "significant" shifts when we enumerate the set
> of distinct shift configurations in the posterior, where "significant"
> means that a shift occurs on a given branch with substantially elevated
> frequency relative to what you'd expect under the prior (again, this is all
> explained on the documentation page). These are termed "core shifts" in
> BAMM terminology. In your case, the shift configuration with the highest
> posterior probability actually has no core shifts.
>
> However, you can decrease the threshold used to identify core shifts. You
> can lower the Bayes factor criterion used to identify shifts with elevated
> posterior probabilities relative to the prior expectation in the
> credibleShiftSet function by changing the default value for the BFcriterion
> argument (e.g., if you set BFcriterion = 1, you will probably see a shift
> in your MAP shift configuration).
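
A toy sketch of what lowering the criterion does, in base R. Everything here is hypothetical: the branch names and probabilities are made up rather than taken from the data in this thread, and the per-branch criterion is assumed to be posterior odds divided by prior odds, which may not match exactly what BAMMtools computes internally.

```r
# Hypothetical per-branch prior and posterior shift probabilities
# (illustrative values only, not from any real BAMM run).
prior_p <- c(b1 = 0.02, b2 = 0.05, b3 = 0.01, b4 = 0.10)
post_p  <- c(b1 = 0.06, b2 = 0.07, b3 = 0.04, b4 = 0.09)

odds <- function(p) p / (1 - p)

# Assumed form of the per-branch evidence: posterior odds over prior odds.
branch_bf <- odds(post_p) / odds(prior_p)

# Branches counted as "core" shifts for a given criterion.
core_at <- function(criterion) names(branch_bf)[branch_bf > criterion]

print(round(branch_bf, 2))
print(core_at(5))  # strict criterion: no branch qualifies
print(core_at(1))  # relaxed criterion: several branches qualify
```

With these made-up numbers no branch passes a criterion of 5, while three branches pass a criterion of 1, which is the kind of change described above.
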
>
> However, tweaking the BFcriterion argument won't change the fact that your
> dataset has only weak evidence overall for rate heterogeneity among clades.
> As you decrease the BFcriterion, you will find that the posterior
> probability of your MAP shift configuration will also drop. Jonathan makes
> a good point, especially relevant in this case, because the credible shift
> set here is telling you something important that you won't get out of a
> single point estimate: you don't have much confidence at all in any
> particular rate shift.
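
The same idea as a toy calculation in base R. The posterior samples below are invented for illustration (they are not the data from this thread), and config_freq is a hypothetical helper, not a BAMMtools function. It tallies distinct shift configurations after dropping shifts on non-core branches, which is assumed here to be roughly how distinct configurations are enumerated.

```r
# Hypothetical posterior samples: each element lists the branches that
# carry a shift in that sample (character(0) means no shift at all).
samples <- c(rep(list(character(0)), 3),
             rep(list("b1"), 3),
             rep(list("b2"), 2),
             rep(list("b3"), 2),
             rep(list("b4"), 2))

# Frequency of each distinct configuration, counting only core branches.
config_freq <- function(samples, core) {
  keys <- vapply(samples, function(s) {
    kept <- sort(intersect(s, core))
    if (length(kept) == 0) "none" else paste(kept, collapse = "+")
  }, character(1))
  tab <- table(keys)
  freq <- setNames(as.numeric(tab), names(tab)) / length(keys)
  sort(freq, decreasing = TRUE)
}

# No core branches: every sample collapses to the no-shift configuration.
strict <- config_freq(samples, core = character(0))

# Relaxed criterion: b1, b2 and b3 count as core, b4 still does not.
relaxed <- config_freq(samples, core = c("b1", "b2", "b3"))

print(strict)
print(round(relaxed, 3))
```

In this toy case the no-shift configuration remains the single most frequent one under the relaxed criterion, but its probability falls from 1 to 5/12, and the configurations with one shift together account for 7/12 of the samples. No individual shift location is well supported even though most samples contain a shift.
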
>
> ~Dan Rabosky
>
>
>
>
> On Aug 14, 2014, at 8:28 PM, Jonathan Chang wrote:
>
> Hi Krzysztof,
>
> It certainly can be true that the most credible shift configuration is
> one with no inferred rate shifts, even though the analysis as a whole
> prefers a model with one rate shift. The issue is mentioned in the BAMM
> documentation: <http://bamm-project.org/rateshifts.html>
>
> BAMM looks like it found evidence of a rate shift on your phylogeny.
> However, the exact location of that rate shift is not certain. In your
> plots, shift configuration #2 shows a rate increase on the upper
> clade, whereas #3 shows a rate decrease in the lower clade. #4 and #5
> tell similar stories. Note that the configurations with 1 rate shift
> (#2-#5) combined are seen more often than configurations with 0 rate
> shifts (#1).
>
> Personally I'm unclear on how useful the most credible shift
> configuration actually is. To me that throws away a lot of the power
> of BAMM by reducing its inference down to a point estimate.
>
> Jonathan
>
> On Thu, Aug 14, 2014 at 5:08 PM, Krzysztof Kozak <[email protected]> wrote:
>
> Dear All,
>
> I have been asked to analyse my chronogram using BAMM, and I like the idea.
> Sadly, I am puzzled by the output. I worked through the example and read the
> entire documentation, but still don't grasp why different analyses suggest
> different answers.
>
>
> 1. On one hand, several functions suggest that there are 1-2 rate shifts in
> my data.
>
> - Plotting netdiv rate shows it changing somewhat at two times.
> - plot.bammdata(edata) shows increased rate on the branch leading to a
>   disproportionately large clade
> - rescaling the branch lengths by the Bayes Factor of a rate shift
>   (bayesFactorBranches) also shows that branches leading to more speciose
>   clades are very long
> - computeBayesFactors gives this output:
>
>           0         1         2         3         4        5         7
> 0 1.0000000 0.2860509 0.2273844 0.3127353 0.3841264 1.091439 0.3605840
> 1 3.4958818 1.0000000 0.7949089 1.0932856 1.3428605 3.815542 1.2605592
> 2 4.3978396 1.2580058 1.0000000 1.3753596 1.6893262 4.799974 1.5857908
> 3 3.1975924 0.9146741 0.7270825 1.0000000 1.2282796 3.489977 1.1530008
> 4 2.6033098 0.7446790 0.5919520 0.8141469 1.0000000 2.841354 0.9387120
> 5 0.9162216 0.2620860 0.2083345 0.2865348 0.3519449 1.000000 0.3303749
> 7 2.7732786 0.7932987 0.6306002 0.8673021 1.0652895 3.026865 1.0000000
>
>
> - simple summary of the posterior summary(edata) also favours models with
>   shifts
>
> Shift posterior distribution:
>
>         0     0.1800
>         1     0.4300
>         2     0.2800
>         3     0.0840
>         4     0.0240
>         5     0.0025
>         7     0.0005
>
>
> 2. On the other hand, the plot of Credible Shift Sets always shows the model
> with no shifts as most frequent (??? - an example is attached).
>
> - ...and the best shift configuration is indeed without shifts, as checked
>   with
>
>   priorshifts <- getBranchShiftPriors(tree, prior)
>   best <- getBestShiftConfiguration(edata, prior, BFcriterion = 5)
>
>
> To summarise: I do not understand how it is possible to find substantial
> Bayes Factors in support of a model with two rate shifts, and yet have the
> model without shifts as the "best configuration".
>
> I hope this is not too naive and I will appreciate any feedback.
>
>
> Best,
>
> __
> Krzysztof "Chris" Kozak
> PhD Candidate, Department of Zoology
> University of Cambridge, CB2 3EJ
> http://heliconius.zoo.cam.ac.uk/people/krzysztof-kozak/
>
>
> _______________________________________________
> R-sig-phylo mailing list - [email protected]
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/[email protected]/
>
>
> _____________________
> Dan Rabosky
> Assistant Professor & Curator of Herpetology
> Museum of Zoology &
> Department of Ecology and Evolutionary Biology
> University of Michigan
> Ann Arbor, MI 48109-1079 USA
>
> [email protected]
> http://www-personal.umich.edu/~drabosky
> http://www.lsa.umich.edu/ummz/
>
>
>


