Dear Dan and Jonathan,

Thank you for the helpful comments. This does make a lot of sense - I suppose I have spent too long trying to find a single shift!
Best,
__
Krzysztof "Chris" Kozak
PhD Candidate, Department of Zoology
University of Cambridge, CB2 3EJ
http://heliconius.zoo.cam.ac.uk/people/krzysztof-kozak/

On Fri, Aug 15, 2014 at 2:08 AM, Dan Rabosky <[email protected]> wrote:
>
> Hi Chris-
>
> Just to add to what Jonathan wrote.
>
> This is a good question. You have two basic issues that are being
> confounded: (1) how much evidence is there for a rate shift overall, versus
> (2) how much evidence do you have bearing on the locations of specific
> shifts. In your case, you have limited evidence for rate variation: Bayes
> factors of 3-5 versus a model with 0 shifts. That's rather weak evidence
> for rate variation, but it's (in my opinion) at least worth considering
> further. You can also see that this is not especially strong evidence from
> considering the posterior distribution of shifts, which gives a posterior
> probability of 0.18 to a model with 0 shifts (as an aside, Bayes factors
> are always going to be more reliable for these types of comparisons because
> they explicitly take into consideration whatever prior distribution you've
> specified on the number of shifts).
>
> In your case, you have weak overall evidence for a shift somewhere in your
> data. However, you have even less evidence that a shift occurs at any
> particular location. This is what you see in your credible shift set: each
> shift configuration has a shift in a different location. The overall "best"
> shift configuration, i.e., the one with the maximum a posteriori (MAP)
> probability, does not have any "core" shifts.
> This is a bit confusing, but it
> is explained at length here: http://bamm-project.org/rateshifts.html
>
> Basically, we only consider "significant" shifts when we enumerate the set
> of distinct shift configurations in the posterior, where "significant"
> means that a shift occurs on a given branch with substantially elevated
> frequency relative to what you'd expect under the prior (again, this is all
> explained on the documentation page). These are termed "core shifts" in
> BAMM terminology. In your case, the shift configuration with the highest
> posterior probability actually has no core shifts.
>
> However, you can decrease the threshold used to identify core shifts. You
> can lower the Bayes factor criterion used to identify shifts with elevated
> posterior probabilities relative to the prior expectation in the
> credibleShiftSet function by changing the default value for the BFcriterion
> argument (e.g., if you set BFcriterion = 1, you will probably see a shift
> in your MAP probability shift configuration).
>
> However, tweaking the BFcriterion argument won't change the fact that your
> dataset has only weak evidence overall for rate heterogeneity among clades.
> As you decrease the BFcriterion, you will find that the posterior
> probability of your MAP shift configuration will also drop. Jonathan makes
> a good point, especially relevant in this case, because the credible shift
> set here is telling you something important that you won't get out of a
> single point estimate: you don't have much confidence at all in any
> particular rate shift.
>
> ~Dan Rabosky
>
>
> On Aug 14, 2014, at 8:28 PM, Jonathan Chang wrote:
>
> Hi Krzysztof,
>
> It certainly can be true that the most credible shift configuration is
> one where there are no inferred rate shifts, even though the analysis as
> a whole prefers a model with one rate shift.
> The issue is mentioned in the BAMM documentation
> <http://bamm-project.org/rateshifts.html>
>
> BAMM looks like it found evidence of a rate shift on your phylogeny.
> However, the exact location of that rate shift is not certain. In your
> plots, shift configuration #2 shows a rate increase on the upper
> clade, whereas #3 shows a rate decrease in the lower clade. #4 and #5
> tell similar stories. Note that the configurations with 1 rate shift
> (#2-#5) combined are seen more often than configurations with 0 rate
> shifts (#1).
>
> Personally I'm unclear on how useful the most credible shift
> configuration actually is. To me that throws away a lot of the power
> of BAMM by reducing its inference down to a point estimate.
>
> Jonathan
>
> On Thu, Aug 14, 2014 at 5:08 PM, Krzysztof Kozak <[email protected]> wrote:
> > Dear All,
> >
> > I have been asked to analyse my chronogram using BAMM, and I like the idea.
> > Sadly, I am puzzled by the output. I worked through the example and read the
> > entire documentation, but still don't grasp why different analyses suggest
> > different answers.
> >
> > 1. On one hand, several functions suggest that there are 1-2 rate shifts in
> > my data.
> > - Plotting netdiv rate shows it changing somewhat at two times.
> > - plot.bammdata(edata) shows increased rate on the branch leading to a
> > disproportionately large clade
> > - rescaling the branch lengths by the Bayes Factor of a rate shift
> > (bayesFactorBranches) also shows that branches leading to more speciose
> > clades are very long
> > - computeBayesFactors gives this output:
> > 0 1.0000000 0.2860509 0.2273844 0.3127353 0.3841264 1.091439 0.3605840
> > 1 3.4958818 1.0000000 0.7949089 1.0932856 1.3428605 3.815542 1.2605592
> > 2 4.3978396 1.2580058 1.0000000 1.3753596 1.6893262 4.799974 1.5857908
> > 3 3.1975924 0.9146741 0.7270825 1.0000000 1.2282796 3.489977 1.1530008
> > 4 2.6033098 0.7446790 0.5919520 0.8141469 1.0000000 2.841354 0.9387120
> > 5 0.9162216 0.2620860 0.2083345 0.2865348 0.3519449 1.000000 0.3303749
> > 7 2.7732786 0.7932987 0.6306002 0.8673021 1.0652895 3.026865 1.0000000
> >
> > - simple summary of the posterior summary(edata) also favours models with
> > shifts
> > Shift posterior distribution:
> > 0 0.1800
> > 1 0.4300
> > 2 0.2800
> > 3 0.0840
> > 4 0.0240
> > 5 0.0025
> > 7 0.0005
> >
> > 2. On the other hand, the plot of Credible Shift Sets always shows the model
> > with no shifts as most frequent (??? - an example is attached).
> > - ...and the best shift configuration is indeed without shifts, as checked
> > with
> > priorshifts <- getBranchShiftPriors(tree, prior)
> > best <- getBestShiftConfiguration(edata, prior, BFcriterion = 5)
> >
> > To summarise: I do not understand how it is possible to find substantial
> > Bayes Factors in support of a model with two rate shifts, and yet have the
> > model without shifts as the "best configuration".
> > I hope this is not too naive and I will appreciate any feedback.
> >
> > Best,
> > __
> > Krzysztof "Chris" Kozak
> > PhD Candidate, Department of Zoology
> > University of Cambridge, CB2 3EJ
> > http://heliconius.zoo.cam.ac.uk/people/krzysztof-kozak/
> >
> _______________________________________________
> R-sig-phylo mailing list - [email protected]
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/[email protected]/
>
> _____________________
> Dan Rabosky
> Assistant Professor & Curator of Herpetology
> Museum of Zoology &
> Department of Ecology and Evolutionary Biology
> University of Michigan
> Ann Arbor, MI 48109-1079 USA
>
> [email protected]
> http://www-personal.umich.edu/~drabosky
> http://www.lsa.umich.edu/ummz/
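
A note on the numbers quoted in this thread: the Bayes factor table and the shift posterior from the original question are linked by the standard identity that a Bayes factor is the posterior odds divided by the prior odds. The base R sketch below only redoes that arithmetic on the figures pasted in the thread. It does not call BAMMtools, the variable names are made up for illustration, and the "implied prior odds" are back-calculated here rather than taken from any BAMM output, so treat them as a rough check only.

```r
# Shift posterior, copied from the summary(edata) output quoted in the thread.
post <- c("0" = 0.1800, "1" = 0.4300, "2" = 0.2800, "3" = 0.0840,
          "4" = 0.0240, "5" = 0.0025, "7" = 0.0005)

# Bayes factors of each model against the 0-shift model: the first column
# of the computeBayesFactors matrix quoted in the thread.
bf_vs_0 <- c("0" = 1.0000000, "1" = 3.4958818, "2" = 4.3978396,
             "3" = 3.1975924, "4" = 2.6033098, "5" = 0.9162216,
             "7" = 2.7732786)

# Posterior odds of each model against the 0-shift model.
post_odds <- post / post[["0"]]

# BF = posterior odds / prior odds, so the prior odds implied by the quoted
# numbers are posterior odds / BF. Derived here; not something BAMM reported.
implied_prior_odds <- post_odds / bf_vs_0

# Pooled posterior probability of at least one shift, renormalised because
# the rounded probabilities in the email do not sum to exactly 1.
p_any_shift <- 1 - post[["0"]] / sum(post)

print(round(cbind(post, post_odds, bf_vs_0, implied_prior_odds), 3))
print(round(p_any_shift, 3))
```

Read this way, the two results in the question are not in conflict: models with at least one shift jointly hold most of the posterior mass, while that mass is spread over many alternative shift locations, which is the point Jonathan and Dan make about the credible shift set.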
