Re: [R-sig-phylo] Different estimate of Bloomberg's k in ape and caper in R
Hi Solomon, It’s because neither of the things you are estimating are Blomberg’s K and both are different from one another. In Caper, you are estimating Pagel’s Kappa (a branch length transformation that attempts to explain evolutionary change as punctuational [1] vs gradual [=1] ), while in ape you are estimating Blomberg’s gamma (a transformation known as ACDC, similar to the early burst model, that is associated with accelerating [1] or decelerating [1] change). Graham On Jul 14, 2015, at 10:58 AM, Solomon Chak tc...@vims.edu wrote: Hi everyone, I'd like to get some opinions on why estimates of Bloomberg's k are different using the packages ape and caper in R. Please see the code below for example, in which I used the same data but got different estimates of k. In my own data, the estimated ks were completely opposite: k=0.0001 in ape, and k=3 in caper. library(ape) library(caper) library(nlme) data(shorebird) # CAPER compdata - comparative.data(shorebird.tree, shorebird.data, names.col = Species, vcv.dim=3, vcv = TRUE) caper.model - pgls(log(Egg.Mass) ~ log(M.Mass) * log(F.Mass), compdata, kappa=ML) summary(caper.model) # k=0.474 # APE gls.model = gls(log(Egg.Mass) ~ log(M.Mass) * log(F.Mass), correlation=corBlomberg(value = 0.1, phy=shorebird.tree), data=shorebird.data, method=ML) summary(gls.model) # k = 0.9079989 Cheers, Solomon Chak solomonc...@gmail.com Virginia Institute of Marine Science College of William and Mary [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Constraining node values in an OUCH analysis
Hi Daniel, There’s a difference between a method being able to handle fossil data, that is a dataset consisting of a non-ultrametric tree an data for all tips including non contemporaneous ones, and a method allowing you to directly specify trait values at nodes. Most trait evolution methods allow you to do the former (I don’t know for sure but I expect OUCH does). For the latter, which you want to do, there is a function in geiger (described in Slater, Harmon, and Alfaro 2012 Evolution), that allows you to place informative prior probability distributions on node trait values based on the fossil record. But this only allows for fitting simple models and not complex OU scenarios you might want to test. As a hack, I’d suggest adding zero length branches to all nodes in your tree and assigning your reconstructed node values to these. This will produce identical results to specifying node values directly. Zero length branches can be problematic for matrix operations required to compute likelihoods in R and so you might need to explore minimally short branch lengths (10^-5 time units has worked for me in the past). This all should have the same effect as specifying node values, but Aaron will need to confirm that it would work in OUCH. I would question though whether this is a good strategy - you’re assuming your ML estimates of node states , presumably inferred under BM, are robust enough to be fixed for subsequent macroevolutionary analyses. Given how dicey ASRs are, even when you include fossil data, this seems a big stretch. If this is a route you really want to go, perhaps explore using a restricted set of inferred node states - for example only those nodes in the extant taxon tree that are directly ancestral to a fossil taxon, to explore how much this approach influences your results. g Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 (202) 633-1316 slat...@si.edumailto:slat...@si.edu www.fourdimensionalbiology.comhttp://www.fourdimensionalbiology.com On Jun 4, 2015, at 2:50 PM, Daniel Fulop dfulop@gmail.commailto:dfulop@gmail.com wrote: Isn't at least some of this functionality in mvSLOUCH and/or geiger? ...it's definitely the case that mvSLOUCH can handle missing data at the tips, and I think fossil data can be incorporated in it and geiger as well. At least Slater 2013 has code for incorporating fossils in geiger or modified geiger functions. [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.orgmailto:R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Constraining node values in an OUCH analysis
Oops - sorry Daniel, yes that should have been addressed to Nathan... Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 (202) 633-1316 slat...@si.edumailto:slat...@si.edu www.fourdimensionalbiology.comhttp://www.fourdimensionalbiology.com On Jun 4, 2015, at 4:27 PM, Daniel Fulop dfulop@gmail.commailto:dfulop@gmail.com wrote: Thanks, Graham ...but I'm not the OP. I was just shooting off a quick lead without actually checking the specifics in case it was useful. [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] clade-specific release and radiate model?
Hi Dan et al, Sorry I won�t be of much help analytically - my functions were written specifically for the time slice problem. It wouldn�t be too difficult to write a function to fit the model you describe, but it seems as though Julien has this all covered in mvMorph already, so that does seem to be the best route to go. A word of caution though. From your last email it sounds as though you�re using a tree of extant taxa. Though it is probably less of a concern for cases where you have a subclade-based shift in mode, rather than a temporal, clade-wide shift, these kinds of processes are still a) very difficult to detect using trees of extant taxa only, and b) fitting of OU processes exhibit notorious levels of type I error. It�s worth using additional tests on top of the model fitting. For example, you might use simulations under your best fitting model to see if it actually does a better job than simpler processes of replicating your data (see Pennell et al 2014 http://biorxiv.org/content/early/2014/04/07/004002.short or Slater and Pennell 2014 http://sysbio.oxfordjournals.org/content/63/3/293 and Boettiger et al. 2012 http://onlinelibrary.wiley.com/doi/10./j.1558-5646.2012.01574.x/abstract). In addition, a simple yet elegant (in my opinion) way of examining whether an OU process that constrains variance explains the background macroevolutionary regime and release from this process explains your subclade pattern would be to use a form of Brian�s censored test but wherein you test for �phylogenetic signal� separately in the paraphyletic stem and monophyletic subclade. You�d expect significantly low signal in the background and high (i.e. not significantly different) signal in the subclade. See Hopkins and Smith 2015 http://www.pnas.org/content/112/12/3758.abstract for a nice use of this kind of test. These are all nice supplementary tests that would help assuage a lot of peoples� concerns about OU processes and their tendency to fit well to any kind of noisy comparative data. g Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 (202) 633-1316 slat...@si.edumailto:slat...@si.edu www.fourdimensionalbiology.comhttp://www.fourdimensionalbiology.com On May 29, 2015, at 3:44 PM, Julien Clavel julien.cla...@hotmail.frmailto:julien.cla...@hotmail.fr wrote: Hi Dan, Fitting models with two distinct modes of trait evolution is possible with the mvSHIFT function in mvMORPH. You just need to map on the tree the two clades (or selective regimes, groups... otherwise) of interest rather than two times periods. The mapping can easily be done using the make.simmap or the paintSupTree functions in phytools. The package handle both univariate and multivariate data. Best, Julien Date: Fri, 29 May 2015 08:27:31 -0700 From: dfulop@gmail.commailto:dfulop@gmail.com To: omeara.br...@gmail.commailto:omeara.br...@gmail.com CC: r-sig-phylo@r-project.orgmailto:r-sig-phylo@r-project.org; julien.cla...@hotmail.frmailto:julien.cla...@hotmail.fr; slat...@si.edumailto:slat...@si.edu Subject: Re: [R-sig-phylo] clade-specific release and radiate model? Hi Brian, Thanks for your thoughtful response! Those are very good points about identifiability and penalization of the OU mean for the released clade. The censored model is an appealing approach, especially since the timing of the release along the (stem) branch leading to the released clade is unknown, and somewhat beside the point. In the case in question several novel morphological-mechanical structures had to evolve to enable the release, and whether those evolved gradually or not along the stem branch is an open question -- perhaps unknowable from analyzing extant taxa. In thinking about this a bit more, though, I think methods for an epoch release may be able to accommodate a clade-specific scenario, at least if the timing of the mode shift is specified using SIMMAP trees and not a shift-time parameter. Slater 2013 has R code associated with it that implements release models, and there's a multivariate implementation of release (and other mode shift) models in the new mvMORPH package (which is on CRAN, but not yet published as a manuscript as far as I can tell). In skimming Slater's code it doesn't seem to use SIMMAP trees. However, mvMORPH's shift specification can be done with SIMMAP trees, and I see no reason why its mvSHIFT function would care whether the shift is clade-specific or not. I'm cc'ing Graham and Julien to see if they have something to add. Regarding the use of mvSHIFT, I don't have multivariate data; hopefully that won't be a problem. Cheers, Dan. Brian O'Meara wrote: Hi, Daniel. It's a bit arguable whether as alpha - 0, OU - BM: I think it should, but IIRC in OUCH this
Re: [R-sig-phylo] fitContinuous drift model?
Hi Milton, The drift model gives a time-dependent change in the expected, or mean value of the trait. If we denote the drift parameter as M, then the expected value of the trait, E(x), at time t is a+Mt where a is the root state of the trait. So if M is positive, your trait tends to get larger over time and if it is negative, the trait tends to get smaller. Note that this is a tendency because the trait is actually evolving under a biased random walk, so variance still increases with time, as in Brownian motion. For this reason, you wouldn’t do a branch length reconstruction to infer ancestral states under drift. Because we are modeling change in the expected value of the trait through time, the model is unidentifiable without non-comtemporaneous (i.e. fossil, time series) tips or a very strong prior / bound on the root state. This is a good reference for the model in a non-phylogenetic context http://www.psjournals.org/doi/abs/10.1666/05070.1 You can do ancestral state estimation under the drift model using fitContinuousMCMC in geiger (this also allows you to place informative priors on node values) and I think Liam has a function in phytools to give you the ML estimates. Cheers, Graham On May 28, 2015, at 4:07 PM, Milton Tan mzt0...@tigermail.auburn.edumailto:mzt0...@tigermail.auburn.edu wrote: Hello all, I have some questions about the drift model implemented in geiger to test for a trend in increasing or decreasing trait values over time. I'm curious how to interpret the drift parameter, as well as whether BM is nested within BM. Is there a citation I can read for more information? I haven't seen the parameters of the drift model described anywhere explicitly, though perhaps I have missed it. It seems similar to the test for a directional pressure implemented in Pagel 1997, but I don't see any mention of the drift parameter. Additionally, I'm curious if there's a good way to incorporate a drift model into ancestral state reconstruction? I imagine I can transform the branch lengths based on the drift parameter somehow and simply reconstruct ancestral trait states under BM, but I'm unsure how to do the branch length transformation for a drift model given that I'm not entirely sure what the drift parameter represents. Thanks all, Milton Tan Auburn University Department of Biological Sciences PhD Candidate [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.orgmailto:R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 (202) 633-1316 slat...@si.edumailto:slat...@si.edu www.fourdimensionalbiology.com [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Obtaining time-scaled trees from MrBayes posterior tree block
Hi Roger, Branch lengths of trees in the t files are in expected changes per site. To get these into time units, you just need to divide the branch lengths by the corresponding clock rate in the p.file. Cheers, Graham Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 (202) 633-1316 slat...@si.edumailto:slat...@si.edu www.fourdimensionalbiology.comhttp://www.fourdimensionalbiology.com On Oct 27, 2014, at 2:35 PM, Roger Close roger.cl...@gmail.commailto:roger.cl...@gmail.com wrote: Hi all, I wish to run some comparative analyses on a posterior sample of trees saved by MrBayes. When I imported the .t file into R, I discovered that branch lengths weren't saved in units of time: they appear to represent some other measure of change approximately proportional to time, but about a thousand times smaller. The consensus tree produced by MrBayes, by contrast, is a correctly scaled chronogram with branch lengths in millions of years. Has anyone managed to obtain time-scaled trees from the branch lengths that are recorded by MrBayes? I suspect that it might involve dividing by the clock rate given at the start of each tree, but there must be standard procedure (e.g., in MrBayes itself). Many thanks, Roger --- Roger Close, Postdoctoral Research Associate Department of Earth Sciences, Oxford University South Parks Road Oxford OX1 3AN United Kingdom [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.orgmailto:R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] MrBayes Trees are Misread by read.annotated.nexus() in OutbreakTools
Dave, I�ve had the same trouble with shuffling. However, all of this can be avoided if you specify the simple format for your .con file in the mrBayes block. sumt conformat = simple; The resulting tree will correctly display posterior probabilities in a phyloformat tree. Graham Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 (202) 633-1316 slat...@si.edumailto:slat...@si.edu www.fourdimensionalbiology.comhttp://www.fourdimensionalbiology.com On Oct 3, 2014, at 10:42 AM, David Bapst dwba...@gmail.commailto:dwba...@gmail.com wrote: Hello all, Recently, I wanted to display posterior probabilities on a 50% compatibility tree from a MrBayes run, created with the 'sumt' command. I looked around for ways to do this and found this email thread from last year: https://stat.ethz.ch/pipermail/r-sig-phylo/2013-June/002825.html ...which suggests using read.annotated.nexus() in package epibase, which has been renamed OutbreakTools more recently. Unfortunately, it appears read.annotated.nexus in OutbreakTools (v0.1-11, in R 3.1.1) improperly creates the new edge matrix, resulting in an apparent shuffling of tip labels. As this function was discussed on R-Sig-Phylo, I am sending this bug report to both the list and to the package maintainer (Thibaut J.). I don't have any BEAST output files, so I cannot test if this occurs with files that are not created by MrBayes. To show this phenomen, I have created an example .con.tre file using an example matrix of standard characters for several elephants that I stole off of Joe F.'s website. You can find the .con.tre file here on gdrive: https://drive.google.com/file/d/0B_xvEcEvKno_LTFhSVJrdHMtNFU/view?usp=sharing And now in R, a comparison with a tree: library(OutbreakTools) library(ape) tree1-read.nexus(mat.nex.con.tre) tree2-read.annotated.nexus(mat.nex.con.tre) layout(1:2) plot(tree1,no.margin=TRUE) plot(tree2,no.margin=TRUE) identical(tree1$tip.label,tree2$tip.label) identical(tree1$edge.length,tree2$edge.length) identical(tree1$edge,tree2$edge) Inspection of the tree file with FigTree suggests that the ape function is producing the correct tree and the OutbreakTools function is not. We can see the why with last three lines in the console: identical(tree1$tip.label,tree2$tip.label) [1] TRUE identical(tree1$edge.length,tree2$edge.length) [1] TRUE identical(tree1$edge,tree2$edge) [1] FALSE Closer inspection suggests that the issue appears to be that some terminal edges are inappropriately shuffled, thus shuffling the terminal tip labels on those edges, while leaving the tip.label order and the edge lengths the same. Also, larger trees seem to produce larger inconsistencies in the tree produced. So... now for the list, are there any other methods on CRAN for obtaining the posterior probabilities from a .con.tre file? Otherwise, next I guess I'll be trying phyloch. Cheers, -Dave -- David W. Bapst, PhD Adjunct Asst. Professor, Geology and Geol. Eng. South Dakota School of Mines and Technology 501 E. St. Joseph Rapid City, SD 57701 http://webpages.sdsmt.edu/~dbapst/ http://cran.r-project.org/web/packages/paleotree/index.html ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] null model to trait evolution
Hi Viviane, If you believe that your traits are correlated - that is there is a variance-covariance matrix that describes the evolutionary correlations among these traits, then I dont think you don't want to fit the models to each trait and transform the tree each time you simulate. Youd be better off estimating lambda for all traits simultaneously. You can do this in Gavin Thomas' motmot package http://gavinhthomas.wordpress.com/motmot/, and Im sure Liam has something to do this in phytools too. You can then use the trait VCV and your multivariate lambda to simulate correlated traits on an appropriately scaled tree. You could do this directly (drawing from a multivariate normal distribution after taking the Kronecker product of your phylogenetic and trait VCVs) or else you can use a function like geigers sim.char a trait vcv matrix to generate correlated traits. More importantly, I dont think you want to simulate with a lambda transform to generate your null model if youre interested in knowing whether your traits are over or underdispersed. If your multivariate trait lambda is indeed close to 1, then the phylogenetic covariances among taxa do a relatively good job of predicting similarity and your trait is not really overdispersed relative to a constant rate process like Brownian motion. I think what you really want to do is generate a VCV for the traits and then simulate correlated trait evolution under BM on your unscaled tree. You can then use some measure of dispersion in multivariate space to ask whether your observed trait variance is significantly lower or higher than expected under BM. Best, Graham Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 (202) 633-1316 slat...@si.edumailto:slat...@si.edu www.fourdimensionalbiology.comhttp://www.fourdimensionalbiology.com On Aug 7, 2014, at 7:40 AM, Viviane Deslandes viviane.deslan...@gmail.commailto:viviane.deslan...@gmail.com wrote: I am investigating patterns of phenotypic overdispersion/clustering in a reginal scale. I am using a set of traits to calculate phenotypic diversity. I need to simulate this set of correlated traits to generate a null model. I would like to know whether the following approach is appropriate: - I used fitContinuous::Geiger to verify which model of evolution best fits to my traits.The models used were White Noise, Ornstein Uhlenbeck, Brownian Motion, Early Bust and lambda. - I compared the models according to its AICc values. Most of my traits fitted better to lambda model with values next to 1 (0.9). Can I transform the original tree based on lambda values for each trait separately (using transform.phylo) and then to use a function to simulate the evolution of each song trait? Which is the appropriate function to simulate ( RtraitCont, sim.corrs or fastBM)? Thank you, Viviane [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.orgmailto:R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Early burst branch rescale
Hi Jon, So there are are a few ways I can imagine doing this. One thing you should probably not do though is rescale the branches of the tree directly, at least in the case of OU, as this leads to incorrect covariances for pairs of taxa where one or either does not survive to the present day (see this in press note: http://onlinelibrary.wiley.com/doi/10./2041-210X.12201/abstract), as I suspect you have in your tree. Probably the most appropriate way is therefore to 1) split the VCV into one or more matrices describing the variances and covariances over the intervals that youre interested in; 2) transform the matrices according the model and associated parameters you want over that interval, 3) stick them back together, and 4) draw your data directly from a multivariate normal distribution using the transformed covariance matrix. Ill send you some code off list to do that (its also in the corrected dryad pack accompanying the note above) Graham Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 (202) 633-1316 slat...@si.edumailto:slat...@si.edu www.fourdimensionalbiology.comhttp://www.fourdimensionalbiology.com On Jun 2, 2014, at 3:55 PM, Jonathan Mitchell mitchel...@uchicago.edumailto:mitchel...@uchicago.edu wrote: Hello all, I'm trying to figure out how to rescale certain branches of a tree according to an early burst model. First I've generated a tree using TreeSim N - 100 Nn - N*2 -1 test - sim.bd.taxa(N, 1, 1, 0.8, complete=FALSE) Then I 'paint' the branches depending on whether or not they occur in the first or second half of the clade's history: Root -max(nodeHeights(test2)) Age - 0.5*Root test2 - make.era.map(test[[1]][[1]], c(0, Age)) Using rescale() from geiger, I can reformat the whole tree, and using sim.rates() from phytools, it's easy to simulate different sigmas. But what I'd like to do is also simulate different evolutionary modes (OU and EB) in the different regimes (like Graham's MEE paper). Is there a way to rescale the branches in one portion of the tree according to the OU alpha or EB r parameters? Thanks! -- Jon _ Jonathan S. Mitchell http://home.uchicago.edu/~mitchelljs/ PhD Student Committee on Evolutionary Biology The University of Chicago 1025 57th Str, Culver Hall 402 Chicago, IL 60637 Geology Department The Field Museum of Natural History 1400 S. Lake Shore Dr. Chicago, IL 60605 [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] dropping taxa from list of trees
Hi Juanita try: lapply(simtrees, drop.random, n = 937) when using lapply (or any of the apply family) you need to specify additional arguments taken by the function youre applying by name, along with their values. graham Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 (202) 633-1316 slat...@si.edumailto:slat...@si.edu www.fourdimensionalbiology.comhttp://www.fourdimensionalbiology.com On Feb 25, 2014, at 9:49 PM, Juanita Rodriguez juanitarodrigu...@gmail.commailto:juanitarodrigu...@gmail.com wrote: Hi all, I have an issue with using the drop.random to drop taxa from a set of simulated trees. When I try to give the list as input it shows an error: Error in drop.random(simtrees, 937) : object phy is not of class phylo I also tried using lapply and it keeps showing the same message. Thanks in advance for your help! Juanita _ Here is the code I have used: treesim-sim.bd.taxa.age(n=1008, numbsim=1000, lambda=9.111433e-02, mu=3.242355e-07, frac = 1, age=31, mrca = FALSE) drop.random(treesim,937) Error in drop.random(treesim, 937) : object phy is not of class phylo lapply(simtrees, drop.random (simtrees,937)) Error in drop.random(simtrees, 937) : object phy is not of class phylo -- Utah State University Department of Biology 5305 Old Main Hill Logan, UT/ USA 84322 (435) 797-0358 [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.orgmailto:R-sig-phylo@r-project.org https://urldefense.proofpoint.com/v1/url?u=https://stat.ethz.ch/mailman/listinfo/r-sig-phylok=diZKtJPqj4jWksRIF4bjkw%3D%3D%0Ar=9yz%2F9T5DbvC96h83J%2BUFBQ%3D%3D%0Am=nBHH4FozRe%2BJBfaYNGYL8YxZ3Qs5Kk4u7XWcLurz4LU%3D%0As=629a8efb96259d10466e94079ab3b36b6d07e293356889f6fedc23b6c771f259 Searchable archive at https://urldefense.proofpoint.com/v1/url?u=http://www.mail-archive.com/r-sig-phylo%40r-project.org/k=diZKtJPqj4jWksRIF4bjkw%3D%3D%0Ar=9yz%2F9T5DbvC96h83J%2BUFBQ%3D%3D%0Am=nBHH4FozRe%2BJBfaYNGYL8YxZ3Qs5Kk4u7XWcLurz4LU%3D%0As=5524830b2278d27fff1850cbc75e4875d7e182861b1b072624cd43cb509afad6 [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] fitContinuous code and output in Geiger
Hi Louise, There is a k for all at the bottom of the output for each model fit of all traits so I didn't think that it was probably the kappa needed for Kappa You’re right - K here is the number of parameters in the model. BM has K= 2 (root state and rate) and other models add an additional parameter a lambda is given for the Kappa model fits, is it possible that this is what I need for the parameter of Kappa? You seem to be using a version of geiger 1.99. I think there was a bug in those versions where kappa was mis-named in the output but the value returned is indeed the ML estimate for kappa I get the same delta for the Delta fit of all four traits (2.), which seemed strange to me. I wasn't sure how to interpret the output of the EB model. Your value of 2. is right at the default upper bound for delta (3), which means rates of evolution are faster closer to the tips. Your EB parameter of 0 is at the lower upper bound for this parameter too— if you were to change the upper bound to 0, you’d probably find that the MLE goes up, which would suggest that rates of evolution are accelerating towards the present. I suspect you’ll also find lambda and OU fit best for these traits based on the results you have. Together, this suggests that your phylogeny does a relatively poor job of predicting trait values; younger species pairs are more dissimilar than you would predict under a BM model. Cheersm Graham Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 (202) 633-1316 slat...@si.edumailto:slat...@si.edu www.fourdimensionalbiology.comhttp://www.fourdimensionalbiology.com On Nov 21, 2013, at 1:26 PM, Louise Comas lhco...@gmail.commailto:lhco...@gmail.com wrote: Dear list users, I'm having some trouble with the R code in Geiger and interpreting some of the output. I'm trying to get an aic table of model fits and parameters alpha for OU, lambda, delta, kappa, and endRate (a??) for EB for 4 continuous traits (SRL, BrFreq listed below and 2 others). There is a k for all at the bottom of the output for each model fit of all traits so I didn't think that it was probably the kappa needed for Kappa - a lambda is given for the Kappa model fits, is it possible that this is what I need for the parameter of Kappa? I get the same delta for the Delta fit of all four traits (2.), which seemed strange to me. I wasn't sure how to interpret the output of the EB model. Below are some of the code I've used and some error messages: read.nexus(33 sp nexus.nex) tempfor33-read.nexus(33 sp nexus.nex) mydata-read.table(tempfor33data.txt, sep=\t) names(mydata)-c(Sp,Myc1,Myc3,Myc5,SRL,BrFreq,MeanDiam,TissDen) mydata rownames(x) - value rownames(mydata) - mydata$Sp SRL_BM-fitContinuous(tempfor33, mydata$SRL, model=BM) Error: could not find function fitContinuous SRL_BM-fitContinuous(tempfor33, mydata$SRL, model=BM, data.names=rownames(mydata)) Error: could not find function fitContinuous I get output however when running the following, so I just used that: SRL_delta $Trait1 $Trait1$lnl [1] -169.1129 $Trait1$beta [1] 8.045267 $Trait1$delta [1] *2.99* $Trait1$aic [1] 344.2258 $Trait1$aicc [1] 345.0829 $Trait1$k [1] 3 SRL_kappa $Trait1 $Trait1$lnl [1] -170.8514 $Trait1$beta [1] 18.71486 $Trait1$lambda [1] 1 $Trait1$aic [1] 347.7028 $Trait1$aicc [1] 348.5599 $Trait1$k [1] 3 SRL_EB $Trait1 $Trait1$lnl [1] -170.8514 $Trait1$beta [1] 18.71485 $Trait1$a [1] 0 $Trait1$aic [1] 347.7028 $Trait1$aicc [1] 348.5599 $Trait1$k [1] 3 BrFreq_delta $Trait1 $Trait1$lnl [1] -44.83203 $Trait1$beta [1] 0.004308829 $Trait1$delta [1] *2.99* $Trait1$aic [1] 95.66406 $Trait1$aicc [1] 96.52121 $Trait1$k [1] 3 Many thanks for any help! Louise [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.orgmailto:R-sig-phylo@r-project.org https://urldefense.proofpoint.com/v1/url?u=https://stat.ethz.ch/mailman/listinfo/r-sig-phylok=diZKtJPqj4jWksRIF4bjkw%3D%3D%0Ar=9yz%2F9T5DbvC96h83J%2BUFBQ%3D%3D%0Am=JSgkx5KknZ4Pt%2BZH5tVIEcVsIJKepFiTdM4iwLn%2F1d4%3D%0As=de3f80616b6f3b27c2152ae78824d52ca25bff9565b853098a9d524e8d40e735 Searchable archive at https://urldefense.proofpoint.com/v1/url?u=http://www.mail-archive.com/r-sig-phylo%40r-project.org/k=diZKtJPqj4jWksRIF4bjkw%3D%3D%0Ar=9yz%2F9T5DbvC96h83J%2BUFBQ%3D%3D%0Am=JSgkx5KknZ4Pt%2BZH5tVIEcVsIJKepFiTdM4iwLn%2F1d4%3D%0As=52ab21c1ab719551b38307391084de91daa0b22bdec2dc7125679da6117332fb ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Rate heterogeneity test with geiger
Hi Mercedes, The relaxed bm output summarizes the shifts sampled during the rj-mcmc, so it’s not surprising that it will show some shifts in rate along some branches. The important part of the output is the posterior probability of a shift at a given node — if this is low, then there is little support for the rate change(s) summarized in the plot. So it’s not unexpected to find variable rates estimated, but no overall support for a relaxed bm model over a single bm rate. Be aware though that Tracer uses a harmonic mean estimator of the marginal likelihood when computing Bayes factors and that this approach can sometimes perform poorly, particularly when the difference between the homogeneous and heterogenous rate models are subtle (see Baele et al. 2012) Best, Graham Ref: Baele G., et al. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Molecular biology and evolution 29.9 (2012): 2157-2167. Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 (202) 633-1316 slat...@si.edumailto:slat...@si.edu www.fourdimensionalbiology,com On Nov 20, 2013, at 12:21 PM, Mercedes Burns mercedes.bu...@gmail.commailto:mercedes.bu...@gmail.com wrote: Dear list, I am investigating a set of continuously varying morphological traits for males and females. I expected some of these traits to have a functional relationship that would translate to their tips states and evolutionary rates of change being correlated. Using PGLS (correcting for body size) and threshold modeling in phytools, I've found this seems to be the case for some but not all of my pairs of traits. I've been using the rjmcmc.bmfunction in geiger to investigate whether a lack of correlation between pairs of rates may be due to rate heterogeneity. I planned to do this by comparing the likelihoods of models of global rate Brownian motion: rjmcmc.bm(tree, trait, summary=TRUE, filebase=traitBM, ngen=5000, samp=1000, type=bm) and relaxed BM with the reversible jump process-enabled: rjmcmc.bm(tree, trait, summary=TRUE, ngen=5000, filebase=traitJMPRBM, samp=1000, type=jump-rbm) When I checked these analyses in Tracer, I was surprised to find that there is little difference in log-likelihoods of these models and it seems that even if relaxed local rates are allowed, no rate shifts are actually accepted although I have evidence of inferred jumps in my output. Is this to be expected if global rate Brownian motion is the better fitting model? Is there a problem with having the reversible jump process available? I'm also not sure how I should ultimately compare the log-likelihoods given that the model is also an experimental parameter in the jump-rbm process. I should probably also note that my tree has about 30 tips, which is on the lower end of what Eastman et al. (2011) used in their simulations. Is there a better way to approach this question? I'd appreciate any advice. -- Mercedes Burns Ph.D. Candidate BEES Program Department of Entomology University of Maryland 4112 Plant Sciences Bldg. College Park, MD 20742 (301) 405-7518 [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.orgmailto:R-sig-phylo@r-project.org https://urldefense.proofpoint.com/v1/url?u=https://stat.ethz.ch/mailman/listinfo/r-sig-phylok=diZKtJPqj4jWksRIF4bjkw%3D%3D%0Ar=9yz%2F9T5DbvC96h83J%2BUFBQ%3D%3D%0Am=L5rIIFDGiwkq5Bb8P7s%2BGSlsZV5FrVFarBFP15q5NIk%3D%0As=f551dcf15ea3bfc504047291bff416df8149b7fb29d2c5dfd6848605d91ccb85 Searchable archive at https://urldefense.proofpoint.com/v1/url?u=http://www.mail-archive.com/r-sig-phylo%40r-project.org/k=diZKtJPqj4jWksRIF4bjkw%3D%3D%0Ar=9yz%2F9T5DbvC96h83J%2BUFBQ%3D%3D%0Am=L5rIIFDGiwkq5Bb8P7s%2BGSlsZV5FrVFarBFP15q5NIk%3D%0As=1918cb9345b82012461085bfb81cb59426375d5abed5d88390a39bb961449dff ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] comparing taxa in data and tree
Hi Jorge, I find that this happens most often when I forget to set the row names of data. Try rownames(data) and if they're not the tip labels, then that's your problem. Graham --- -- Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 Washington DC 20013-7012 (202) 633-1316 slat...@si.edu https://legacy.si.edu/owa/redir.aspx?C=HcyfurW5xkyXw2bWu-Td5xmRPtn7-c9I1lX iVWbeJqzfhgybLjvFBckclIN237FXoGKsGXlPAGg.URL=mailto%3aSlaterG%40si.edu www.fourdimensionalbiology.com https://legacy.si.edu/owa/redir.aspx?C=HcyfurW5xkyXw2bWu-Td5xmRPtn7-c9I1lX iVWbeJqzfhgybLjvFBckclIN237FXoGKsGXlPAGg.URL=http%3a%2f%2fwww.eeb.ucla.edu %2fgslater On 10/22/13 9:05 AM, Jorge Sanchez Gutierrez jorgesgutier...@unex.es wrote: Hi all, I am trying to use the geiger¹s treedata function to compare and match the taxa in my data and tree as follows: X - treedata(tr, data) where ³tr² is the tree and ³data² is the matrix. However, the tips are not found in the data and are subsequently dropped from the tree, getting a tree with 0 tips. Does anyone have any idea what might be wrong with this? I also tried this: X - data[tr$tip.label,] which also should drop all the rows of data that are not in tr, however, I get an empty matrix. Thanks in advance for any help! Jorge S. Gutierrez Conservation Biology Research Group Department of Anatomy, Cell Biology and Zoology Faculty of Sciences University of Extremadura Badajoz 06006 Spain ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] p-values and fitContinous from geiger
Hi Andrew, Simone, et al., It's probably worth adding the very minor distinction that the results of Andrew's suggested test and of phytools' phylosig test have slightly different interpretations. Assuming lambda 1, a significant LRT when comparing the fit of a lambda model to a non-lambda, BM model using geiger would indicate that phylogenetic signal in the trait data is significantly LOWER than expected given the untransformed tree (lambda = 1). But a significant LRT from phylosig's lambda test tells you that there is significantly MORE phylogenetic signal than expected under a model where lambda = 0 (or a star phylogeny). Perhaps that is obvious, but just in case... Best, Graham --- -- Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 Washington DC 20013-7012 (202) 633-1316 slat...@si.edu https://legacy.si.edu/owa/redir.aspx?C=HcyfurW5xkyXw2bWu-Td5xmRPtn7-c9I1lX iVWbeJqzfhgybLjvFBckclIN237FXoGKsGXlPAGg.URL=mailto%3aSlaterG%40si.edu www.fourdimensionalbiology.com https://legacy.si.edu/owa/redir.aspx?C=HcyfurW5xkyXw2bWu-Td5xmRPtn7-c9I1lX iVWbeJqzfhgybLjvFBckclIN237FXoGKsGXlPAGg.URL=http%3a%2f%2fwww.eeb.ucla.edu %2fgslater On 10/11/13 1:43 PM, Andrew Hipp ah...@mortonarb.org wrote: Dear Simone, You could do a likelihood ratio test on the lambda-fitted model vs. the no-lambda model. You could also use the AICc values returned for each model by geiger and calculate AICc weights. Brian O'Meara has a nice summary of this on his website: http://www.brianomeara.info/tutorials/aic Take care, Andrew On 10/11/2013 12:19 PM, Simone Ruzza wrote: Dear list, I was wondering whether you could tell me is there a way of obtaining a p-value for the significance of lambda when fitting a model using the fitContinuous, just in the same way the other function do this (e.g. phylosig from phytools). Apologies for the simplicity of my question, but I am a total beginner in phylogenetic and comparative methods. thanks in advance, Simone ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- Andrew Hipp, PhD Senior Scientist in Plant Systematics and Herbarium Curator The Morton Arboretum 4100 Illinois Route 53, Lisle IL 60532-1293, USA +1 630 725 2094 http://systematics.mortonarb.org/lab http://quercus.mortonarb.org Lecturer, Committee on Evolutionary Biology University of Chicago http://evbio.uchicago.edu/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
[R-sig-phylo] Methods in Ecology Evolution Special Issue: fossils and phylogenies
Dear list, I'm writing to promote this month's issue of Methods in Ecology and Evolution, which is now online and contains a special feature edited by Luke Harmon and I, titled Unifying fossils and phylogenies for comparative analyses of diversification and trait evolution. The feature contains original articles by Peter Wagner, Gene Hunt, Dave Bapst, Tom Ezard and myself, all of which aim to provide ways of integrating paleontological information into phylogenetic and comparative analyses or highlight what can be done when performing comparative analyses using phylogenies containing fossil species. I think many on this list serve will find the articles of interest. The issue is available here: http://onlinelibrary.wiley.com/doi/10./mee3.2013.4.issue-8/issuetoc. I'm also happy to provide pdfs to anyone unable to access them on request (let me know which ones you want). Cheers, Graham ( Luke) - Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 Washington DC 20013-7012 (202) 633-1316 slat...@si.eduhttps://legacy.si.edu/owa/redir.aspx?C=HcyfurW5xkyXw2bWu-Td5xmRPtn7-c9I1lXiVWbeJqzfhgybLjvFBckclIN237FXoGKsGXlPAGg.URL=mailto%3aSlaterG%40si.edu www.eeb.ucla.edu/gslaterhttps://legacy.si.edu/owa/redir.aspx?C=HcyfurW5xkyXw2bWu-Td5xmRPtn7-c9I1lXiVWbeJqzfhgybLjvFBckclIN237FXoGKsGXlPAGg.URL=http%3a%2f%2fwww.eeb.ucla.edu%2fgslater ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Estimating ancestral states of a trait under selection
Hi Eliot and others, You can also use the code contained in the fitContinuousMCMC package available here - https://www.eeb.ucla.edu/gradstudents/slater/Graham/code.html And described in this paper - http://onlinelibrary.wiley.com/doi/10./j.1558-5646.2012.01723.x/abstrac t To place a variety of informative priors on nodes in your tree for a continuous trait and estimate ancestral states under a range of models of trait evolution. This code will be integrated into the forthcoming geiger 2.0 Cheers, Graham --- -- Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 Washington DC 20013-7012 (202) 633-1316 slat...@si.edu https://legacy.si.edu/owa/redir.aspx?C=HcyfurW5xkyXw2bWu-Td5xmRPtn7-c9I1lX iVWbeJqzfhgybLjvFBckclIN237FXoGKsGXlPAGg.URL=mailto%3aSlaterG%40si.edu www.eeb.ucla.edu/gslater https://legacy.si.edu/owa/redir.aspx?C=HcyfurW5xkyXw2bWu-Td5xmRPtn7-c9I1lX iVWbeJqzfhgybLjvFBckclIN237FXoGKsGXlPAGg.URL=http%3a%2f%2fwww.eeb.ucla.edu %2fgslater On 4/19/13 5:07 PM, Liam J. Revell liam.rev...@umb.edu wrote: If you want to do a Bayesian reconstruction for continuous traits with a strong prior on the root - as Brian suggests - you can do this in the phytools function anc.Bayes. The prior means variances need to be provided in anc.Bayes(...,control=list(pr.var,pr.mean)). Unfortunately, you need to put priors on everything (or nothing, in which a pretty uninformative prior is used). If you decide this is what you want to do have difficulty figuring it out, please let me know. All the best, Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 4/19/2013 4:56 PM, Brian O'Meara wrote: We have a method in R, TreEvo, that can allow you to do things like have a model where the optimum shifts through time according to some factor (like external data on climate) or where there are constraints that change through time: it's up on R-forge, and you can install the package, but it hasn't been published yet and so could have problems that won't appear until peers have looked at it more thoroughly. If you're fine waiting to publish with it until it's been vetted and is in press, feel free to play with it. Of course, another thing you could do is throw a prior on the root state based on your knowledge of the climate then and then do a Bayesian reconstruction. I don't know of any packages that do this at the moment, but it wouldn't be hard to write. Best, Brian ___ Brian O'Meara Assistant Professor Dept. of Ecology Evolutionary Biology U. of Tennessee, Knoxville http://www.brianomeara.info Students wanted: Applications due Dec. 15, annually Postdoc collaborators wanted: Check NIMBioS' website Calendar: http://www.brianomeara.info/calendars/omeara On Fri, Apr 19, 2013 at 4:00 PM, Eliot Miller eliotmil...@umsl.edu wrote: Hello all, I've read some past posts and tried to figure this one out myself. It appears that the short answer is if there are not some extinct lineages in the tree with known trait values, it's difficult to do. I have a data set of the current climatic niches of a continental radiation of taxa. If I reconstruct the ancestral state of the root using ace() with REML, I get a semi-believable answer, though obviously with wide CIs. My gut instinct is that the absolute value that is returned is slightly less than it should be. The issue is that I know the climate of the continent has changed radically (but at least consistently in one direction!) during the course of the radiation of the clade. An educated guess would put the magnitude of the change at 1200 to 550 mm rain per year over 60 million years. I'd guess that most lineages have felt this aridification as a slow and steady selective pressure to figure it out in increasingly dry conditions (there's probably also been a lot of extinction but let's skip that point for now). So, while I don't have extinct lineages with known trait values, I do have what I'd be comfortable modeling as a continuous selective pressure throughout the course of the radiation. I'm interested to see how doing this would change the actual absolute value returned by the ancestral state reconstruction. Any tips on how this might be best effected? Final note, my tree is from uncorrected molecular distances, so it's not ultrametric. The analysis I'm trying to run here is tangential to my main focus, so I'm ok with making the tree ultrametric with something like chronopl() just out of curiosity to see what comes out. Someday I'll have a time calibrated tree and then I can re-run this analysis. Best wishes, Eliot [[alternative HTML version
Re: [R-sig-phylo] Flat likelihood and linear transformation of traits in fitContinuous
Hi all, I think there might be 2 issues to consider here. The first concerns data transformations and how we consider traits to evolve over phylogenies. Many people would consider it most appropriate to use log transformed values of a continuous trait for model fitting purposes; for example, using raw values of body mass in mammals would make the assumption that a 1gram change in mass in a baleen whale is equivalent to a 1 gram change in a murine rodent. If you follow this through, the only way in which one expects to obtain the same set of AIC scores for data multiplied by some constant is by fitting models to log(data) and log(data * constant). You can see this easily using some simulations (see below). I would expect fitting models using log transformed and untransformed data to lead to different model fits. Others may have different opinions on how/if data should be transformed The other issue is what your ML parameter estimates are telling you. A flat likelihood surface for a single optimum OU process is probably indicating that there is little phylogenetic signal in your data, rather than suggesting occupation of some kind of adaptive zone. If so, there's not really much conflict between finding most support for white noise or OU. You could check this out visually by plotting your phylogeny transformed by the ML alpha parameter, e.g. plot(OUtree(phy, alpha = fitContinuous(phy, data, model = OU)$Trait1$alpha) Here is some example code to look at model fits using various transformations of trait data require(TreeSim) require(phytools) phy - sim.bd.taxa(n = 100, numbsim = 1, lambda = 0.1, mu = 0, frac = 1, complete = ,F)[[1]][[1]] # simulate a tree of extant taxa d - fastBM(phy, sig2 = 1); ## simulate data under BM (assuming a log scale) d2 - exp(d) # back transform these data to raw measurements d3 - d*10 # multiply the original data (in log units) by 10; d4 - d2 * 10; # multiply the raw data by 10 d5 - log(d4); # log the raw data *10 ## fit bm and ou to the simulated data fitContinuous(phy, d, model = BM)$Trait1$aicc fitContinuous(phy, d, model = OU)$Trait1$aicc ## now fit them to the simulated data back transformed to raw values fitContinuous(phy, d2, model = BM)$Trait1$aicc fitContinuous(phy, d2, model = OU)$Trait1$aicc ## or log values *10 fitContinuous(phy, d3, model = BM)$Trait1$aicc fitContinuous(phy, d3, model = OU)$Trait1$aicc ## raw values *10 fitContinuous(phy, d4, model = BM)$Trait1$aicc fitContinuous(phy, d4, model = OU)$Trait1$aicc ## and log(raw data *10) fitContinuous(phy, d5, model = BM)$Trait1$aicc fitContinuous(phy, d5, model = OU)$Trait1$aicc Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 Washington DC 20013-7012 slat...@si.edu www.eeb.ucla.edu/gslater From: r-sig-phylo-boun...@r-project.org [r-sig-phylo-boun...@r-project.org] On Behalf Of Antigoni Kaliontzopoulou [antig...@cibio.up.pt] Sent: Tuesday, November 20, 2012 5:50 AM To: r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] Flat likelihood and linear transformation of traits in fitContinuous Hi Florian, this is a problem frequently encountered in model fitting procedures and, yes, it has to do with a flat likelihood surface, or with the algorithm getting stuck in local optima. Unfortunately, as far as I know, there is nothing you can really do about it (other than maybe trying to fit the models with a different algorithm and see if you get reasonable estimates). An important point is, I think, that comparing AICs is useful, but it will not inform you on whether your models have really converged on a solution and whether the estimates they provide make sense. I think multiplying the trait by 10 should change proportionally your sigma^2 and optima, but (if there was a real ML solution) it should not change the rest of model descriptors. A fist way to get a feeling of whether you really are stuck with a strange likelihood surface would be to fit the models using different algorithms (but it sounds like you have already tried this): a well-optimized model will not change across algorithms (other than small differences in AICc calculation etc). A suggestion might be to also try fitting the models in OUwie. It uses exactly the same parameter and likelihood estimators as geiger's fitContinuous, but it will provide some additional results that may be helpful for evaluating model fitting (i.e. SEs for the estimated model parameters, eigencomposition of the Hessian matrix used for optimization etc). Check Beaulieu, J. M., Jhwueng, D. C., Boettiger, C., O'Meara, B. C. (2012). Modeling Stabilizing Selection: Expanding the Ornstein-Uhlenbeck Model of Adaptive Evolution. Evolution, 66, 2369--2383. doi:10./j.1558-5646.2012.01619.x Hope this helps A On 11/20/2012 10:30
Re: [R-sig-phylo] fitContinuous-Early Burst Model
Hi Nic and list, When using Maximum Likelihood to estimate model parameters, it's often useful to define bounds on the search, either for computational efficiency (some parameters, for example a BM rate of 1000 for log body mass in kgs, are so unlikely as to not be worth looking at) or for reasons due to the effect of a particular value on our ability to actually evaluate the likelihood. Luke Harmon programmed default bounds for all the trait evolution parameters when he wrote fitContinuous. If you go into the code, you'll be able to find them here: bounds.default - matrix(c(1e-08, 20, 1e-07, 1, 1e-06, 1, 1e-05, 2.99, 1e-07, 50, -3, 0, 1e-10, 100, -100, 100), nrow = 8, ncol = 2, byrow = TRUE) rownames(bounds.default) - c(beta, lambda, kappa, delta, alpha, a, nv, mu) colnames(bounds.default) - c(min, max) bound.default # print them to the screen The EB parameter is a, and you can see that the default bounds are -3 and 0. As I mentioned in an earlier response, the magnitude of the lower bound can affect your ability to fit the model to some datasets and this is especially so for EB: too negative a value can result in a VCV that is effectively or actually singular, meaning the likelihood cannot be evaluated at that parameter and the search will fail. You need to chose an appropriate lower bound if this is the case. One way to do this is to just play around with values for the bound to see if you can get it to work. If you go this route, you'll need to try a number of values and check that your ML estimate of a is not at the bound (i.e. your bound is too high and your estimate is the bound). The other option i suggested is to compute a reasonable maximum lower bound given the age of your tree; by that I mean the depth in time from root node to the present day or, in the case of a paleo tree, the youngest tip (although as we actually want the total amount of evolutionary time elapsed over the phylogeny, you can also think of your youngest tip as existing at the present). If the root - tip distance is T, and we place a lower limit on what we will allow the magnitude of the rate at the end of the EB process to be, expressed as a fraction of the starting rate (I think i explained this incorrectly before and implied that it was the actual rate at the tip) and we call this min.rate, then a reasonable lower bound for the EB model is log(min.rate) / T. I've used 10^-5 as a min.rate as, in my experience, rates for real data almost never decay beyond this amount, although I'm sure there are cases where this happens. At any rate, this almost always results in a reasonable lower bound for the EB process. In terms of running this on sections of your tree, I assume you are pruning out clades and fitting EB to them separately? In that case you would just obtain a reasonable bound for each by computing the depth from root node to tip in each pruned clade. Hope that helps. Graham Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 Washington DC 20013-7012 slat...@si.edu www.eeb.ucla.edu/gslater From: Nicolas Campione [nicolas.campi...@mail.utoronto.ca] Sent: Monday, November 05, 2012 2:04 AM To: Slater, Graham Cc: David Bapst; Frank Burbrink; r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] fitContinuous-Early Burst Model Hello All, Great discussion, I'm learning a lot about the inner workings of some of these packages. So for starters, all my taxa are fossil taxa, no extant, and in terms of branch lengths the shortest branches seem to be internal edges, which can be as short as 0.15 Ma (whole tree spans over 160 Ma). I initially tried Dave's suggestion to add a small amount of time to the branches. This initially worked perfectly, but then, once I tried time scaling the tree using other functions, or tried running the analysis with a slightly different topology, the error re-occurred. So, I would like to understand how to use the 'bounds' option a little better, as it seems that this is the most appropriate solution. I tried playing around with it and it works, but it seems to give me results that are different to those that I have previously obtained. I should expand here and say that I am also trying to fit the models at various points within the tree and it seems to me that the bound values will have to be modified depending where I am. Graham, you suggested that I compute the value of X in bounds = list(a=c(X,0)). Could you elaborate on this a bit more? When you say depth, is this node-based or edge length-based? And in terms of determining the minimum rate, I'm guessing you are referring to where the evolutionary rate is the lowest in the tree, so in the case of body size say, the lowest kg/Ma
Re: [R-sig-phylo] fitContinuous-Early Burst Model
I think it's convenient to think about how fitContinuous actually works here, or at least to think about the impact of the EB parameter on a transformed tree's branch lengths. Using simulations, try this: require(TreeSim); require(geiger); phy - sim.bd.taxa(50, 1, 0.1, 0.05, 1)[[1]][[1]]; ## simulate a 50 (extant) taxon tree with fossils phy$edge.length; # look at the branch lengths ebphy - exponentialchangeTree(phy, a = -0.4); ## now transform this tree under a strong EB process ebphy$edge.length; # take a look at these edge lengths plot(ebphy); # and plot it what you'll probably find is that the exponentialchangeTree has a lot of very very short edges and from plotting it you'll see that the only long edges are those towards the root. This is such a strong burst that it basically requires all the evolutionary change in our trait to occur in the two edges leading from the root. Such a burst is unlikely in real data. I find it helpful to think about rate half lives. An EB parameter of -0.4 gives a rate half life of log(2) / 0.4 = 1.732868 time units - that is it takes ~1.8 million years for the rate to halve from its initial value. To give an idea of what this translates to, the average mammalian order is ~ 50 my old, which would result in approximately 30 half lives elapsing. That, at least to me, has an exceedingly low prior probability! So while I agree with Dave that some paleo trees will have short internal edges or terminals that might mess up the EB model, I think what is more important is to recognize the effect that the EB parameter has on your edge lengths and to chose an appropriate lower bound when fitting the model to avoid this. Probably, for most paleo trees, an exponential change parameter of -0.4 is never going to be reasonable. Graham Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 Washington DC 20013-7012 slat...@si.edu www.eeb.ucla.edu/gslater From: r-sig-phylo-boun...@r-project.org [r-sig-phylo-boun...@r-project.org] On Behalf Of David Bapst [dwba...@uchicago.edu] Sent: Sunday, November 04, 2012 4:04 PM To: Frank Burbrink Cc: r-sig-phylo@r-project.org; Nicolas Campione Subject: Re: [R-sig-phylo] fitContinuous-Early Burst Model Yeah, my simulations with paleotree agrees with Frank's, except no values below a=0 will work at all without using addTermBranchLength; apparently the terminal ZLBs are having some effect on it. When I use addTermBranchLength, then I can get down to -0.4, at which point I get a slightly new warning message: fitContinuous(addTermBranchLength(trueTree),trait,model=EB,bounds=list(alpha=c(-0.4,0))) Fitting EB model: Error in solve.default(phyvcv) : system is computationally singular: reciprocal condition number = 2.04244e-16 On Sun, Nov 4, 2012 at 2:48 PM, Frank Burbrink frank.burbr...@csi.cuny.eduwrote: Just as a followup to my example (which contains numerous extinct taxa). Using these values for alpha fails: fitContinuous(tree,data,model=EB,bounds=list(alpha=c(-1,0))) Fitting EB model: Error in solve.default(phyvcv) : Lapack routine dgesv: system is exactly singular Above the lower bound of -0.4 (e.g., -0.3, -0.2 etc.) Works here: fitContinuous(tree,data,model=EB,bounds=list(alpha=c(-.3,0))) Fitting EB model: $Trait1 $Trait1$lnl [1] -264.4632 $Trait1$beta [1] 0.9233574 $Trait1$a [1] 0 $Trait1$aic [1] 534.9264 $Trait1$aicc [1] 535.1713 $Trait1$k [1] 3 On Sun, Nov 4, 2012 at 1:07 PM, Graham Slater gsla...@ucla.edu wrote: Hi Nic, It's not a problem with the branch lengths, at least not the non-ultrametricity of the tree. It's probably that the lower bound on the exponential change parameter is too low, resulting in a singular phylogenetic variance covariance matrix. Try changing the lower bound as follows: eb.fit - fitContinuous(phy, data, model =EB, bounds = list(a=c(X, 0))) # where X is the lower bound you chose. You can compute an appropriate value for X if you're willing to choose a minimum rate for the end of the EB process. In this case, the lower bound would be log(min.rate) / T where T is the depth of your tree and min.rate is the ending rate. I've found a value of 10^-5 is reasonable in most cases. Graham Graham Slater Peter Buck Post-Doctoral Fellow Department of Paleobiology National Museum of Natural History The Smithsonian Institution [NHB, MRC 121] P.O. Box 37012 Washington DC 20013-7012 slat...@si.edu www.eeb.ucla.edu/gslater On Nov 4, 2012, at 11:59 AM, Nicolas Campione wrote: Dear R Phylo List, I'm trying to test an Early Burst model of evolution against a tree using the 'fitContinuous' function in 'geiger'. However, I get the