Re: [R-sig-phylo] Bug in ape?
False alarm. I cleared my workspace and re-started R and the problem has gone away. I'm curious to know how it occurred but I'm happy that it has been resolved. We now return you to your scheduled R programming... Cheers, Simon. On 27/04/17 12:44, Simon Blomberg wrote: Hi Emmanuel and other list members. I am having some problems creating and working with trees (phylo objects) in ape. Has anyone else seen this error? The following code reproduces the errors: library(ape) tr1 <- rtree(10) plot(tr1) Error in .reorder_ape(x, order, index.only, length(x$tip.label), io) : object 'neworder_phylo' not found tr2 <- rcoal(10) Error in .reorder_ape(x, order, index.only, length(x$tip.label), io) : object 'neworder_phylo' not found tr3 <- read.tree(text="((a, b), (c, d));") compute.brlen(tr3) Error in .reorder_ape(x, order, index.only, length(x$tip.label), io) : object 'neworder_phylo' not found ## All have the same traceback: > traceback() 4: .reorder_ape(x, order, index.only, length(x$tip.label), io) 3: reorder.phylo(phy, "postorder") 2: reorder(phy, "postorder") 1: compute.brlen(tr3) > sessionInfo() R version 3.4.0 (2017-04-21) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 17.04 Matrix products: default BLAS: /usr/lib/openblas-base/libblas.so.3 LAPACK: /usr/lib/libopenblasp-r0.2.19.so locale: [1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_AU.UTF-8LC_COLLATE=en_AU.UTF-8 [5] LC_MONETARY=en_AU.UTF-8LC_MESSAGES=en_AU.UTF-8 [7] LC_PAPER=en_AU.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] ape_4.1 loaded via a namespace (and not attached): [1] Rcpp_0.12.10 magrittr_1.5 splines_3.4.0 MASS_7.3-47 [5] munsell_0.4.3colorspace_1.3-2 lattice_0.20-35 minqa_1.2.4 [9] stringr_1.2.0glmmADMB_0.8.3.3 plyr_1.8.4 tcltk_3.4.0 [13] tools_3.4.0 parallel_3.4.0 nnet_7.3-12 grid_3.4.0 [17] nlme_3.1-131 gtable_0.2.0 coda_0.19-1 fortunes_1.5-4 [21] lme4_1.1-13 lazyeval_0.2.0 tibble_1.3.0 Matrix_1.2-8 [25] R2admb_0.7.15nloptr_1.0.4 ggplot2_2.2.1 effects_3.1-2 [29] stringi_1.1.5compiler_3.4.0 scales_0.4.1 I am not getting the error on two other machines with the same OS and (as far as I can tell) the same setup. Any help would be greatly appreciated! Cheers, Simon. -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. Basically, I'm not interested in doing research and I never have been. I'm interested in understanding, which is quite a different thing. And often to understand something you have to work it out for yourself because no one else has done it. - David Blackwell ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
[R-sig-phylo] Bug in ape?
Hi Emmanuel and other list members. I am having some problems creating and working with trees (phylo objects) in ape. Has anyone else seen this error? The following code reproduces the errors: library(ape) tr1 <- rtree(10) plot(tr1) Error in .reorder_ape(x, order, index.only, length(x$tip.label), io) : object 'neworder_phylo' not found tr2 <- rcoal(10) Error in .reorder_ape(x, order, index.only, length(x$tip.label), io) : object 'neworder_phylo' not found tr3 <- read.tree(text="((a, b), (c, d));") compute.brlen(tr3) Error in .reorder_ape(x, order, index.only, length(x$tip.label), io) : object 'neworder_phylo' not found ## All have the same traceback: > traceback() 4: .reorder_ape(x, order, index.only, length(x$tip.label), io) 3: reorder.phylo(phy, "postorder") 2: reorder(phy, "postorder") 1: compute.brlen(tr3) > sessionInfo() R version 3.4.0 (2017-04-21) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 17.04 Matrix products: default BLAS: /usr/lib/openblas-base/libblas.so.3 LAPACK: /usr/lib/libopenblasp-r0.2.19.so locale: [1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_AU.UTF-8LC_COLLATE=en_AU.UTF-8 [5] LC_MONETARY=en_AU.UTF-8LC_MESSAGES=en_AU.UTF-8 [7] LC_PAPER=en_AU.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] ape_4.1 loaded via a namespace (and not attached): [1] Rcpp_0.12.10 magrittr_1.5 splines_3.4.0 MASS_7.3-47 [5] munsell_0.4.3colorspace_1.3-2 lattice_0.20-35 minqa_1.2.4 [9] stringr_1.2.0glmmADMB_0.8.3.3 plyr_1.8.4 tcltk_3.4.0 [13] tools_3.4.0 parallel_3.4.0 nnet_7.3-12 grid_3.4.0 [17] nlme_3.1-131 gtable_0.2.0 coda_0.19-1 fortunes_1.5-4 [21] lme4_1.1-13 lazyeval_0.2.0 tibble_1.3.0 Matrix_1.2-8 [25] R2admb_0.7.15nloptr_1.0.4 ggplot2_2.2.1 effects_3.1-2 [29] stringi_1.1.5compiler_3.4.0 scales_0.4.1 I am not getting the error on two other machines with the same OS and (as far as I can tell) the same setup. Any help would be greatly appreciated! Cheers, Simon. -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. Basically, I'm not interested in doing research and I never have been. I'm interested in understanding, which is quite a different thing. And often to understand something you have to work it out for yourself because no one else has done it. - David Blackwell ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] summary stats for comparative methods p-values
I can think of 2 better methods: 1. Bayes. Sample from the set of 100 trees, used as a prior on the "true" tree structure (assuming they are a true posterior set, as Joe points out). See: de Villemereuil et al. BMC Evolutionary Biology 2012, 12:102 http://www.biomedcentral.com/1471-2148/12/102 2. Model averaging using information theoretic methods. See package MuMIn. From a Bayesian perspective, this is problematic as the Akaike prior on each model depends on the data. But I don't think it is as problematic as averaging p-values. I don't like the methods that have previously been suggested. Think about the definition of a p-value: The probability of obtaining a statistic at least as extreme as that observed, conditional on the null hypothesis being true. What is the null hypothesis in this case? Probably that the estimate for a parameter is zero. For any single analysis (say via PGLS), this has to be conditional on the tree structure being correct! But you are using 100 different tree structures. Perhaps only one is correct (if there are no duplicates), and quite possibly none of them are correct. So trying to treat p-values as some sort of variable that you can obtain summary statistics for such as the median, mean, standard deviation etc. makes no sense because each p-value is defined in terms of a different null hypothesis for every analysis. This is different to when you might be testing the same null hypothesis, but on different data sets. For example, you might be trying to replicate a study from somebody else's lab (there should be more of this.) Then the distribution of p-values from different data sets should be Uniform under the null hypothesis. It is also different from meta-analysis methods where p-values may be combined from sources using different data. HTH, Simon. On 11/03/16 05:35, Simon Joly wrote: Alternatively, the proportion of trees that gave in a significant result (for a given threshold) could be of interest. It depends on your question. Simon Simon Joly, Ph.D. Chercheur, Jardin botanique de Montréal / Espace pour la vie Professeur associé, Dept. sciences biologiques, Université de Montréal Institut de recherche en biologie végétale (IRBV) 4101 Sherbrooke E., Montréal (QC) H1X 2B2, Canada T. +1 514.872.0344 / www.plantevolution.org 2016-03-10 11:41 GMT-05:00 David Bapst <dwba...@gmail.com>: Darrin, list- I'm sure there's people on this list with better answers, so I'll throw in first with what might be the wrong answer (but feels right to me), and say you more or less need to report all of them: like, show a full histogram of the p-values. At least, as a reviewer, that is what would convince me whether there was evidence or not to reject a hypothesis. But I'm sure there's some statistical argument again that too, in terms of taking a frequentist perspective across multiple versions of the same dataset. To the list: I look forward to hearing how I am wrong! ;) -Dave On Thu, Mar 10, 2016 at 4:54 AM, Darrin Hulsey <darrinhulseymin...@outlook.com> wrote: I am running a series of statistics on a subset of 100 trees that returns 100 different p-values. I was wondering what the best way to report summary statistics for these 100 p-values would be (median?, measure of variance in all 100 p-values?). Thanks for any insight. [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- David W. Bapst, PhD Adjunct Asst. Professor, Geology and Geol. Eng. South Dakota School of Mines and Technology 501 E. St. Joseph Rapid City, SD 57701 http://webpages.sdsmt.edu/~dbapst/ http://cran.r-project.org/web/packages/paleotree/index.html ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. Basically, I'm not interested in doing research and I never have been. I'm interested in understanding, whic
Re: [R-sig-phylo] interpreting pGLS
Hi Rizwana, There is no reason why we would expect the phylogenetic signal in the raw variables to be the same as the phylogenetic signal from the regression analysis (K or lambda), as in the regression, you are really looking at the phylogenetic signal of the _residuals_, which may be quite different. Also, the r-squareds from pgls in caper are not calculated in the same way as r-squareds from an ordinary least squares analysis. In fact, there is no "correct" way to calculate r-squared for any model other than the OLS model, as OLS r-squared is based on the residual variance. For other types of models, including GLS, "residual variance" is not a well-defined concept. Personally, I don't use r-squared for any model other than OLS models. There are far better ways to conduct model criticism. Cheers, Simon. On 28/01/16 15:57, Rizwana Rumman wrote: Dear R-sid-phylo list, I am having some problems to interpret the results of a co-evolution analysis and I hope you can help me to get my head around things. I have two variables that each show significant non-random phylogenetic signal (K and lambda) and appear to be significantly correlated in a non-phylogenetic regression; but when I am performing a pGLS in caper, the lambda is estimated to be zero and the two variables shows the exact same R-square and level of significance as estimated from the non-phylogenetic regression analysis. I also have two other sets of variables that show lambdas >0 (very similar values from caper and ape) in the pGLS analysis, but for these variables, the R-square values are smaller from pGLS than those estimated from non-phylogenetic regression. I guess my question is if I can safely interpret that, in the first case, phylogenetic history does not significantly affect my trait correlations (indicated by lambda=0 and very similar simple linear regression and pGLS results) and in the latter case (i.e. smaller R-squares after correcting for phylogeny), non-phylogenetic regressions slightly overes! ti! mates the correlations of the traits. Many thanks for the help! Cheers, Rizwana [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. Basically, I'm not interested in doing research and I never have been. I'm interested in understanding, which is quite a different thing. - David Blackwell ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] plot trees distribution
As well as Liam's suggestions, have you looked at DensiTree? https://www.cs.auckland.ac.nz/~remco/DensiTree/ There is also the function densiTree in package phangorn that does the same type of thing. Cheers, Simon. On 06/01/16 09:01, Giulio V. Dalla Riva wrote: > > Dear Everybody, > > > I have a list of phylogenies downloaded from the wonderful > birdtree.org (in fact, it's a multiPhylo object) that I want to plot. > > > The idea is obtain a representation of the empirical distribution of > the phylogenies, in a graphical style similar to the > phytools::fancyTree(type = "phenogram95") representation of the > ancestral state reconstruction of a continuous character (see the > fourth plot in http://www.phytools.org/eqg/Exercise_5.2/ > <http://www.phytools.org/eqg/Exercise_5.2/>): > > plot of chunk unnamed-chunk-6 > > > In my case I would like to overlap each tree in the list, in some nice > way. I can ladderize the first tree and use the produced tip order for > all the other trees. This should (at least in my mind) produce an > immediate representation of the metric and topological > variance present in the list (which, in my case, is reasonably little). > > > I could just compute a bootstrap, and plot support values on the > splits. But it's not the same. > > > I thought to remember there was a builtin function in some package, > but I may be wrong. > > I have attached the nexus file for reference (this is a 100 trees, > usually we would have 1000 or more). > > > Best wishes, > > > Giulio Valentino Dalla Riva > > P.S. A bonus point if anybody proposes a solution that works > with|plotSimmap| for painted trees. > > PhD @ Biomathematics Research Centre > Room 523 - Erskine Building > University of Canterbury > Christchurch - New Zealand > > phone: +64 3 364 2987 ext 4869 > > > ___ > R-sig-phylo mailing list - R-sig-phylo@r-project.org > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo > Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. Basically, I'm not interested in doing research and I never have been. I'm interested in understanding, which is quite a different thing. - David Blackwell [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] How to embed a function within a matrix
You need to treat the functions as expressions. You cannot mix data types in a matrix (or vector) so you cannot have elements of which some are numbers and some are functions. But if you treat all elements as expressions, it is possible to work with that. A long time ago I wrote some code showing how to do this for elasticity and sensitivity analysis of vital rates. See this thread: https://stat.ethz.ch/pipermail/r-help/2007-October/143848.html and in particular, https://stat.ethz.ch/pipermail/r-help/attachments/20071027/423b83ac/attachment.pl The code is implemented in popbio::vitalsens (thanks to Chris Stubben), which you should study. See ?popbio::vitalsens for examples. Good luck! Simon. PS Isn't this question more appropriate for r-sig-ecology? On 05/01/16 08:08, Alexandre F. Souza wrote: Dear friends, I am creating a population ecology course for undergraduate students in Brazil. I have already showed how to proceed with basic population matrix models in R using basic functions as well as the popbio package and its functions that simulate stochastic dynamics of deterministic matrices. However, I have not been successful in finding resources to implement accessible density dependent matrix dynamics. I find quite complex IPM models, which are out of the scope of the course, or too simplistic approaches like simple ceiling. I need to use a matrix of transitions in which at least one of the transitions would be replaced by a density dependent function related to the density of a given stage. Something like C = matrix(c(0, 0, 5.905, 0.368, 0.639, 0.025, 0.001, 0.152, 0.051), nrow = 3, byrow = TRUE, dimnames = list(fases, fases)) seedlings vegetative reproductive seedlings 0.000 0.000 5.905 vegetative 0.368 0.639 0.025 reproductive0.001 0.152 0.051 This matrix would be repeatedly multiplied by a vector of population densities like fases = c(1,2,3,4,5,6,7,8) However, I need to replace for instance C(2,1) by a function related to the density of seedlings in the previous year, like Value = 0.568/(1+0.584*N), in which N is the number of seedlings in the previous year. Do anyone have an idea on how to implement that? I implemented that in Excel, but since I have been showing all course contents in R I would not like to restrict this theme to Excel. Thank you very much in advance, Alexandre -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. Basically, I'm not interested in doing research and I never have been. I'm interested in understanding, which is quite a different thing. - David Blackwell ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Non Parametric PGLS
Hi Sérgio. Liam is right. But we do expect the normalised residuals to be approximately Normal. You can calculate the normalised residuals by pre-multiplying the raw residuals by the inverse of the Cholesky decomposition of the phylogenetic variance-covariance matrix, and then dividing by the estimate of the residual standard deviation (ie sigma). You may have to plug in a value for any parameters that are co-estimated along with the regression (Pagel's lambda, etc.). If you use the gls function in the nlme package to fit your model, then it's all easy: fit - gls(response~explanatory, correlation=corPagel(1, phy=my.tree), data=dat)) res - residuals(fit, type=normalized) Then you can do some test for normality on those (I don't ordinarily recommend such things, although SnowsPenultimateNormalityTest in the TeachingDemos package is the best I have seen). More usefully, you can do a normal quantile-quantile plot to graphically see whether your normalised residuals are normal enough: qqnorm(fit, form=~residuals(., type=n), abline=c(0,1)) See page 239 in Pinheiro and Bates (2000) Mixed-effects models in S and S-PLUS. Cheers, Simon. On 18/06/15 03:09, Liam J. Revell wrote: Hi Sérgio. It might be worth pointing out that we do not expect that the residuals from a phylogenetic regression to be normal. I described this with respect to the phylogenetic ANOVA on my blog (http://blog.phytools.org/2013/02/a-comment-on-distribution-of-residuals.html), but this applies equally to phylogenetic regression. All the best, Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 6/17/2015 12:40 PM, Sergio Ferreira Cardoso wrote: Hello all, I'm having a problem with a Multiple Regression PGLS analysis that I'm performing. The residuals are not normal and it's difficult to bring them to normality. In these cases, are there any alternatives to the linear model? I know that for non-phylogenetic analyses other models exist, but is there any alternative method for phylogenetic analysis? Thanks in advance. Best regards, Sérgio. ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. Basically, I'm not interested in doing research and I never have been. I'm interested in understanding, which is quite a different thing. - David Blackwell ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] [R] APE
It's hard to tell what you did exactly. Did you do: mytree - compute.brlen(mytree) ? That should have worked. I've cc'ed this message to r-sig-phylo, which is the more appropriate forum. Cheers, Simon. On 10/03/15 07:05, Beth Williams wrote: Dear All, I am having some trouble with R and would be extremely grateful if anyone has a way around this. I have loaded a nexus tree from PAUP into R using the command read.nexus and this loaded, it was reported as rooted; with no branch lengths. I then used the command compute.brlen(mytree) to compute the branch lengths and this was reported as rooted; includes branch lengths. I added my community data (samp) and then tried to compute the phylogenetic diversity with the command pd(samp, mytree, include.root=TRUE) however it said it could not calculate the PD as there were no branch lengths. Is there a way to incorporate the branch lengths into the calculation for PD? Thanks for your time, Beth [[alternative HTML version deleted]] __ r-h...@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. Basically, I'm not interested in doing research and I never have been. I'm interested in understanding, which is quite a different thing. - David Blackwell ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] The Value of Responding on R-sig-phylo, was: phytools - evaluating significance of pgls.Ives
Hi Everyone, I agree with what has been said so far. Tony, don't take my comments to heart. Your contribution is most welcome as your opinion is highly respected (by me and hopefully by others). Same goes for all the other authorities on the list e.g Liam, Joe, Ben, Jarrod, Ted, Marguerite and others. I don't contribute much these days due to time constraints, which is a shame because I think that lists like this do help people, even if particular answers to particular questions are not quite right or are incomplete in some sense for the original poster. And I like helping people too! Questions can get answers that evolve into interesting threads in their own right (like this one, although a bit off-topic). Mailing lists like this are more of a group conversation rather than a simple do this or don't do that type of thing. I've been on lists since the '90s (does anyone remember sci.bio.evolution on USENET?). This is one of the best lists I've been on, largely because I know that some of the best researchers in the field read the list and sometimes respond. Tony, I know that statistical advice is easier to give when the person and data are live right in front of you. That's why I invariably only see people by appointment when I give stats advice. I don't like giving advice over email and I don't give advice over the phone. (Actually, I never answer my phone but that's another story.) It's a bit different on the list because other authorities can read the advice an chime in if they disagree with anything I say. Hopefully we can all benefit from that. Keep on posting everybody! Simon. On 03/03/15 03:43, Hilmar Lapp wrote: +100 !! -hilmar On 3/2/15, 11:07 AM, David Bapst wrote: Off-topic, but I wanted to comment on Anthony's wariness about commenting on R-sig lists... Yes, it can certainly be very difficult to give advice that is tailored specifically to the problem of a particular worker on R-sig lists, particularly as one can't just tell the other person to open their data files and show them to you. However, answering a question on R-sig lists (regardless of whether it is the right answer, but what answer remains right forever anyway?) records that answer virtually forever, for future posterity. This means the answer can be found by any future individuals who encounter this problem, via a simple google search. Since I first found R-sig-phylo in early 2010 (five years ago!), I have made almost constant use of the knowledge base contained with R-sig-phylo's archive. It doesn't mean the information is always right, or always the best answers for the actual person who asked originally, but it certainly helps point out the right line of thinking or the right literature to investigate. StackExchange discussion archives have become just as valuable (but there appears to be very little R phylo discussion over there). It feels like the amount of discussion on the list has dropped off slowly over the last year, and as phylogenetics in R seems as widely used as ever, I have often wondered if this reflects that most issues people may run into are now more easily solved with google searches leading to the R-sig-phylo archives. Anyway, I just hope that this element is not forgotten about why I think replying to question on R-sig-phylo remains a valuable contribution to our community! Cheers, -Dave Bapst On Mon, Mar 2, 2015 at 7:58 AM, Anthony Ives ari...@wisc.edu wrote: Simon and Ben, Of course, sample size of 8 is going to be an issue in almost any analysis. But sometimes that is all the data there are. Incidentally, this exchange reminded me that I’m still wary of making comments on r-sig. If somebody comes into my office, I have the time to discuss with them their data, so I can learn more about it. Then I feel I can at least make informed recommendations for analyses — they might still be badly wrong recommendations, but at least they are informed. I’m still uncomfortable about making suggestions on r-sig, when I don’t really have full information, or the time to think. Therefore, the few comments I’ve made have been very general about methods, rather than specific about data sets. I think this is just a matter of me waking up to the 21st century. I do like the idea of crowdsourcing; I just need to get comfortable with it. Cheers, Tony Anthony Ives Department of Zoology 459 Birge Hall (4th floor, E end of bldg) UW-Madison Madison, WI 53706 608-262-1519 On Mar 1, 2015, at 10:53 PM, Simon Blomberg s.blombe...@uq.edu.au wrote: Hi Ben, Yes, you would have to assume constant variance across species to use N=24. I think that is the only option. But given that biological data often has a positive mean-variance relationship, again I'm dubious about the exercise. YMMV, however! Cheers, Simon. Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland
Re: [R-sig-phylo] phytools - evaluating significance of pgls.Ives
Am I missing something? The OP only has 8 species in the data set. I wouldn't put much store in fancy PCM modelling based on such a small data set. And 3 individuals per species is not enough for a good estimate of the within-species variance. Simon. Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem Basically, I'm not interested in doing research and I never have been. I'm interested in understanding, which is quite a different thing. - David Blackwell From: R-sig-phylo [r-sig-phylo-boun...@r-project.org] on behalf of Anthony R Ives [ari...@wisc.edu] Sent: Monday, March 02, 2015 2:14 PM To: Andrea Berardi Cc: r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] phytools - evaluating significance of pgls.Ives Andrea, I second Liam’s recommendation to use a LRT. For measurement error, the latest code I have in matlab is MERegPHYSIGv2.m, which does both measurement error and an OU or Pagel-lambda transform (see Johnson, M. T. J., A. R. Ives, J. Ahern, and J. P. Salminen. 2014. Macroevolution of plant defenses against herbivores in the evening primroses. New Phytologist 203:267-279). Measurement-error models are always going to have difficulties at parameter boundaries; for example, if the assumed measurement error is large, it can exceed the observed variation in the data, which of course causes problems (statistical and logical). In MERegPHYSIGv2.m, I did a round or two of simulated annealing first, before polishing the results with a Nelder-Mead optimizer. It seems like you could do the same with Liam’s code pretty easily by changing the method of optimization (using edit()). Before doing this, thought, I would take a careful look at your data and your estimates of measurement error. An easy diagnostic is to start with 10% of your estimated measurement standard errors and then increase slowly to 100%. When I have done this, I’ve been able to see problems when parameter values go awry. It is not a fail-safe diagnostic in any way, but it can help. Cheers, Tony Anthony Ragnar Ives Department of Zoology UW-Madison Madison, WI 53706 608-262-1519 On Mar 1, 2015, at 9:42 PM, Liam J. Revell liam.rev...@umb.edu wrote: Hi Andrea. This is not presently implemented, but since this is a likelihood method it would be straightforward to constrain to a slope of zero and then do a LR test. This would be probably be the easiest way to test a hypothesis about the regression. That being said, as noted in the function documentation, some problems have been reported with the optimization algorithm for this model, which is simple and thus may fail to find the ML solution. Consequently, I would encourage you to look for other implementations of the method so that you can be confident in your result. I'm not aware of one in R at this time. All the best, Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 3/1/2015 10:31 PM, Andrea Berardi wrote: Hi all, I'm just learning how to do PGLS analyses, and I'm looking for advice on how to evaluate the significance of the regression fit using pgls.Ives in the phytools package. I'm using this function because it incorporates sampling error of species means, and my data has about 3 individuals per species, with 8 species. My goal is to test whether a flower trait predicts the leaf trait, while controlling for shared ancestry. Here is the output from pgls.Ives: fit - pgls.Ives(Tree, Flower_trait, Leaf_trait) fit $beta [1] 96.3963098 0.1292656 $sig2x [1] 22218901073 $sig2y [1] 23027587 $a [1] -10063.150 -1204.422 $logL [1] -158.2337 $convergence [1] 0 $message [1] CONVERGENCE: REL_REDUCTION_OF_F = FACTR*EPSMCH I am also running pgls on species averages for the traits using the gls function in nlme and the corBrownian and corMartins functions in ape. But, we are interested in incorporating the within-species variation in our small dataset. Any suggestions would be welcome! Thanks for your help, Andrea ~~ Andrea Berardi, PhD Postdoctoral Researcher, Smith Lab EBIO, University of Colorado-Boulder ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable
Re: [R-sig-phylo] phytools - evaluating significance of pgls.Ives
Hi Ben, Yes, you would have to assume constant variance across species to use N=24. I think that is the only option. But given that biological data often has a positive mean-variance relationship, again I'm dubious about the exercise. YMMV, however! Cheers, Simon. Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem Basically, I'm not interested in doing research and I never have been. I'm interested in understanding, which is quite a different thing. - David Blackwell From: R-sig-phylo [r-sig-phylo-boun...@r-project.org] on behalf of Ben Bolker [bbol...@gmail.com] Sent: Monday, March 02, 2015 2:49 PM To: r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] phytools - evaluating significance of pgls.Ives -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 15-03-01 11:40 PM, Simon Blomberg wrote: Am I missing something? The OP only has 8 species in the data set. I wouldn't put much store in fancy PCM modelling based on such a small data set. And 3 individuals per species is not enough for a good estimate of the within-species variance. Simon. Agree wholeheartedly with the first point -- but for the second, isn't 24 rather than 8 the relevant number for estimating within-species variance (since presumably we are assuming the same variance within every species, thus we can effectively pool within-species variation \across species for this purpose) ? Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem Basically, I'm not interested in doing research and I never have been. I'm interested in understanding, which is quite a different thing. - David Blackwell From: R-sig-phylo [r-sig-phylo-boun...@r-project.org] on behalf of Anthony R Ives [ari...@wisc.edu] Sent: Monday, March 02, 2015 2:14 PM To: Andrea Berardi Cc: r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] phytools - evaluating significance of pgls.Ives Andrea, I second Liam’s recommendation to use a LRT. For measurement error, the latest code I have in matlab is MERegPHYSIGv2.m, which does both measurement error and an OU or Pagel-lambda transform (see Johnson, M. T. J., A. R. Ives, J. Ahern, and J. P. Salminen. 2014. Macroevolution of plant defenses against herbivores in the evening primroses. New Phytologist 203:267-279). Measurement-error models are always going to have difficulties at parameter boundaries; for example, if the assumed measurement error is large, it can exceed the observed variation in the data, which of course causes problems (statistical and logical). In MERegPHYSIGv2.m, I did a round or two of simulated annealing first, before polishing the results with a Nelder-Mead optimizer. It seems like you could do the same with Liam’s code pretty easily by changing the method of optimization (using edit()). Before doing this, thought, I would take a careful look at your data and your estimates of measurement error. An easy diagnostic is to start with 10% of your estimated measurement standard errors and then increase slowly to 100%. When I have done this, I’ve been able to see problems when parameter values go awry. It is not a fail-safe diagnostic in any way, but it can help. Cheers, Tony Anthony Ragnar Ives Department of Zoology UW-Madison Madison, WI 53706 608-262-1519 On Mar 1, 2015, at 9:42 PM, Liam J. Revell liam.rev...@umb.edu wrote: Hi Andrea. This is not presently implemented, but since this is a likelihood method it would be straightforward to constrain to a slope of zero and then do a LR test. This would be probably be the easiest way to test a hypothesis about the regression. That being said, as noted in the function documentation, some problems have been reported with the optimization algorithm for this model, which is simple and thus may fail to find the ML solution. Consequently, I would encourage you to look for other implementations of the method so that you can be confident in your result. I'm not aware of one in R at this time. All the best, Liam Liam J. Revell, Assistant Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 3/1/2015 10:31 PM, Andrea Berardi wrote: Hi all, I'm just learning how to do PGLS analyses, and I'm
Re: [R-sig-phylo] MCMC converged or not im MCMCglmm
There are no tests that can conclusively prove chain convergence. You are correct to look at the coda package. There you will find functions that give an approximate test. See ?gaweke.diag and ?gelman.diag. Cheers, Simon. On 30/08/13 00:22, Sereina Graber wrote: Hi everyone, I am doing simulations of phylogenetic data and subsquently use MCMCglmm() to analyse this simulated data. Now very often, the data does not converge even if I drastically increasing the number of iterations. Now I would like to save when the MCMC converged and when not? I have read about this coda package in R, however, a lot of different things seems to be possible with that package. I simply wanna have a function which tells me whether my chain converged or not that I can save that with a simple yes or no in my simulation output. Does anyone know a simple funciton to do that? For any help I am very grateful. Best, Sereina -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Senior Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. Statistics is the grammar of science - Karl Pearson. [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] PGLS on a sample of trees
This is easy to do using Bayesian methods. See: http://www.biomedcentral.com/content/pdf/1471-2148-12-102.pdf Cheers, Simon. On 14/03/13 22:25, Ferguson-Gow, Henry wrote: Hi I used the method of Kuhn et al (2011) to resolve polytomies in my tree leaving me with a posterior of thousands of trees. I was wondering how I would go about using PGLS on a sample of these trees so that the uncertainty in the resolution of the polytomies is incorporated into my analysis (i.e. some kind of combined coefficients and confidence intervals taken from multiple PGLS tests). This seems preferable to using either a tree summarised from the posterior or the initial tree complete with polytomies/arbitrarily resolved polytomies. Thanks Henry [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.evolutionarystatistics.org Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. Statistics is the grammar of science - Karl Pearson. ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] tips to root brach length
try diag(vcv.phylo(your.tree)) Cheers, Simon. On 03/09/12 19:17, boyang zhe wrote: Hi, everyone I am new to ape and R programming. I have an unrooted tree. I use root() function to reroot the tree by one outgroup. and then I try to extract root to tip distance. the tree$edge.length only store the brach length between each tip and corresponding internal node. So how I extract the distance between root and the tips? Thanks first! Boyang ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] possible issue when incorporating a phylogenetic correlation structure (corPagel) in a linear mixed effect model (lme)
= corPagel(lambda[i],tree,fixed=TRUE)) loglike.t1[i] - m1$logLik u = gl(1,1,length(d$y)) m2 - lme(y ~ x,random = list(u = pdIdent(form=~factor(date)-1)), correlation = corPagel(lambda[i],tree,fixed=TRUE)) loglike.t2[i] - m2$logLik } The two curves are completely different: With t2, we obtain a reasonable curve, with a maximum at the previously estimated lambda value. With t1, the curve looks really strange: there is a discontinuity at the origin, i.e., for lambda=0 the log-like value is higher than for lambda0, and for lambda0 the log-like is only increasing with lambda. Thus a third question: why are these two profile log-likelihood curves different? The final question is: in which technique can we believe? I’m using the version 2.14.1 of R, 3.1-96 of nlme, and 3.0-1 of ape. Best, Rudolf -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.uq.edu.au/~uqsblomb/ Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. Statistics is the grammar of science - Karl Pearson. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] possible issue when incorporating a phylogenetic correlation structure (corPagel) in a linear mixed effect model (lme)
Fixed effects: y ~ x Value Std.Error DF t-value p-value (Intercept) -329242.7 324363.4 438 -1.01504 0.3106 xY 3.7 0.0 438 127.43249 0. Correlation: (Intr) xY 0.014 Second question: how is that possible? The two outputs should be the same. To try to understand what is going on, we can compute the profile log-likelihood curve, as a function of lambda, for both techniques. lambda - seq(0,1,0.01) loglike.t1 - rep(NA,length(lambda)) loglike.t2 - rep(NA,length(lambda)) for (i in 1:length(lambda)){ m1 - lme(y ~ x, random = ~1|date, correlation = corPagel(lambda[i],tree,fixed=TRUE)) loglike.t1[i] - m1$logLik u = gl(1,1,length(d$y)) m2 - lme(y ~ x,random = list(u = pdIdent(form=~factor(date)-1)), correlation = corPagel(lambda[i],tree,fixed=TRUE)) loglike.t2[i] - m2$logLik } The two curves are completely different: With t2, we obtain a reasonable curve, with a maximum at the previously estimated lambda value. With t1, the curve looks really strange: there is a discontinuity at the origin, i.e., for lambda=0 the log-like value is higher than for lambda0, and for lambda0 the log-like is only increasing with lambda. Thus a third question: why are these two profile log-likelihood curves different? The final question is: in which technique can we believe? I’m using the version 2.14.1 of R, 3.1-96 of nlme, and 3.0-1 of ape. Best, Rudolf -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat. Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.uq.edu.au/~uqsblomb/ Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. Statistics is the grammar of science - Karl Pearson. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] query: how to use an existing covariance matrix directly in a gls procedure in R?
I used to do this before ape existed. Here is some example code. library(ape) data(bird.orders) # some made up data dat - data.frame(y=rnorm(23), x=rnorm(23)) rownames(dat) - bird.orders$tip.label mat - vcv(bird.orders, cor=TRUE) fit - gls(y~x, correlation=corSymm(mat[lower.tri(mat)], fixed=TRUE), data=dat) #assuming ultrametric tree # compare with corBrownian structure: fit2 - gls(y~x, correlation=corBrownian(phy=bird.orders), data=dat) # are the regression coefficients the same? all.equal(coef(fit), coef(fit2)) [1] TRUE I hope this helps, Simon. From: r-sig-phylo-boun...@r-project.org [r-sig-phylo-boun...@r-project.org] On Behalf Of Tomlinson, Kyle [kyle.tomlin...@wur.nl] Sent: Friday, July 08, 2011 5:30 AM To: 'r-sig-phylo@r-project.org' Subject: [R-sig-phylo] query: how to use an existing covariance matrix directly in a gls procedure in R? Dear Dr Paradis I trust that I am allowed to contact you directly like this? If not, my sincere apologies. I have constructed the covariance matrix for a phylogenetic tree (using the vcv() command), which i would like to use directly in a gls() procedure, instead of using the tree directly as appears to be done in ape (because I only use a very small part of the tree i have constructed..). {I am doing this in order to check my own gls procedure and the gls procedure of PHYLOGr which I dont trust.} Do you know if this can be done? I have tried to figure it out by considering the corStruct class options in nlme, but none of these options seems to allow one to directly input an existing covariance matrix, and I simply dont understand the object construction methods of R well enough to build a suitable corStruct object myself. I hope you can help. sincerely Kyle Tomlinson Resource Ecology Group Centre for Ecosystem Studies Wageningen University Droevendaalsesteeg 3a 6708 PB Wageningen The Netherlands Phone: +31 317 485314 Fax: +31 317 484845 email: kyle.tomlin...@wur.nl [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] comparative analysis using multiple regression of contrasts?
On 24/05/11 01:38, Julien Claude wrote: Dear all, I have one factor and several covariates and I would like to know which of these explanatory variables are more likely to explain a response variable in a comparative analysis. The factor is a dummy variable (0-1). At first, my strategy would be to use contrasts of all variables and then producing a type II anova on contrasts, I am however not sure that I am right in doing so with the dummy variable. data(bird.orders) Y-rnorm(23) X1-rbinom(23,1,0.5) X2-rnorm(23) pY-pic(Y, bird.orders) pX1-pic(X1, bird.orders) pX2-pic(X2, bird.orders) library(car) Anova(lm(pY~pX1*pX2-1)) Can we directly use the contrast for the dummy variable?, or should we transform this contrast in specifying some special stuff behind the expected ancestral values (The dummy variable probably does not follow a brownian motion model...). The solution may be here but I have no clear idea about how to transform it. Yes, you can directly use the contrasts of the dummy variable. There is no problem with non-Brownian explanatory variables in pic regression. The usual regression model assumes all the covariance is in the response variable only. The calculation of contrasts for the explanatory variable is a necessary step to getting the correct parameter estimates. No further meaning should be attached. (Although perhaps I hold an extreme view on this.) I wonder also how to adopt this strategy by using gls. DF.bird- data.frame(X1,X2, Y) bm.bird- corBrownian(phy = bird.orders) m1- gls(Y~X1*X2, correlation = bm.bird, data = DF.bird) m2- gls(Y~X2*X1, correlation = bm.bird, data = DF.bird) anova(m1) How to get the mean squares and variance estimate in pgls ? With a one factor analysis F-values are exactly the same, but when the number of covariates is greater than 1, F values can differ in some extent, I wonder why. that's because by default, the analysis gives you sequential sums of squares tests (Type I for the SAS people). To get type II (marginal) tests, which are not affected by the order of fitting, use anova(m1, type=marginal) In the gls stuff, I understand that incorporating the expected correlation of the response should follow a BM, but this is certainly not true for all the predictors)? It doesn't matter for the predictor variables. The covariances of the predictor variables is not used in the calculation of the gls regression. If you want to take the covariances of the predictors into account, you need to use an errors-in-variables model. Or you could calculate non-central correlations (correlation through the origin) instead of regression. Simon. Let me know if this strategy makes sense or if it is flawedThanks for your input Julien --- Julien CLAUDE Institut des Sciences de l'Evolution, 34095 Montpellier cedex 5 Université de Montpellier 2 ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Simon Blomberg, BSc (Hons), PhD, MAppStat. Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.uq.edu.au/~uqsblomb/ Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem Statistics is the grammar of science - Karl Pearson. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] phylogenetically-informed Reduced Major Axis regression in R?
I think it is important to point out, that while RMA may superficially be an attractive method, it relies on the ratio of error variances being unity. This is almost always incorrect. It usually results in a massive over-correction of the slope bias with respect to the OLS estimator. That is, the slope is made much too steep. I would not encourage anyone to use RMA for anything other than in the case where there is sufficient within-species replication to estimate the error variances with some precision, and then use an appropriate generalization of RMA that allows for the variance ratio to be other than unity. Fiddling around with phylogenetically-informed RMA is like rearranging the deck chairs on the Titanic. The problem is discussed in depth in: R. J. Carroll and D. Ruppert 1996, The Use and Misuse of Orthogonal Regression in Linear Errors-in-Variables Models. The American Statistician, Vol. 50, No. 1, pp. 1-6 Carroll et al. 2006, Measurement Errors in Nonlinear Models. A Modern Perspective. 2nd Edition, Chapman Hall. Chapter 3. Simon. This is On 21/04/11 01:13, Joe Felsenstein wrote: Liam said: Just calculating the slope is straightforward. For tree and column vectors x y (in order tree$tip.label): The relevant point to keep in mind is that once you have made maximum likelihood estimates of the means, variances and covariances of the variables, the Reduced Major Axis is simply a function of these, and its ML estimate is that function of the ML estimates of the covariances. You don't need to do any separate ML estimation for the RMA. If you want to test hypotheses about the RMA, if you can recast them as hypotheses about the slopes and correlations (say that the slope is zero) then the test can be done there, and no separate test of the RMA is needed. In the next release of my program Contrast in PHYLIP, I will have an option to print out the RMA and its other axes, which did not involve anything more complicated than computing them from the covariances that it was already estimating. Joe Joe Felsenstein, j...@gs.washington.edu Dept. of Genome Sciences, Univ. of Washington Box 355065, Seattle, WA 98195-5065 USA ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Simon Blomberg, BSc (Hons), PhD, MAppStat. Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.uq.edu.au/~uqsblomb/ Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem Statistics is the grammar of science - Karl Pearson. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Model-Selection vs. Finding Models that Fit Well
-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Luke Harmon Assistant Professor Biological Sciences University of Idaho 208-885-0346 lu...@uidaho.edu ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.uq.edu.au/~uqsblomb/ Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem Statistics is the grammar of science - Karl Pearson. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] ape-package/ possibility to get the transformed residuals (phylogentic corrected residuals)?
Actually, the function residuals.gls calculates 3 different types of residuals: response or raw residuals, pearson or standardized residuals, and normalized residuals. For diagnostic purposes, you probably want the normalized residuals. The default is the response residuals (although the ?residuals.gls says that the pearson residuals are the default. This seems to be a bug in the help file). See ?residuals.gls Also, it's better programming practice to use the extractor function (residuals.gls in this case), rather than using result$residuals directly, since the contents of slots can change on the whim of the developer, but the extractor function should, in general, return what you ask for. Cheers, Simon. On 13/10/10 03:40, Liam J. Revell wrote: Hi Jan: Yes. gls() returns an object of class gls which contains a vector containing the residuals as one of its components. So for instance you might compute: result-gls(y~x,data=test.data,correlation=corPagel(phy=tree,value=1,fixed=FALSE)) in which case: result$residuals contains the model residuals. You can also get the residuals using the function resid(), i.e.: resid(result) Residual standard error is: result$sigma so residual sum of squares should just be: resid.SS-n*result$sigma^2 for n species. Hope this is helpful. - Liam Liam J. Revell NESCent, Duke University web: http://anolis.oeb.harvard.edu/~liam/ NEW email: lrev...@nescent.org Werner, Jan wrote: Dear R-sig-phylo members, for some analyses I used the gls and gnls functions (package nlme) and controlled for phylogenetic effects by the phylogenetic correlation structure corPagel provided by the ape-package. My question is, is there a possibility to get the transformed residuals (phylogentic corrected residuals) or the phylogenetic corrected residual sum of squares? Many thanks for your efforts. Best regards Jan Werner ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.uq.edu.au/~uqsblomb/ Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem Statistics is the grammar of science - Karl Pearson. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] PIC vs. PGLS
) To: Anne Kempelkem...@ips.unibe.ch Cc: r-sig-phylo@r-project.org Hi Anne, You should build your model based on your scientific hypothesis - not on which trait shows phylogenetic signal. However, GLS corrects for non-independence in the residual error of y given X - non-independence which may be due to (for instance) phylogenetic history. Incidentally, if our observations for X are non-independent due to the phylogeny, but the residual error in y given X is uncorrelated, than GLS is not necessary (and will actually give us an estimate with inflated variance). I have a paper about exactly this topic that was recently published online in the new journal Methods in Ecology and Evolution. The citation is: Revell, L. J. 2010. Phylogenetic signal and linear regression on species data. /Methods in Ecology and Evolution/ Online Early View. And the article can be found at the following URL: http://dx.doi.org/10./j.2041-210X.2010.00044.x or on my website (URL below). I hope this is of some help. - Liam Liam J. Revell NESCent, Duke University web: http://anolis.oeb.harvard.edu/~liam/ NEW email: lrev...@nescent.org ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.uq.edu.au/~uqsblomb/ Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem Statistics is the grammar of science - Karl Pearson. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] r-squared for PGLS regression
A good general reference for r^2 is: @ARTICLE{generalized.r^-1, author = {Nagelkerke, N. J. D.}, title = {A Note on a General Definition of the Coefficient of Determination}, journal = {Biometrika}, year = {1991}, volume = {78}, pages = {691-692}, number = {3} } Simon. Simon Blomberg, BSc (Hons), PhD, MAppStat. Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia T: +61 7 3365 2506 email: S.Blomberg1_at_uq.edu.au http://www.uq.edu.au/~uqsblomb/ Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem Statistics is the grammar of science - Karl Pearson. -Original Message- From: r-sig-phylo-boun...@r-project.org on behalf of Anthony R Ives Sent: Sat 10/10/2009 9:40 AM To: Ramona Walls Cc: r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] r-squared for PGLS regression Dear Ramona, There are multiple ways that you could calculate an R2 for GLS. The best form of R2 is (in matlab code) R2=1-(e'*invV*e)/((Y-a)'*invV*(Y-a)) where e are the residuals, invV is the inverse of the covariance matrix, Y contains the data, and a is the GLS mean of Y, a=(ones(n,1)'*invV*ones(n,1))\(ones(n,1)'*invV*Y) But this is not a perfect measure and cannot be directly compared to R2 calculated from LS. This is discussed in the last paragraph of the supplement (p.546) to Lavin, S. R., W. H. Karasov, A. R. Ives, K. M. Middleton, and T. Garland, Jr. 2008. Morphometrics of the avian small intestine, compared with non-flying mammals: a phylogenetic approach. Physiological and Biochemical Zoology 81:526-550. I will send you a reprint in a separate email. I hope this helps. Cheers, Tony On Oct 9, 2009, at 2:58 PM, Ramona Walls wrote: I am using the ls function in nlme to conduct PGLS regression, with a correlation structure based on the maximum likelihood value of lambda (this seems to be the best-fitting model of evolution for my data). Unlike the lm function, ls does not return r-squared values. I suspect this may be because computing r-squared with an atypical error-variance matrix is not so straightforward. I tried to calculate r-squared myself, based on the residuals from the PGLS regression and standard formula (SS explained by regression divided by total SS), but the number I got back was much higher than I expected. I think I am using the right formula, because if I calculate r-squared from the ls regression of the same data using a regular error matrix, I get the same value as what is returned when I do a a regression using the lm function. Is there a way to calculate r- squared for PGLS regression? Do I need to use different estimates of the mean value of Y, because of the phylogenetic correction? Does it matter that my data have been log transformed? Can anyone provide me with R code to do this. Thank you! _ Ramona Walls Ph.D. Department of Ecology and Evolution Stony Brook University Stony Brook, NY 11794-5245 rwa...@life.bio.sunysb.edu ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Anthony Ragnar Ives Department of Zoology UW-Madison (608) 262-1519 [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Correcting for body size using PGLS
On Tue, 2009-07-21 at 04:44 -0400, Eli Swanson wrote: Hi all, I'm using gls for an analysis I'm working on. I'm trying to 'remove' body size from another variable. I was concerned about underestimating the allometric slope using gls so I calculated contrasts and fit an RMA regression through the origin to see if this was in fact occurring. The slope using RMA was in fact larger. (RMA: 1.98 (lower CI=1.5, upper=2.4)) (gls: 1.36 +/- 0.27 SE) Am I right in thinking that the RMA slope is most likely a better estimate of the relationship? Ideally, is there an RMA method for pgls? I think the answer is no to this question after reading the archives and sources given therein, but I'm not totally sure. My understanding is that this wouldn't even make sense because with RMA neither variable is 'causal', but in gls, one variable is in fact taken to be. There is indeed a GLS version of RMA (or Total Least Squares or functional regression or orthogonal regression - there are many names for this technique as it has been re-invented several times in different disciplines). It is called Generalised Total Least Squares. See Markovsky and Van Huffel (2007) Signal Processing 87:2283–2302 for a review of total least squares methodology, including generalised and weighted TLS. However, I would not recommend using this technique unless you have a way of measuring the measurement error variance and the equation error variance, as their ratio is assumed to be known and fixed. If you have replicate measures for each species, or at least standard errors from the literature (Ives et al. 2007), then it can be done. But if you don't have within-species replication (and especially if you assume the ratio is unity), then you have the potential to massively over-correct the regression slope. See, p. 57-60 in Measurement Error in Nonlinear Models by Carrol, Ruppert, Stefanski and Crainiceanu (2006), which is a very fine, modern introduction to measurement error problems. They are quite pessimistic about the general utility of the orthogonal regression technique. Over the past month or so, a student and I have been working on this problem in the context of plant trait functional relationships. Based on some code in the above book, we have been using Bayesian methods to fit latent trait regressions, where both latent variables are multivariate normal with a known phylogenetic variance-covariance matrix (which can differ for each trait). This works because we have replicate measurements (ranging from 2 to 13) on each species and trait, so we can estimate the error variances. The phylogenetic information enters the model as a prior on each latent trait. Is there any other way to use the RMA slope to obtain residuals for the gls analysis? If there's no way to use RMA AND a phylogenetic correction, am I better off underestimating the slope, or foregoing the phylogenetic correction? If you have no way to estimate the error variance ratio, I would not use RMA, but keep the phylogenetic information in the model by just sticking with GLS. The negative bias will probably be much less than the positive over-correction bias induced by the RMA analysis. Cheers, Simon. In this case, estimating lambda, or setting it to one both result in a much better model fit than setting it to 0, so I think that the inclusion of phylogeny is important, just not sure. If my questions are totally off base or naive, could someone point me in the right direction? I read Ives et al. 2007, and I feel like it probably provides an answer to my question, but it was honestly over my head. Thanks in advance! Cheers, Eli Swanson ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Simon Blomberg, BSc (Hons), PhD, MAppStat. Lecturer and Consultant Statistician School of Biological Sciences The University of Queensland St. Lucia Queensland 4072 Australia Room 320 Goddard Building (8) T: +61 7 3365 2506 http://www.uq.edu.au/~uqsblomb email: S.Blomberg1_at_uq.edu.au Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] writing list of trees to file
Hmm. I should try solutions before I post them. You need to make sure that each tree in the list is of class phylo too. This works: phylist-read.tree(text=A:0.21,B:0.21):0.28,C:0.49):0.13,D:0.62):0.38,E:1.00);A:0.21,B:0.21):0.28,C:0.49):0.13,D:0.62):0.38,E:1.00);) newnames-c(G,H,I,J,K) newlist - lapply(phylist, function(z) { z$tip.label - c(G,H,I,J,K) class(z) - phylo z }) class(newlist) - multiPhylo write.tree(newlist,file=newlist) On Mon, 2008-09-01 at 10:55 +1000, Simon Blomberg wrote: Try class(newlist) - multiPhylo Then use write.tree. Cheers, Simon. On Sun, 2008-08-31 at 20:39 -0400, [EMAIL PROTECTED] wrote: Hi all, I have hit an obstacle and I hope someone will know a quick fix. I want to read a list of trees, do something to those trees and then write them to a file. The list is seen as multiPhylo until I apply some function then it becomes a list that I cannot write to a file with write.tree. I put an example below, where I read in two trees, and then use a function to change the tip names and then try to write the trees to a file. Thanks in advance for your help! Stacey phylist-read.tree(text=A:0.21,B:0.21):0.28,C:0.49):0.13,D:0.62):0.38,E:1.00); A:0.21,B:0.21):0.28,C:0.49):0.13,D:0.62):0.38,E:1.00);) class(phylist) [1] multiPhylo newnames-c(G,H,I,J,K) newlist - lapply(phylist, +function(z) { +z$tip.label - c(G,H,I,J,K) +z +}) write.tree(newlist,file=newlist) Error in write.tree(newlist, file = newlist) : object phy is not of class phylo (and yes, write.tree did work on the multiphylo object before I did the function) ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Simon Blomberg, BSc (Hons), PhD, MAppStat. Lecturer and Consultant Statistician Faculty of Biological and Chemical Sciences The University of Queensland St. Lucia Queensland 4072 Australia Room 320 Goddard Building (8) T: +61 7 3365 2506 http://www.uq.edu.au/~uqsblomb email: S.Blomberg1_at_uq.edu.au Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] writing list of trees to file
Try class(newlist) - multiPhylo Then use write.tree. Cheers, Simon. On Sun, 2008-08-31 at 20:39 -0400, [EMAIL PROTECTED] wrote: Hi all, I have hit an obstacle and I hope someone will know a quick fix. I want to read a list of trees, do something to those trees and then write them to a file. The list is seen as multiPhylo until I apply some function then it becomes a list that I cannot write to a file with write.tree. I put an example below, where I read in two trees, and then use a function to change the tip names and then try to write the trees to a file. Thanks in advance for your help! Stacey phylist-read.tree(text=A:0.21,B:0.21):0.28,C:0.49):0.13,D:0.62):0.38,E:1.00); A:0.21,B:0.21):0.28,C:0.49):0.13,D:0.62):0.38,E:1.00);) class(phylist) [1] multiPhylo newnames-c(G,H,I,J,K) newlist - lapply(phylist, +function(z) { +z$tip.label - c(G,H,I,J,K) +z +}) write.tree(newlist,file=newlist) Error in write.tree(newlist, file = newlist) : object phy is not of class phylo (and yes, write.tree did work on the multiphylo object before I did the function) ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Simon Blomberg, BSc (Hons), PhD, MAppStat. Lecturer and Consultant Statistician Faculty of Biological and Chemical Sciences The University of Queensland St. Lucia Queensland 4072 Australia Room 320 Goddard Building (8) T: +61 7 3365 2506 http://www.uq.edu.au/~uqsblomb email: S.Blomberg1_at_uq.edu.au Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo