Re: [R-sig-phylo] read.nexus error message
Hi Simon,

I have tried with the first 2500 species and it worked fine. Are you sure you have the latest version of ape? The file output by this server is a NEXUS file, so use read.nexus, not read.tree.

Best,

Emmanuel

Simon Ducatez wrote on 07/12/2012 01:06:
> Dear all,
>
> I am having trouble importing a NEXUS file into R. The file contains 100 different trees (containing over 2000 species) from a pseudo-posterior distribution (downloaded from the website http://birdtree.org/), and I get the same error message whether I use read.nexus or read.tree:
>
>   Error in if (tp[3] != "") obj$node.label <- tp[3] :
>     missing value where TRUE/FALSE needed
>
> The same error has been mentioned before on the web, but I couldn't find any solution, and I do not understand the message. The trees seem OK in Mesquite. Note that when I use the same website to extract 100 trees for a sample of only 50 species, the import works without problems.
>
> Any help would be greatly appreciated! Thank you very much in advance.
>
> Best,
> Simon

--
Emmanuel Paradis
IRD, Jakarta
Visiting Professor, Agricultural University of Bogor
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
[R-sig-phylo] Null models and weighted abundance in picante MPD
Hi,

I'm using picante to calculate the MPD and MNTD of samples based on a bacterial 16S OTU phylogeny and a community data matrix. I have OTUs clustered at 97% and 99% identity, and the community matrix contains the total number of sequences for each OTU in each sample. I want to see how the diversity changes within the samples.

I was wondering if the abundance-weighted option in picante is at all applicable when calculating MPD/MNTD for this type of data. I see from the picante manual that using abundances changes the interpretation to the mean phylogenetic distance among individuals, but I don't understand exactly how the abundances are used for that. The underlying difference between sequences within an OTU cluster is hidden from picante, so it feels to me that weighting by abundance will not give meaningful results in this case. If someone could clarify this I would really appreciate it.

Another thing I was wondering about is the different methods for creating the null community (randomization methods). The taxa.labels method randomizes the distance matrix, whereas all other methods randomize the community matrix in some way. I've been trying out the different methods and I get almost identical results with the taxa.labels, sample.pool, phylogeny.pool and richness methods, but overall higher values with the frequency, independentswap and trialswap methods (this goes for both the mpd.obs.z and mpd.obs.p values). However, the general pattern across samples is the same for all methods.

I find it difficult to distinguish between the different methods that randomize the community matrix and to know which one I should choose. It's somewhat comforting that the pattern across the samples doesn't change with the methods, but the values themselves change substantially. Any help at all on this, or pointers to further information, would be great.

Thanks!

Cheers,
John
Re: [R-sig-phylo] Null models and weighted abundance in picante MPD
Hi John,

> I was wondering if the abundance weighted option in picante is at all applicable when calculating MPD/MNTD for this type of data. I see from the picante manual that using abundances changes the interpretation to the mean phylogenetic distances among individuals, but I don't understand exactly how the abundances are used for that. The underlying difference between sequences within an OTU cluster is hidden from picante so it feels to me that weighting by abundance will not give meaningful results in this case. If someone could clarify this I would really appreciate it.

The interpretation of the abundance-weighted measures is that they are the expected phylogenetic distance among two randomly sampled individuals from a community, versus two randomly sampled species or OTUs from the community for the non-abundance-weighted metrics. For your data set it sounds like the interpretation of the abundance-weighted measures would be that if you drew two random sequences from a sample, what's the expected phylogenetic similarity of those sequences? If OTUs differ in abundance (they almost certainly do), that's a very different question from asking about drawing two randomly selected OTUs from the community and comparing their relatedness.

Picante assumes there is no variation within OTUs/species when calculating abundance-weighted MPD/MNTD (within-species phylogenetic distances are counted as zero distance). If you are using an OTU definition that encompasses a lot of intra-OTU variation this might not be a good assumption.

> I find it difficult to distinguish between the different methods that randomize the community matrix and to know which one I should choose. It's somewhat comforting that the pattern across the samples doesn't change with the methods, but the values themselves change substantially. Any help at all on this, or pointers to further information would be great.

There is a huge literature on using and choosing null models in ecology. Here are two papers that discuss null model choice in the context of studies of phylogenetic relatedness, with citations to more general papers on the subject:

Hardy, O. J. (2008). Testing the spatial phylogenetic structure of local communities: statistical performances of different null models and test statistics on a locally neutral community. Journal of Ecology, 96: 914–926.

Kembel, S. W. (2009). Disentangling niche and neutral influences on community assembly: assessing the performance of community phylogenetic structure tests. Ecology Letters, 12(9), 949–60.

-Steve

--
Steven W. Kembel
Professeur régulier
Département des sciences biologiques
Université du Québec à Montréal
kembel.steve...@uqam.ca
+1 (514) 987-3000 poste 5855
http://www.phylodiversity.net/skembel/
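A minimal base-R sketch of the distinction Steve describes, using made-up distances and sequence counts for three hypothetical OTUs. It assumes that abundance weighting amounts to weighting each OTU pair by the product of the two abundances, with same-OTU pairs contributing zero distance. This illustrates the idea only and is not picante's code, so it is worth checking against picante's mpd() before relying on it.

```r
# Toy pairwise phylogenetic distances among three OTUs (made-up values).
otus <- c("A", "B", "C")
dis <- matrix(c(0, 2, 6,
                2, 0, 6,
                6, 6, 0),
              nrow = 3, byrow = TRUE, dimnames = list(otus, otus))

# Sequence counts per OTU in one sample (made-up values).
abund <- c(A = 8, B = 1, C = 1)

# Unweighted MPD: mean distance over distinct OTU pairs.
mpd_unweighted <- mean(dis[lower.tri(dis)])

# Abundance-weighted MPD (assumed form): pair (i, j) gets weight
# abund[i] * abund[j]; the diagonal holds same-OTU pairs at distance zero.
w <- outer(abund, abund)
mpd_weighted <- sum(dis * w) / sum(w)

print(c(unweighted = mpd_unweighted, weighted = mpd_weighted))
```

With these toy numbers the unweighted value is about 4.67 while the weighted value is 1.4, because most random pairs of sequences come from the dominant OTU A and are counted as zero distance. That is the assumption Steve flags as questionable when an OTU hides a lot of internal variation.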
Re: [R-sig-phylo] Null models and weighted abundance in picante MPD
Hi Steve,

Thank you for the reply. The level of OTU clustering is of course rather arbitrary when it comes to bacteria, and it sounds like the abundance-weighted MPD/MNTD is not a good idea in my case. I guess that one option would be to use phylogenies with all sequences, not just the type sequences of OTU clusters. Since I have two cutoffs, I'm thinking it would be good to look further at OTU counts and diversity at these different levels to get a quick sense of the variation within OTUs in different samples.

I'll read up on those references as well. Again, thanks for your answer.

/john
[R-sig-phylo] issue with cophenetic.phylo()
Hi Everyone,

I have been using the cophenetic.phylo() function and I have been having some issues with pairs of tips having distances that are different from what I would expect. The following code illustrates the issue (I have set the seed in order to reproduce these results). I generate a tree with 10 tips. There should be 9 unique non-zero distances in the matrix returned by cophenetic.phylo(). Instead there are 10, because 1.22313194 appears twice:

    library(ape)                    # provides rcoal() and cophenetic.phylo()
    set.seed(1723)
    tree=rcoal(10)                  # random coalescent tree with 10 tips
    x=cophenetic.phylo(tree)        # matrix of pairwise tip distances
    y=sort(unique(as.vector(x)))

    y:
    [1] 0.00000000 0.04371057 0.16334724 0.22292128 0.31957449 0.33656861
    [7] 0.35227827 0.84410552 1.22313194 1.22313194 2.14497976

As an example, t9 should have the same distance to t8 and t2.

    x["t9",]:
           t1        t3        t6        t9        t7        t2        t8        t5
    0.3522783 0.3365686 0.3365686 0.0000000 1.2231319 1.2231319 1.2231319 2.1449798
          t10        t4
    2.1449798 2.1449798

But they are different:

    x["t9","t8"]==x["t9","t2"]
    [1] FALSE

They are off by a very small amount:

    x["t9","t8"]-x["t9","t2"]
    [1] -2.220446e-16

I am posting this in case it is of interest to others. As a quick fix, I am now rounding x.

Thanks,
Kelly
Re: [R-sig-phylo] issue with cophenetic.phylo()
Hi Kelly,

where is the problem? This is what happens when you work with floating-point numbers. There is a nice link on the topic on the developer.r-project site:

    www.validlab.com/goldberg/paper.pdf

On my machine the difference is actually the same as

    .Machine$double.eps
    [1] 2.220446e-16

which is the smallest positive floating-point number ‘x’ such that ‘1 + x != 1’.

Regards,
Klaus

--
Klaus Schliep
Phylogenomics Lab at the University of Vigo, Spain
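A small base-R sketch of the point Klaus makes. The two values are illustrative stand-ins for the two distances in Kelly's matrix; the second is built by adding .Machine$double.eps to the first, which is an assumption about how they differ and is not recomputed from the tree. It shows a tolerance-based comparison with all.equal() as an alternative to ==, and rounding before unique() along the lines of Kelly's quick fix.

```r
# Two distances that should be equal but differ by one floating-point step.
d1 <- 1.22313194
d2 <- d1 + .Machine$double.eps

# Exact comparison treats them as different values.
exact_equal <- d1 == d2

# all.equal() compares within a small tolerance (about 1.5e-8 by default);
# wrap it in isTRUE() because it returns a message when values differ.
near_equal <- isTRUE(all.equal(d1, d2))

# Counting distinct distances: round first so tiny differences collapse.
n_unique_raw     <- length(unique(c(d1, d2)))
n_unique_rounded <- length(unique(round(c(d1, d2), 8)))

print(c(exact = exact_equal, near = near_equal))
print(c(raw = n_unique_raw, rounded = n_unique_rounded))
```

The number of digits to round to is a judgment call: it should be coarse enough to absorb accumulated rounding error but finer than the smallest real difference between branch lengths.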
[R-sig-phylo] Analysis with Multiple cores on Mac Workstation
Dear all,

I want to create a consensus tree with branch lengths (using Brian O'Meara's function from the post "[R-sig-phylo] Why no branch lengths on consensus trees?") on a Mac workstation. However, if I simply call the function, R does not use all cores to run the analysis. I would like to know if there is a function, or any other way, to divide the analysis across the cores, or to use all cores for running the program.

I know this is not the best forum for a question about using multiple cores, but I imagined that someone might have a solution, as some of you work with large databases.

Best,
José Hidasi

This is Brian O'Meara's function, which I am using to create the consensus tree, from the post "[R-sig-phylo] Why no branch lengths on consensus trees?":

> I have a function to create a consensus tree with branch lengths. You feed it a given topology (often a consensus topology, made with ape), then a list of trees, and tell it what you want the branch lengths to represent. It could be the proportion of input trees with that edge (good for summarizing bootstrap or Bayes proportions) or the mean, median, or sd of branch lengths for those trees that have that edge. Consensus branch lengths in units of proportion of matching trees has obvious utility. As Daniel says, the average branch lengths across a set of trees is more difficult to see a use case for, but you could imagine doing something like taking the ratogram output from r8s on a set of trees and summarizing the rate average and rate sd on a given, best, tree as two sets of branch lengths on that tree.
>
> I've put the function source at https://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/R/consensusBrlen.R?revision=110&root=omearalab . You can source the file for the function (consensusBrlen()) and other functions it needs. It also uses phylobase. Note that this is alpha-quality code -- it's been checked a bit, but verify it's doing what you want.
>
> Here's an example of how to use it:
>
>     library(ape)
>     library(phylobase)
>     # consensusBrlen() must already have been sourced from the file above
>     phy.a <- rcoal(15)
>     phy.b <- phy.a
>     # same topology as phy.a, with slightly perturbed branch lengths
>     phy.b$edge.length <- phy.b$edge.length + runif(length(phy.b$edge.length), 0, 0.1)
>     phy.c <- rcoal(15)
>     phy.list <- list(phy.a, phy.b, phy.c)
>     # mean branch length per edge of phy.a across the trees that share it
>     phy.consensus <- consensusBrlen(phy.a, list(phy.a, phy.b, phy.c), type="mean_brlen")

--
José Hidasi Neto
Graduated in Biological Sciences - Universidade Federal de Goiás (UFG)
Master's candidate in Ecology and Evolution - Community Ecology and Functioning Lab - UFG
Lattes: http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4293841A0
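A hedged sketch of one common way to spread independent tasks across cores, using the parallel package that ships with R. It does not parallelise consensusBrlen() itself: summarise_one() and the list of numeric vectors are hypothetical stand-ins for a per-tree computation, and whether the consensus calculation can be split up this way is an assumption that would need checking against the function's source.

```r
library(parallel)

# mclapply() forks the R process, which works on macOS and Linux; on Windows
# it only runs with a single core, so fall back to 1 there.
n_cores <- detectCores()
if (is.na(n_cores) || .Platform$OS.type == "windows") n_cores <- 1L
n_cores <- min(n_cores, 4L)  # arbitrary cap for this example

# Hypothetical stand-in data: one vector of 28 branch lengths per tree.
set.seed(42)
branch_lengths <- replicate(100, runif(28), simplify = FALSE)

# Hypothetical per-tree task; each call is independent of the others.
summarise_one <- function(lengths) mean(lengths)

# Serial version, for reference.
serial <- lapply(branch_lengths, summarise_one)

# Parallel version: list elements are distributed over n_cores forked workers.
parallel_res <- mclapply(branch_lengths, summarise_one, mc.cores = n_cores)

print(identical(serial, parallel_res))
```

This pattern only helps when the work can be cut into pieces that do not depend on each other, for example one piece per input tree. A single call to a function written as one serial loop will still run on one core unless the function is restructured.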
[R-sig-phylo] interpreting phylogenetic signal
I calculated Blomberg's K using 'Kcalc' in 'picante' for a continuous trait and a phylogeny of 65 species. The K value is 1.079, and when I do the randomization test I get the results below. I'm a bit confused by such a large PIC.variance.rnd.mean = 8.031. Should I be drawing conclusions from P = 0.0009 instead, i.e. rejecting the null hypothesis of no phylogenetic signal?

#        K PIC.variance.obs PIC.variance.rnd.mean PIC.variance.P PIC.variance.Z
# 1.079533         1.274221              8.031844    0.000999001      -3.351957
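A sketch of the arithmetic behind a randomization test of this kind, in base R. It assumes, without confirmation from this thread, that the P value is the rank of the observed PIC variance among the observed plus randomized values divided by the number of randomizations plus one, and that Z is the observed value standardised by the mean and standard deviation of the randomized values. The null values are simulated stand-ins, not real PIC variances.

```r
set.seed(1)
reps <- 1000            # assumed number of randomizations
obs  <- 1.274221        # PIC.variance.obs reported in the post

# Made-up null values standing in for PIC variances of tip-shuffled data.
null_vals <- rlnorm(reps, meanlog = 2, sdlog = 0.25)

# One-tailed P value: rank of the observed value among all values.
# A low rank means the observed variance is unusually small.
p_value <- rank(c(obs, null_vals))[1] / (reps + 1)

# Z score: distance of the observed value from the null mean, in null SDs.
z_score <- (obs - mean(null_vals)) / sd(null_vals)

print(c(P = p_value, Z = z_score))
```

Under these assumptions, an observed variance smaller than every randomized value gives P = 1/1001 (about 0.000999, which would match the post if 1000 randomizations were used) and a negative Z. In the randomization test associated with Blomberg's K, a low observed PIC variance relative to the tip-shuffled values is what phylogenetic signal looks like, so a null mean well above the observed value is consistent with the small P rather than in conflict with it.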