Re: [R-sig-phylo] Iterating through multiple FASTA files via rbind.DNAbin
Hi Jarrett,

This has been working for me using the package 'apex':

    x <- read.multiFASTA(files)  # creates a multiDNA object
    genes <- x@dna[]             # creates a list with your loci

I hope this helps.

Best,
Gustavo

On Thu, 12 Mar 2020 at 11:18, Jarrett Phillips <phillipsjarre...@gmail.com> wrote:

> Hi All,
>
> I have a folder with multiple FASTA files which need to be read into R.
>
> To avoid file overwriting, I use ape::rbind.DNAbin() as follows:
>
>     file.names <- list.files(path = envr$filepath, pattern = ".fas")
>     tmp <- matrix()
>     for (i in 1:length(file.names)) {
>       seqs <- read.dna(file = file.names[i], format = "fasta")
>       seqs <- rbind.DNAbin(tmp, seqs)
>     }
>
> When run, however, I get an error saying that the files do not have the same
> number of columns (i.e., the alignments are not all of the same length).
>
> How can I avoid this error? I feel that it's a basic fix, but one that is
> not immediately obvious to me.
>
> Thanks!

--
Gustavo Silva de Miranda
Peter Buck Postdoctoral Fellow - GGI
Department of Entomology
National Museum of Natural History
Smithsonian Institution

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
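For readers who prefer to stay within ape, here is a minimal sketch, assuming the goal is one alignment per file rather than a single combined matrix. The loop in the question overwrites seqs on every pass and binds each alignment to an empty matrix(), and rbind.DNAbin() needs the same number of columns in every input, so alignments of different lengths are kept as separate list elements instead. The folder and the two toy FASTA files are made up for illustration.

```r
library(ape)

# Hypothetical demo folder with two toy alignments of different lengths.
fasta_dir <- file.path(tempdir(), "fasta_demo")
dir.create(fasta_dir, showWarnings = FALSE)
writeLines(c(">sp1", "ACGT", ">sp2", "ACGA"),
           file.path(fasta_dir, "locus1.fas"))
writeLines(c(">sp1", "ACGTAC", ">sp2", "ACGTAA"),
           file.path(fasta_dir, "locus2.fas"))

# full.names = TRUE returns paths that can be opened from any working directory.
# The pattern is a regular expression, so the dot is escaped and the end anchored.
file.names <- list.files(path = fasta_dir, pattern = "\\.fas$", full.names = TRUE)

# One DNAbin object per file, stored in a named list instead of being row-bound.
seqs <- lapply(file.names, read.dna, format = "fasta")
names(seqs) <- basename(file.names)

# Number of columns (alignment length) in each file.
sapply(seqs, ncol)
```

Each element of seqs can then be handled on its own, for example with lapply(), without any requirement that the loci share a length.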
[R-sig-phylo] ape matrix manipulation
Hello,

I am working with a fairly large phylogeny (over 100 taxa). I have used drop.tip to analyze some clades independently and produce sub-trees. My question is: how can I use the information in the modified trees to drop the same taxa from the original DNA matrix? I have been doing this outside of R, but I am sure there should be a better way to do it.

Thanks,
Gustavo
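One possible approach, sketched under the assumption that the alignment is a matrix whose row names match the tip labels: index the rows by the labels that survive in the pruned tree. The helper prune_alignment() is a hypothetical name, and a plain character matrix stands in for the DNA matrix so the example runs without any package. With ape, the labels would typically come from sub_tree$tip.label after drop.tip(), and row indexing by name should behave the same way on a DNAbin matrix.

```r
# Keep only the alignment rows whose names match the tips retained in a pruned tree.
# 'aln' is any matrix with row names; 'tip_labels' is a character vector of tips.
prune_alignment <- function(aln, tip_labels) {
  missing <- setdiff(tip_labels, rownames(aln))
  if (length(missing) > 0) {
    stop("tips without a sequence: ", paste(missing, collapse = ", "))
  }
  aln[tip_labels, , drop = FALSE]
}

# Toy stand-in for a DNA matrix with four taxa and four sites.
aln <- matrix(
  c("a", "c", "g", "t",
    "a", "c", "g", "a",
    "a", "t", "g", "t",
    "c", "c", "g", "t"),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("sp1", "sp2", "sp3", "sp4"), NULL)
)

kept <- c("sp1", "sp3")  # stands in for sub_tree$tip.label
sub_aln <- prune_alignment(aln, kept)
rownames(sub_aln)
```

The explicit check for missing labels is a design choice: it fails early when the tree and the matrix use different names, instead of returning a subscript error.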
[R-sig-phylo] Fitzpatrick and Turelli 2006 - phylogenetic signal in range overlap
Hello everyone,

By any chance, has anybody tried putting together R code to perform the analyses of range overlap in a phylogenetic context proposed by Fitzpatrick and Turelli 2006? It is a nice way to incorporate measurements of pair-wise similarity into a phylogenetic context. Warren et al., 2008 have used it specifically to analyse phylogenetic patterns of niche overlap as a function of divergence time.

I would be very interested to find out more details about such code. Also, details about any other approach out there would be highly appreciated.

Thanks a lot in advance.

Gustavo Bravo
LSU Museum of Natural Science
Re: [R-sig-phylo] Fitzpatrick and Turelli 2006 - phylogenetic signal in range overlap
Rob,

Excellent! Thank you so much for the hint.

Cheers,
Gustavo

On 4/7/11 8:15 PM, Rob Lanfear wrote:

> Hi Gustavo,
>
> This is implemented in the package phyloclim, and the function is
> age.range.correlation(). Here's the description from the help file:
>
> "This function can be used to test for phylogenetic signal in patterns of
> niche overlap (Warren et al., 2008) based on the age-range correlation (ARC)
> as implemented by Turelli & Fitzpatrick (2006)."
>
> Cheers,
> Rob
>
> --
> Rob Lanfear
> Postdoc, Centre for Macroevolution and Macroecology,
> Research School of Biology,
> Australian National University
[R-sig-phylo] Phylogenetic Regression in R (Grafen method)
Hello,

I'm analysing data for 300 anuran species. I want to run an ANCOVA using the phylogenetic regression method (Grafen, 1989). Does anyone know of a function or script to do it?

Thanks,
Gustavo Paterno
Re: [R-sig-phylo] Phylogenetic Regression in R (Grafen method)
Dear all,

Thank you very much for your fast replies. I think I need to explain in more detail, sorry about that.

My data:

    FD - response variable (continuous)
    Male size (CRC) - explanatory variable (continuous)
    Habitat (HA) - explanatory variable (categorical)

My model: FD ~ CRC*HA

I'm using the Pyron (2011) phylogeny, so I don't have any polytomies in my data. I fitted the pgls model with lambda = "ML", but I would like to know if anyone knows of an implementation of the Grafen (1989) method in R.

Plus, how should I display my data in a graph? The original data do not represent the pgls analysis. However, as I understand it, pgls does not give you contrasts, is that right?

Thanks again,
Gustavo Paterno
[R-sig-phylo] Phylogenetic regression available in R (Grafen, 1989)
Hello all,

I just got confirmation that the Grafen method for phylogenetic regression (Grafen, 1989) has been implemented in R! The package "phyreg" has lots of details in its help pages.

Best wishes to all.

Gustavo Paterno
Re: [R-sig-phylo] Phylogenetic regression available in R (Grafen, 1989)
Hi Emmanuel,

Thanks for the information. I really think it might be possible to obtain very similar results with both packages (ape and phyreg). However, I'm not sure that the math is the same.

It uses branch lengths to account for recognised phylogeny (which makes the errors of more closely related species more similar), and the single contrast approach to account for unrecognised phylogeny (that a polytomy usually represents ignorance about which exact binary tree is true, and so one higher node should contribute only one degree of freedom to the test). One dimension of flexibility in the branch lengths is fitted automatically. It incorporates independent contrasts and phylogenetic generalised least squares of current theory.

If one wants to run a phylogenetic regression sensu Grafen (1989), I think it is better to use the phyreg package, since it was designed exclusively for this purpose. In addition, the regression output can be fully accessed with the function inf(your model). This function will give you information about DF lost, F-tests, residuals and other things.

This is an explanation of the degrees of freedom calculation in phyreg:

On some occasions, degrees of freedom are lost for phylogenetic reasons (Grafen 1989, section 3(e)). A whole node may be lost to the final test if the residuals of its daughter nodes are all zero in the long regression. This can happen for various reasons, most often when (i) the response is in fact binary, and so there is no variation in it below a node, or (ii) a categorical variable has so many values restricted to one part of the tree that a subset of its parameter values can adjust to render all the residuals zero in that part of the tree. That is called a node being lost in the denominator. The other possibility is that once the contrasts have been taken across each higher node, the design matrix for the model has lower rank than it did before, which is called losing a degree of freedom in the numerator (it is transferred to the denominator).

Best wishes,
Gustavo Paterno

On Mar 7, 2014, at 6:06 AM, Emmanuel Paradis <emmanuel.para...@ird.fr> wrote:

> Hi Gustavo,
>
> Grafen's method is partially implemented in ape. The function corGrafen
> defines a correlation structure according to Grafen's method (see
> ?corClasses for all corStruct defined in ape). When used with nlme::gls
> this makes it possible to estimate the parameter of the branch length
> transformation. There is no correction of the number of degrees of freedom
> in ape. So I guess that the results from ape::corGrafen and phyreg should
> be the same if the phylogeny is assumed to be known without error.
>
> Best,
> Emmanuel
>
> On 05/03/2014 14:30, Gustavo Paterno wrote:
>
>> Hello Xavier,
>>
>> Thanks for your answer. I also might be wrong, but as far as I know, ape
>> has the function compute.brlen, which can calculate branch lengths through
>> Grafen's method but cannot do phylogenetic regression sensu Grafen (1989).
>>
>> Grafen's method uses a different approach to control for phylogeny
>> (hanging on a tree, see the paper for details). PGLS uses a covariance
>> matrix to correct for phylogeny, so your degrees of freedom remain the
>> same. In Grafen's method you can lose degrees of freedom depending on your
>> species' relatedness.
>>
>> On the other hand, the brunch and crunch functions cannot deal with
>> complex models (e.g. with continuous and categorical variables together as
>> explanatory variables), while Grafen's method can deal with complex models
>> and also works if you do not know your full phylogeny (i.e. if you have
>> polytomies).
>>
>> Grafen's method was only available in GLIM and SAS until Feb 2014. The way
>> I see it, his new package for R is the only one that can really run his
>> full method, phylogenetic regression.
>>
>> Does anyone have more information about it?
>>
>> Best wishes,
>>
>> On Mar 5, 2014, at 10:04 AM, Xavier Prudent <prudentxav...@gmail.com> wrote:
>>
>>> Hi Gustavo,
>>>
>>> Thanks for that notification. I may be wrong, but wasn't that method
>>> already implemented in the caper package by David Orme (functions pgls,
>>> brunch, crunch)? If yes, then which new functionalities does phyreg
>>> bring?
>>>
>>> Regards,
>>> Xavier
>>>
>>> --
>>> Xavier Prudent
>>> Computational biology and evolutionary genomics
>>> Guest scientist at the Max-Planck-Institut für Physik komplexer Systeme (MPI-PKS)
>>> Noethnitzer Str. 38
>>> 01187 Dresden
>>> Max Planck-Institute for Molecular Cell Biology and Genetics (MPI-CBG
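The combination Emmanuel describes, ape::corGrafen used as the correlation structure of nlme::gls, might look roughly like the sketch below. Everything here is simulated: the tree comes from rcoal(), the variable names FD, CRC and HA are borrowed from the model FD ~ CRC*HA mentioned earlier on the list, and the habitat levels are invented. It assumes the rows of the data frame are in the same order as the tip labels, and it fixes rho at 1 to keep the fit simple. As noted in the thread, this does not apply Grafen's degrees-of-freedom correction.

```r
library(ape)   # corGrafen(), rcoal()
library(nlme)  # gls()

set.seed(1)
n <- 30
tree <- rcoal(n)  # simulated tree standing in for the real phylogeny

# Simulated data; rows are in the same order as tree$tip.label.
dat <- data.frame(
  CRC = rnorm(n),
  HA  = factor(rep(c("aquatic", "terrestrial"), length.out = n)),
  row.names = tree$tip.label
)
dat$FD <- 1 + 0.5 * dat$CRC + rnorm(n)

# rho is held at 1 here (fixed = TRUE); fixed = FALSE would ask gls()
# to estimate the branch length transformation parameter instead.
fit <- gls(FD ~ CRC * HA, data = dat,
           correlation = corGrafen(1, phy = tree, fixed = TRUE))

summary(fit)$tTable  # coefficient table
```

Depending on the ape version, a warning about species being taken in data-frame order may appear; it refers to the row-order assumption stated above.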
Re: [R-sig-phylo] Installing diversitree on OS X Mavericks, R 3.1.0
Dear John and list,

I have tried to use your tutorial, but without success. I am getting the following error:

    ld: library not found for -lquadmath
    clang: error: linker command failed with exit code 1 (use -v to see invocation)
    make: *** [diversitree.so] Error 1
    ERROR: compilation failed for package 'diversitree'
    * removing '/Library/Frameworks/R.framework/Versions/3.1/Resources/library/diversitree'

I am not sure if it is related to the error, but after installing for the first time, when I run

    brew install gsl gcc fftw

to check whether the installation is there, I get the following warnings:

    Warning: gsl-1.16 already installed, it's just not linked
    Warning: gcc-4.8.3 already installed, it's just not linked
    Warning: fftw-3.3.4 already installed, it's just not linked

Also, when I run

    gfortran --version

I get

    GNU Fortran (GCC) 4.8.2

I am having a really hard time installing diversitree on Mavericks. I hope there is a workaround or anything like that. Please let me know if you need any other information.

Thank you all very much in advance for any help.

Regards,
Gustavo Burin Ferreira, MSc.
Instituto de Biociências
Universidade de São Paulo
Re: [R-sig-phylo] Installing diversitree on OS X Mavericks, R 3.1.0
Dear Jonathan,

The last solution worked perfectly! Thank you very, very much!

Best,
Gustavo Burin Ferreira, MSc.
Instituto de Biociências
Universidade de São Paulo

On Tue, May 27, 2014 at 6:30 PM, Jonathan Chang <jonathan.ch...@ucla.edu> wrote:

> Actually I suppose the recommended course of action would be to just use
> the Snow Leopard version, which will probably work fine on Mavericks (and
> diversitree will have a pre-built binary from the R developers).
>
> First uninstall your current version of R with
> `sudo rm -rf /Library/Frameworks/R.framework/`
> then install this version:
> http://cran.rstudio.com/bin/macosx/R-3.1.0-snowleopard.pkg
>
> Jonathan
>
> On Tue, May 27, 2014 at 2:21 PM, Jonathan Chang <jonathan.ch...@ucla.edu> wrote:
>
>> It looks like there's an issue with your install of Homebrew, since
>> installing those packages should also link them into the proper
>> directories. Try
>>
>>     brew unlink gsl gcc fftw
>>     brew link gsl gcc fftw
>>
>> These should run without error. If you can't link those packages then it
>> means you already have a previous non-Homebrew install of this software.
>> You can force it with `brew link --overwrite gsl` etc. but this will
>> probably lead to a horrifying frankeninstall and cause further problems
>> down the road.
>>
>> The other issue you have is that the gfortran in your PATH reports
>> version 4.8.2 but your (unlinked) Homebrew copy is version 4.8.3, which
>> is the one that you actually want. Again, this indicates a previous
>> non-Homebrew install of gfortran, which will cause software installs to
>> fail in exciting and unpredictable ways.
>>
>> If you are comfortable with reinstalling all of your software from
>> scratch I would recommend deleting everything in /usr/local/ to start
>> from a blank slate.
>>
>> Jonathan
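The mismatch Jonathan points out (one gfortran on the PATH, another installed by Homebrew) can also be inspected from within R. This is only a sketch using base R on a Unix-like system such as OS X; find_all_on_path() is a made-up helper name, and the result depends entirely on the machine it runs on.

```r
# List every copy of a command found along PATH, in search order.
find_all_on_path <- function(cmd) {
  dirs <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]
  candidates <- file.path(dirs, cmd)
  candidates[file.exists(candidates)]
}

hits <- find_all_on_path("gfortran")
hits                   # more than one entry suggests competing installs
Sys.which("gfortran")  # the copy that is found first on PATH
```

An empty result simply means no gfortran is visible on the PATH of the R session, which may differ from the PATH of an interactive shell.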
[R-sig-phylo] Exponential function on make.bd.t
Dear list,

I am trying to create a likelihood function based on trees that were simulated with an exponential decay in speciation rates through time, and with constant extinction rates. I thought I could use make.bd.t from diversitree for this. However, it seems that this function doesn't accept an exponential as a valid type of function (at least as far as I could tell from digging into the code).

Can you advise me on how I should proceed to implement this? I am still trying to get used to the way diversitree functions are coded, and couldn't figure this out.

Thanks in advance for any help!

Regards,
Gustavo Burin Ferreira, MSc.
Instituto de Biociências
Universidade de São Paulo
Re: [R-sig-phylo] Exponential function on make.bd.t
Hi Richard,

Thank you very much for the information! I will do my best to do it by myself, and will let you know whether it worked or whether I need some more help.

Thanks once again for the help!

Cheers,

On May 30, 2014 5:19 AM, Rich FitzJohn <rich.fitzj...@gmail.com> wrote:

> Hi Gustavo,
>
> This used to be possible, and you could possibly use an older version of
> diversitree to do what you want to do. Things changed about a year ago, I
> think.
>
> Unfortunately, integrating time-varying functions efficiently is quite
> hard when parameters vary as arbitrary R functions. I've never been very
> happy with the hacks and trade-offs to get that working (the issue is that
> the ODE integrators are all written in C, and calling R functions from C
> incurs significant overhead over the usual R function calls, plus you get
> no vectorisation advantages).
>
> I decided to weight the speed/flexibility trade-off in favour of speed and
> settled on a simpler (and much faster) approach where there are a few
> pre-built options that match some simple cases:
>
> https://github.com/richfitz/diversitree/blob/master/src/TimeMachine.cpp
>
> If you need something extra, just add it in there (and in TimeMachine.h),
> modifying the general approach in (say) tm_fun_linear. You'll need to match
> the prototypes exactly, but you can pass in as many parameters as you need.
>
> Then add your function to the list in R/time-machine.R:
>
> https://github.com/richfitz/diversitree/blob/master/R/time-machine.R#L45-L49
>
> which will organise parameter names and things like that. Then you just
> reference the function by the name there and everything should work.
>
> Once that's done, send me a pull request on GitHub and I can incorporate
> it into the package. The functions that are currently there are just the
> first things that seemed reasonable and are obviously not a complete list.
>
> I'm not actively working on diversitree at the moment, or I'd do this for
> you. But if you get into trouble let me know and I'll see if I can get it
> going.
>
> Cheers,
> Rich
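Before porting a new rate function to the C++ side, it may help to pin down its form in plain R. The sketch below is not diversitree code: the names lambda_exp and lambda_exp_integral, the parameters l0 (initial rate) and k (decay constant), and the direction in which time is measured are all assumptions made for illustration. It writes the exponential-decay rate as lambda(t) = l0 * exp(-k * t) and checks the closed-form integral of the rate against numerical integration.

```r
# Hypothetical exponential-decay speciation rate: lambda(t) = l0 * exp(-k * t).
lambda_exp <- function(t, l0, k) l0 * exp(-k * t)

# Closed-form integral of lambda over [0, t_end]: (l0 / k) * (1 - exp(-k * t_end)).
lambda_exp_integral <- function(t_end, l0, k) l0 / k * (1 - exp(-k * t_end))

# Arbitrary illustrative values.
l0 <- 0.5
k <- 0.1
t_end <- 10

numeric_int <- integrate(lambda_exp, lower = 0, upper = t_end, l0 = l0, k = k)$value
closed_int <- lambda_exp_integral(t_end, l0, k)

all.equal(numeric_int, closed_int)  # do the two agree within tolerance?
```

A check of this kind gives a reference value to compare against once the same function has been written in TimeMachine.cpp.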
Re: [R-sig-phylo] dist.nodes crashing with big trees
Hey David and Nick,

Thanks a lot for the quick responses! I think I wasn't very clear in the first e-mail. What I get is actually an error from within dist.nodes, not when calling it.

I've tried to use chainsaw2 and in the beginning it appeared to be working quite well. However, after some running time, I get the same (original) error that motivated me to write to the list:

    Error in double(nm * nm) : vector size cannot be NA
    In addition: Warning message:
    In nm * nm : NAs produced by integer overflow

Digging into the functions called within chainsaw2, I found that at some point it uses the function get_max_height_tree, which calls dist.nodes, and that's where I think the problem lies. The error I got now is almost exactly the same as the one I got from timeSliceTree (because both cases use dist.nodes):

    dist.nodes
    function (x)
    {
        x <- reorder(x)
        n <- Ntip(x)
        m <- x$Nnode
        nm <- n + m
        d <- .C(dist_nodes, as.integer(n), as.integer(m),
            as.integer(x$edge[, 1] - 1L), as.integer(x$edge[, 2] - 1L),
            as.double(x$edge.length), as.integer(Nedge(x)),
            double(nm * nm), NAOK = TRUE)[[7]]
        dim(d) <- c(nm, nm)
        dimnames(d) <- list(1:nm, 1:nm)
        d
    }

I tried changing the highlighted part, double(nm * nm), to something like double(as.numeric(nm) * as.numeric(nm)), and when I try running it, I get the error I wrote in the first e-mail:

    Error in dist.nodes(tree) (from #7) :
      long vectors (argument 7) are not supported in .Fortran

Thus, I think that solving this problem might require some tweak in the C/Fortran code that is called within dist.nodes (from ape), but I have no expertise in that. So if someone can help me with that, I'll appreciate it!

Thanks again for the help so far!

Best,
Gustavo Burin Ferreira, MSc.
Instituto de Biociências
Universidade de São Paulo

On Fri, Oct 16, 2015 at 5:06 PM, Nick Matzke <mat...@nimbios.org> wrote:

> Hi! I re-did chainsaw at some point, now there is chainsaw2. However,
> googling that gets you horror movies, so here is a link with example code:
>
> https://groups.google.com/d/msg/biogeobears/Jy9uYckOL7s/XuNZ0B3jAwAJ
>
> (the discussion there points out a rare case where this crashes, but for
> most trees it should work fine)
>
> Cheers, Nick
>
> On Fri, Oct 16, 2015 at 2:17 PM, David Bapst <dwba...@gmail.com> wrote:
>
>> Hi Gustavo,
>>
>> I'm paleotree's author and maintainer. Just to be clear that I
>> understand your problem, I believe you are saying that when you use
>> timeSliceTree, you are getting an error that the internal call to
>> dist.nodes is failing? Is that right?
>>
>> The first thought I have is that maybe the solution here is to avoid
>> dist.nodes, as it is somewhat overkill. I use dist.nodes in that code,
>> which I wrote in 2011, to get the distance of tips and nodes from the
>> root. A better solution may now exist in another R package. I'd have
>> to investigate (although maybe someone on the list can suggest one).
>>
>> The second thought I have is that there might be alternative functions
>> that do something like timeSliceTree in another R package. Off the top
>> of my head, I recall that Nick Matzke had a similar 'chainsaw'
>> function, which you can find here and appears not to call dist.nodes:
>>
>> https://stat.ethz.ch/pipermail/r-sig-phylo/2011-July/001483.html
>>
>> Again, maybe someone on the list knows of a good alternative function.
>>
>> I'll try to give this more thought, but for now, maybe see if you can
>> use Nick's function successfully. Overall though, I've discovered the
>> use of truly gigantic trees can often run into unexpected problems.
>>
>> Cheers,
>> -Dave
>>
>> On Fri, Oct 16, 2015 at 12:47 PM, Gustavo Burin Ferreira
>> <ariete...@gmail.com> wrote:
>>
>>> Dear list,
>>>
>>> I'm trying to perform a time travel in simulated phylogenies with both
>>> extant and extinct species using the timeSliceTree function from the
>>> paleotree package. My aim is to have the molecular phylogenies derived
>>> from the complete phylogeny (attached) at different points in time.
>>>
>>> However, when I try that with big trees (bigger than 2 tips total), I
>>> get an error of integer overflow coming from the dist.nodes function.
>>> After slightly tweaking the dist.nodes function (changing nm from
>>> integer to numeric/double), I get the following message:
>>>
>>> Error in dist.nodes(tre
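To illustrate the overflow and Dave's point that root-to-node distances do not need the full pairwise matrix, here is a small base-R sketch. R integers are 32-bit, so nm * nm becomes NA once nm exceeds 46340, and even as a double the matrix would need nm^2 cells. The function root_depths() is a hypothetical helper, not part of ape or paleotree. It assumes ape's convention that tips are numbered 1..n and the root is node n + 1, and it walks the edge matrix level by level from the root. Recent versions of ape also provide node.depth.edgelength(), which may serve the same purpose.

```r
# Integer arithmetic overflows once (tips + nodes)^2 exceeds .Machine$integer.max.
nm <- 46341L
suppressWarnings(nm * nm)        # NA: the product does not fit in a 32-bit integer
as.numeric(nm) * as.numeric(nm)  # representable as a double, but a matrix that size is huge

# Distance from the root to every tip and node, using only the edge matrix.
# Memory use grows linearly with the number of nodes instead of quadratically.
root_depths <- function(edge, edge_length, n_tips) {
  depth <- numeric(max(edge))  # the root (node n_tips + 1) stays at 0
  frontier <- n_tips + 1L
  while (length(frontier) > 0) {
    rows <- which(edge[, 1] %in% frontier)  # edges leaving the current level
    if (length(rows) == 0) break
    depth[edge[rows, 2]] <- depth[edge[rows, 1]] + edge_length[rows]
    frontier <- edge[rows, 2]
  }
  depth
}

# Toy tree ((t1:1,t2:1):1,t3:2); tips are 1 to 3, the root is 4, the inner node is 5.
edge <- rbind(c(4, 5), c(5, 1), c(5, 2), c(4, 3))
edge_length <- c(1, 1, 1, 2)
depths <- root_depths(edge, edge_length, n_tips = 3)
depths
```

For a phylo object the call would presumably be root_depths(tree$edge, tree$edge.length, Ntip(tree)). The level-by-level loop favours clarity over speed, so very deep trees may run slowly.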