Re: [R-sig-phylo] Branch and Bound Maximum Parsimony
Matthew-

In addition to Ross's suggestion of TNT, I would recommend that you consider options less commonly used by paleontologists. One option is MrBayes, which can apply the Mk2 model from Lewis (2001) to morphological data. Another is StrataPhy (Marcot and Fox, 2008), which can apply stratocladistics (if you are willing to work with the assumptions of that method; I realize some people have strong opinions against it). Note that both these methods require that autapomorphies are included.

-Dave Bapst, UChicago

On Mon, Mar 28, 2011 at 12:34 AM, Matthew Vavrek matt...@matthewvavrek.com wrote:

> Hello,
>
> Although I know a lot of people have moved away from it, I was wondering if there is an implementation of a tree search function using maximum parsimony with a branch and bound option for R. I work in a lab that has been using PAUP forever, but I'm trying to see if R would be a viable alternative (especially since PAUP seems to be in some sort of limbo, from what I can gather on the intertubes). However, maximum parsimony is typically what is used, and usually the datasets that were being used could be run in a relatively short (i.e., 15 minutes) time span in PAUP using a branch-and-bound search, thereby locating all the most parsimonious trees.
>
> I've tried the phangorn package, which has a parsimony ratchet function that works really well, but within paleontology (my field) there are still a good number of people who want an exhaustive search.
>
> Alternatively, if there is no exhaustive search, what would be the best tree search function for morphological (character) data? Most of the discussions I can find revolve more around DNA/molecular data, which are hard to come by in fossils (unless you are Michael Crichton).
>
> Thanks
> Matthew Vavrek

--
David Bapst
Dept of Geophysical Sciences
University of Chicago
5734 S. Ellis
Chicago, IL 60637
http://home.uchicago.edu/~dwbapst/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
[R-sig-phylo] number of rate categories using fitDiscrete
Dear all,

I have been trying to get estimates of transition rates among character states using fitDiscrete in geiger, but it returns the following error message:

fitDiscrete(tree, h1, model="ARD")
Warning: some tree transformations in GEIGER might not be sensible for nonultrametric trees.
Finding the maximum likelihood solution
[0 50 100]
[]
Error in getQ(exp(out$par), nRateCats, model) : You must supply the correct number of rate categories.

I have traits with either 3 or 4 character states, and I have also tried different trees and always had the same result when attempting to fit an 'all-rates-differ' model. Any tips on what may be going on and how to fix it?

Thanks a lot in advance,
ivan

Ivan Gomez-Mestre
Universidad de Oviedo
Spain
Re: [R-sig-phylo] Branch and Bound Maximum Parsimony (Matthew Vavrek)
Matthew.

The only thing that I would add is that if you *really* want to do an exhaustive search in R, and your species number is small enough to permit an exhaustive search (i.e., <=10), then it is straightforward enough to do so:

require(phangorn)
data <- read.phyDat(file=filename,type="USER") # for binary data
all.trees <- allTrees(n=length(data),tip.label=names(data),rooted=FALSE)
pscores <- vector()
for(i in 1:length(all.trees))
  pscores[i] <- parsimony(all.trees[[i]],data)
minscore <- min(pscores); mp.tree <- all.trees[pscores==minscore]

mp.tree will be a single MP tree, or a list if there are several MP trees.

Of course this will take a long time for more than 7 or 8 species, and will not work at all for more than 10 species. It's also possible that an exhaustive search in a traditional phylogeny inference program like PAUP* might be faster if, for instance, PAUP* retains the score for parts of the tree that don't change among a set of trees, instead of recalculating it each time anew. That I don't know.

- Liam

--
Liam J. Revell
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://phytools.blogspot.com

On 3/28/2011 10:58 AM, Ross Mounce wrote:

> Dear Matthew,
>
> Branch and Bound searching is often unnecessary; are you sure you need to do this? With enough random addition sequences (relative to dataset size) you *will* find all MPTs.
>
> PAUP*, despite being familiar to many of us and very fully-featured, is definitely ungainly for modern analyses with its lack of parallelisation.
>
> Have you considered using TNT? http://www.cladistics.com/aboutTNT.html
> Wiki Manual here: http://tnt.insectmuseum.org/index.php/Main_Page
>
> Many paleontologists are starting to use this; computationally it's far more efficient.
>
> Just remember, as with any and all phylogenetic analyses, it's garbage in, garbage out: you must know what you're doing and why *before* you start your analysis.
>
> Kind Regards and good luck,
> Ross Mounce
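As a rough check on the 10-species limit mentioned above: the number of unrooted, fully bifurcating trees for n labelled tips is (2n-5)!!, and an exhaustive search has to build and score every one of them. A minimal base-R sketch of that count follows; the helper name n_unrooted_trees is made up for illustration and is not part of phangorn or ape.

```r
# Number of unrooted, fully bifurcating trees for n labelled tips: (2n - 5)!!
# n_unrooted_trees is a hypothetical helper name, not a phangorn/ape function.
n_unrooted_trees <- function(n) {
  stopifnot(n >= 3)
  # Double factorial: product of the odd numbers 1, 3, ..., 2n - 5
  prod(seq(1, 2 * n - 5, by = 2))
}

# Tree counts for 3 to 12 tips
counts <- sapply(3:12, n_unrooted_trees)
names(counts) <- 3:12
print(counts)
```

For 8 tips this is 10,395 trees, for 10 tips 2,027,025, and for 11 tips 34,459,425, which fits the observation that the loop becomes slow beyond 7 or 8 species.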
Re: [R-sig-phylo] Branch and Bound Maximum Parsimony (Matthew Vavrek)
Thanks Liam,

I have two small additions: the first one speeds things up but costs memory, and the second one just simplifies things a bit (and is maybe also a bit faster).

Regards,
Klaus

On 3/28/11, Liam J. Revell liam.rev...@umb.edu wrote:

> require(phangorn)
> data <- read.phyDat(file=filename,type="USER") # for binary data
> all.trees <- allTrees(n=length(data),tip.label=names(data),rooted=FALSE)

all.trees = .uncompressTipLabel(all.trees) # I know this is ugly

> pscores <- vector()
> for(i in 1:length(all.trees))
>   pscores[i] <- parsimony(all.trees[[i]],data)

# forget the last 3 lines, just use
pscores <- parsimony(all.trees,data) # parsimony() accepts multiPhylo objects

> minscore <- min(pscores); mp.tree <- all.trees[pscores==minscore]

--
Klaus Schliep
Université Paris 6 (Pierre et Marie Curie)
9, Quai Saint-Bernard, 75005 Paris
Re: [R-sig-phylo] Branch and Bound Maximum Parsimony (Matthew Vavrek)
Regarding finding all most parsimonious trees by branch-and-bound:

Program Penny in my PHYLIP package could be called from within R, driven by batch scripts. You'd have to make them yourself, but it's not hard.

However, Penny can handle only 0/1 characters, and if there is a multifurcation it will return all the binary trees compatible with that. It is also much slower than PAUP*. It would get you a file of Newick trees.

Joe

Joe Felsenstein j...@gs.washington.edu
Department of Genome Sciences and Department of Biology,
University of Washington, Box 355065, Seattle, WA 98195-5065 USA
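One way the batch-script idea above might look from the R side is sketched below. Several things here are assumptions rather than anything stated in the thread: that the executable is on the PATH under the name "penny", that it reads a file called "infile" from the working directory and writes "outtree", and that a single "Y" on standard input accepts the default menu settings. The function names write_penny_infile and run_penny are made up for illustration.

```r
# Sketch only: drive PHYLIP's penny from R.
# Assumptions: executable named "penny" on the PATH, input read from "infile",
# trees written to "outtree", and "Y" on stdin accepting the default menu.

# Write a 0/1 character matrix (rows = taxa) in PHYLIP's sequential format.
write_penny_infile <- function(mat, file = "infile") {
  stopifnot(all(mat %in% c(0, 1)), !is.null(rownames(mat)))
  # PHYLIP taxon names occupy exactly 10 characters: truncate, then left-justify.
  taxa <- formatC(substr(rownames(mat), 1, 10), width = -10)
  rows <- paste0(taxa, apply(mat, 1, paste, collapse = ""))
  # Header line: number of taxa, number of characters.
  writeLines(c(paste(nrow(mat), ncol(mat)), rows), file)
  invisible(file)
}

run_penny <- function(mat, exe = "penny") {
  # Work in a fresh directory so no stale outfile/outtree triggers a prompt.
  workdir <- tempfile("penny_")
  dir.create(workdir)
  old <- setwd(workdir)
  on.exit(setwd(old))
  write_penny_infile(mat)
  if (!nzchar(Sys.which(exe))) {
    message("penny not found on PATH; only the infile was written to ", workdir)
    return(invisible(NULL))
  }
  system2(exe, input = "Y", stdout = FALSE)
  readLines("outtree")  # raw Newick text, one or more trees
}
```

The Newick lines returned by run_penny() could then be parsed with whichever tree reader you already use. The menu handling in particular should be checked against your PHYLIP version before relying on this.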