Re: [R-sig-phylo] Branch and Bound Maximum Parsimony

2011-03-28 Thread David Bapst
Matthew-
In addition to Ross's suggestion of TNT, I would recommend that you consider
options less commonly used by paleontologists. One option is MrBayes,
which can apply the Mk2 model from Lewis (2001) to morphological data.
Another is StrataPhy (Marcot and Fox, 2008), which can apply stratocladistics
(if you are willing to work with the assumptions of that method; I realize
some people have strong opinions against it). Note that both these methods
require that autapomorphies are included.
-Dave Bapst, UChicago

On Mon, Mar 28, 2011 at 12:34 AM, Matthew Vavrek
matt...@matthewvavrek.com wrote:

 Hello,
 although I know a lot of people have moved away from it, I was wondering if
 there is an implementation of a tree search function using maximum parsimony
 with a branch and bound option for R. I work in a lab that has been using
 PAUP forever, but I'm trying to see if R would be a viable alternative
 (especially since PAUP seems to be in some sort of limbo, from what I can
 gather on the intertubes). However, maximum parsimony is typically what is
 used, and usually the datasets that were being used could be run in a
 relatively short (i.e., 15 minutes) time span in PAUP using a branch-and-bound
 search, thereby locating all the most parsimonious trees. I've tried the
 phangorn package which has a parsimony ratchet function that works really
 well, but within paleontology (my field) there are still a good number of
 people that want an exhaustive search. Alternatively, if there is no
 exhaustive search, what would be the best tree search function for
 morphological (character) data? Most of the discussions I can find revolve
 more around DNA/molecular data, which are hard to come by in fossils (unless
 you are Michael Crichton).

 Thanks
 Matthew Vavrek

 ___
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo




-- 
David Bapst
Dept of Geophysical Sciences
University of Chicago
5734 S. Ellis
Chicago, IL 60637
http://home.uchicago.edu/~dwbapst/



[R-sig-phylo] number of rate categories using fitDiscrete

2011-03-28 Thread Ivan
Dear all,

I have been trying to get estimates of transition rates among  
character states using fitDiscrete in geiger but it returns the  
following error message:

 > fitDiscrete(tree, h1, model="ARD")
Warning: some tree transformations in GEIGER might not be sensible  
for nonultrametric trees.
Finding the maximum likelihood solution
[050  100]
[]
Error in getQ(exp(out$par), nRateCats, model) :
   You must supply the correct number of rate categories.
 

I have traits with either 3 or 4 character states, and I have also  
tried different trees and always had the same result when attempted  
to fit an 'all-rates-differ' model. Any tips on what may be going on  
and how to fix it?
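
For reference, an all-rates-different (ARD) model on k states has one free rate per ordered pair of distinct states, so k(k-1) rates: 6 for 3 states and 12 for 4. The sketch below only illustrates that count in base R; `ard_index` is a hypothetical helper, not a geiger function, and it does not diagnose the error above.

```r
# Index matrix for an ARD model on k states: every off-diagonal cell
# gets its own rate index, the diagonal is left at 0.
ard_index <- function(k) {
  stopifnot(k >= 2)
  idx <- matrix(0L, k, k)
  idx[row(idx) != col(idx)] <- seq_len(k * (k - 1))
  idx
}

ard_index(3)        # 3 x 3 matrix with indices 1..6 off the diagonal
max(ard_index(4))   # number of free rates for 4 states
```

Comparing this expected count with the number of distinct states actually present in the trait vector (for example via `length(unique(h1))`) is one simple sanity check.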

Thanks a lot in advance,

ivan

Ivan Gomez-Mestre
Universidad de Oviedo
Spain 


Re: [R-sig-phylo] Branch and Bound Maximum Parsimony (Matthew Vavrek)

2011-03-28 Thread Liam J. Revell

Matthew.

The only thing that I would add is that if you *really* want to do an 
exhaustive search in R, and your species number is small enough to 
permit an exhaustive search (i.e., <= 10), then it is straightforward 
enough to do so:


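 require(phangorn)
 # for binary data; type="USER" needs the state labels passed as levels
 data <- read.phyDat(file="filename", type="USER", levels=c("0","1"))
 # enumerate every unrooted topology for these tips
 all.trees <- allTrees(n=length(data), tip.label=names(data), rooted=FALSE)
 # parsimony score of each tree
 pscores <- vector()
 for(i in 1:length(all.trees))
   pscores[i] <- parsimony(all.trees[[i]], data)
 # keep the tree(s) with the minimum score
 minscore <- min(pscores); mp.tree <- all.trees[pscores==minscore]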

mp.tree will be a single MP tree or a list if there are several MP trees.

Of course this will take a long time for more than 7 or 8 species, and 
will not work at all for more than 10 species.  It's also possible that 
an exhaustive search in a traditional phylogeny inference program like 
PAUP* might be faster if, for instance, PAUP* retains the score for 
parts of the tree that don't change among a set of trees - instead of 
recalculating it each time anew.  That I don't know.
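
To make the scaling concrete: the number of unrooted, strictly bifurcating topologies for n labelled tips is (2n-5)!!, which is what an exhaustive search has to score. The base-R sketch below computes it; `n_unrooted_trees` is a name chosen here for illustration, not a phangorn function.

```r
# (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5), defined for n >= 3 tips
n_unrooted_trees <- function(n) {
  stopifnot(n >= 3)
  prod(seq(1, 2 * n - 5, by = 2))
}

n_unrooted_trees(7)    # 945
n_unrooted_trees(8)    # 10395
n_unrooted_trees(10)   # 2027025
```

Each extra tip multiplies the count by the next odd number, which is why the approach stops being practical around 10 species.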


- Liam

--
Liam J. Revell
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://phytools.blogspot.com

On 3/28/2011 10:58 AM, Ross Mounce wrote:

Dear Matthew,


Branch and Bound searching is often unnecessary; are you sure you need to do 
this? With enough random addition sequences (relative to dataset size) you 
*will* find all MPTs.

PAUP*, despite being familiar to many of us and very fully-featured, is 
definitely ungainly for modern analyses with its lack of parallelisation.

Have you considered using TNT? http://www.cladistics.com/aboutTNT.html
Wiki Manual here: http://tnt.insectmuseum.org/index.php/Main_Page

Many paleontologists are starting to use this - computationally it's far more 
efficient.
Just remember, as with any and all phylogenetic analyses it's garbage in, garbage 
out: you must know what you're doing and why *before* you start your analysis.


Kind Regards and good luck,


Ross Mounce






Re: [R-sig-phylo] Branch and Bound Maximum Parsimony (Matthew Vavrek)

2011-03-28 Thread Klaus Schliep
Thanks Liam,

I have two small additions; the first one speeds things up but costs
memory, the second one just simplifies things a bit (and is maybe also
a bit faster).

Regards,
Klaus

On 3/28/11, Liam J. Revell liam.rev...@umb.edu wrote:
 Matthew.

 The only thing that I would add is that if you *really* want to do an
 exhaustive search in R, and your species number is small enough to
 permit an exhaustive search (i.e., <= 10), then it is straightforward
 enough to do so:

   require(phangorn)
   # for binary data; type="USER" needs the state labels passed as levels
   data <- read.phyDat(file="filename", type="USER", levels=c("0","1"))
   all.trees <- allTrees(n=length(data), tip.label=names(data), rooted=FALSE)
all.trees <- .uncompressTipLabel(all.trees)
# I know this is ugly
   pscores <- vector()
   for(i in 1:length(all.trees))
     pscores[i] <- parsimony(all.trees[[i]], data)
# forget the last 3 lines, just use
pscores <- parsimony(all.trees, data)
# parsimony allows multiPhylo objects
   minscore <- min(pscores); mp.tree <- all.trees[pscores==minscore]



-- 
Klaus Schliep
Université Paris 6 (Pierre et Marie Curie)
9, Quai Saint-Bernard, 75005 Paris



Re: [R-sig-phylo] Branch and Bound Maximum Parsimony (Matthew Vavrek)

2011-03-28 Thread Joe Felsenstein

Regarding finding all most parsimonious trees by branch-and-bound:

Program Penny in my PHYLIP package could be called from within R,
driven by batch scripts.  You'd have to make them yourself, but it's
not hard.  However Penny can handle only 0/1 characters, and if there
is a multifurcation it will return all the binary trees compatible with
that.  It is also much slower than PAUP*.

It would get you a file of Newick trees.
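
As a rough sketch of the "make them yourself" step: PHYLIP's discrete-character programs read a header line giving the number of species and characters, followed by one line per species with the name padded to 10 characters and then the 0/1 states. The base-R helper below writes such a file from a 0/1 matrix. `write_phylip_01` is a hypothetical name used for illustration, not part of PHYLIP or any R package, and launching Penny and answering its menu is left out because that depends on the local installation.

```r
# Write a 0/1 character matrix (rows = species, rownames = species names)
# as a PHYLIP discrete-character input file.
write_phylip_01 <- function(mat, file) {
  stopifnot(is.matrix(mat), !is.null(rownames(mat)))
  stopifnot(all(mat %in% c(0, 1)))
  stopifnot(all(nchar(rownames(mat)) <= 10))       # names occupy 10 columns
  header  <- sprintf("%5d %5d", nrow(mat), ncol(mat))
  names10 <- sprintf("%-10s", rownames(mat))        # left-justify, pad to 10
  states  <- apply(mat, 1, paste, collapse = "")    # one state string per row
  writeLines(c(header, paste0(names10, states)), file)
  invisible(file)
}

# Toy data, invented purely for illustration
m <- matrix(c(0, 1, 1,
              1, 1, 0),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("A", "Beta"), NULL))
infile <- write_phylip_01(m, tempfile())
readLines(infile)
```

After Penny has run, the Newick file it writes (PHYLIP's default name is outtree) could be read back into R with ape's read.tree().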

Joe

Joe Felsenstein j...@gs.washington.edu
 Department of Genome Sciences and Department of Biology,
 University of Washington, Box 355065, Seattle, WA 98195-5065 USA
