[R-sig-phylo] The Speed of Different Methods For Obtaining the Descendant Tip Taxa per Node

2011-03-06 Thread David Bapst
Hello all,
I'm currently trying to measure a parameter over a large number of
large trees (1700 tips), and part of this calculation requires
knowing the tip taxa descended from each node (I must compare the
difference in tip values among taxa descended from a node). Because I
must do this many times,  I decided to compare the efficiency of
several methods for doing this in various R libraries with
system.time() (as Liam did recently with some BM simulation functions
in some recent blog posts). As I feel that others may benefit from
this comparison of methods, I am posting the results to this list.

Geiger's node.leaves() is much faster than the alternatives, although
at ~13s a run, it is still not a particularly speedy process. I didn't
need the actual tip.labels, so I took node.leaves() and made it as
lean as possible, to produce node.tips(), below. That cut the run time
down to ~9 sec.

node.tips-function (phy, node){
n - length(phy$tip.label)
if (node = n){node}else{
l- numeric()
d- phy$edge[which(phy$edge[,1]==node),2]
for(j in d){if(j = n){l - c(l, j)}else{l-c(l, 
node.tips(phy,j))}}
   l}}

If anyone knows of another alternative that might be faster, I would
greatly appreciate any suggestions.
-Dave Bapst, UChicago Geosci

Using the modified node.leaves() function from geiger, node.tips()

 system.time(desc-sapply(edge_end,function(x) node.tips(res_tree,x)))
   user  system elapsed
   9.390.009.44

Using node.leaves() in geiger

 system.time(desc-sapply(edge_end,function(x) node.leaves(res_tree,x)))
   user  system elapsed
  13.290.02   13.34

Using Descendants() in phangorn

 system.time(desc-sapply(edge_end,function(x) Descendants(res_tree,x)))
   user  system elapsed
  75.600.10   76.83

Using listTips() in adephylo

 system.time(desc_list-c(as.list(1:Ntip(res_tree)),listTips(res_tree)))
   user  system elapsed
  75.272.93   87.28

Using descendants() in phylobase

 system.time(desc-sapply(edge_end,function(x) 
 descendants(res_tree4,x,type=tips)))
   user  system elapsed
 149.560.67  155.15

Using nodeDecendants() in maticce  (note that translating the tree
into ouchtree format was prohibitively very lengthy)

 system.time(res_tree_ou-ape2ouch(res_tree))
   user  system elapsed
  84.010.13   85.34
 system.time(desc-sapply(edge_end,function(x) nodeDescendents(res_tree_ou,x)))
   user  system elapsed
  65.910.23   68.47

-- 
David Bapst
Dept of Geophysical Sciences
University of Chicago
5734 S. Ellis
Chicago, IL 60637
http://home.uchicago.edu/~dwbapst/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Dealing with Bounded Trait Measures

2011-03-06 Thread David Bapst
Marguerite and others-
I had missed that Scales et al paper. Thank for pointing it out. I
have also had to deal with variables that were bounded by 0 and 1
previously, and the use of the logit is something I'll have to try
when I go back to those analyses.

Although I am interested in transformations of such traits for
comparative analyses in general (measuring correlated trait evolution,
model fitting of OU, measurements of rate, etc.), I am mainly
interested in measurements of phylogenetic signal. Is there any reason
to think the kludge (adding an arbitrarily small value to trait
values) would bias a test for phylogenetic signal (i.e. do the trait
show a pattern of inheritance)? I'm worried that measurements of
signal might be prone to odd results with the kludge, because the data
structure would still be quite different from BM expectation. Perhaps,
the threshold model might be of a lot of use here; I'm going to have
to look into that much more.

Thanks for you advice, all!
-Dave


On Sun, Mar 6, 2011 at 12:50 AM, Marguerite Butler mbut...@hawaii.edu wrote:
 Hi David, Liam and everyone,

 Reflecting traits at boundaries or absorbing them is something that can be 
 done, but I guess I'd like to encourage everyone to think carefully about the 
 interpretation of such simulations. What are you trying to model and what 
 does it mean at the end? Doing these boundary manipulations produces odd 
 shaped-distributions of traits at the end of the simulated process. But 
 probably more importantly, does this a good model for the biological process 
 which might be occurring? If so, then great. But I would guess not with 
 traits such as spine length.

 The zero issue is a hard problem. Is it a combination of a discrete trait 
 (presence/absence) and a continuous trait (length)? Or perhaps a better model 
 would be a threshold trait with underlying continuous variation but below 
 some boundary it will not be expressed? Has anyone developed a model that can 
 be used for such a scenario? In the absence of such a model, I've done things 
 like what Enrico suggests, with a kludge solution like adding a very small 
 value to all values to avoid the zero. This doesn't make much difference to 
 the analysis, and avoids the singularity at log(0). If you have a trait that 
 is a frequency, you can use the logit() function instead of log(), as we did 
 in Scales et al 2009.

 Scales J.A., King A.A., and Butler M.A. (2009) Running for your life or 
 running for your dinner: What drives fiber-type evolution in lizard locomotor 
 muscles? Am. Nat.173:543-553.

 Marguerite

 On Mar 5, 2011, at 11:06 AM, Liam J. Revell wrote:

 I'm not sure that this is what Dave has in mind, but if anyone is interested 
 in simulating bounded evolution in R, I just added it to my fastBM() 
 function (code here: 
 http://anolis.oeb.harvard.edu/~liam/R-phylogenetics/fastBM/v0.3/fastBM.R).

 In the process of evolving traits up the tree, I just bounce back any 
 phenotypes that exceed the lower or upper boundary conditions specified by 
 the user (by default they are -Inf and Inf).  I think I did this properly.  
 Feedback welcome though.

 - Liam

 --
 Liam J. Revell
 University of Massachusetts Boston
 web: http://faculty.umb.edu/liam.revell/
 (new) email: liam.rev...@umb.edu
 (new) blog: http://phytools.blogspot.com

 On 3/5/2011 12:55 PM, tgarl...@ucr.edu wrote:
 Hello David, Enrico, et al.,

 I may have lost track of what Dave was originally trying to do, and I am 
 not familiar with all of the options presently available in r for 
 simulating continuously valued traits along a specified phylogenetic tree.  
 However, I wanted to point out that MANY possibilities, including trends, 
 the OU process, and actual limits to trait evolution implemented in several 
 ways, are available in our original DOS program PDSIMUL.EXE that 
 accompanies this paper:

 Garland, T., Jr., A. W. Dickerman, C. M. Janis, and J. A. Jones. 1993. 
 Phylogenetic analysis of covariance by computer simulation. Systematic 
 Biology 42:265-292.

 It has been used many times to look at trends, limits, etc., e.g., in these 
 papers:

 Díaz-Uriarte, R., and T. Garland, Jr. 1996. Testing hypotheses of 
 correlated evolution using phylogenetically independent contrasts: 
 sensitivity to deviations from Brownian motion. Systematic Biology 45:27-47.
 Laurin, M. 2010. Assessment of the relative merits of a few methods to 
 detect evolutionary trends. Syst. Biol. 59:689-704.

 Cheers,
 Ted



 Theodore Garland, Jr.
 Professor
 Department of Biology
 University of California, Riverside
 Riverside, CA 92521
 Office Phone:  (951) 827-3524
 Lab Phone:  (951) 827-5724
 Home Phone:  (951) 328-0820
 Facsimile:  (951) 827-4286 = Dept. office (not confidential)
 Email:  tgarl...@ucr.edu

 Main Departmental page:
 http://www.biology.ucr.edu/people/faculty/Garland.html

 List of all Publications:
 http://www.biology.ucr.edu/people/faculty/Garland/GarlandPublications.html

 Garland and 

Re: [R-sig-phylo] The Speed of Different Methods For Obtaining the Descendant Tip Taxa per Node

2011-03-06 Thread Liam J. Revell

Hi Dave.

It seems like one way to get these values faster would be to count the 
number of descendants as the tree is read in to R.  This is possible 
because when the ) character is reached by the Newick parser, all 
descendant nodes (and tips) have already been created in memory.


This was simple to add to my tree reading function read.newick() [code 
here: 
http://anolis.oeb.harvard.edu/~liam/R-phylogenetics/read.newick/v0.3/read.newick.R].  
Now v0.3 of the function returns a modified phylo object with the 
vector $Ndesc containing the number of descendants from each internal 
and terminal node above the root (in the order of $edge[,2]).  This is 
exactly the same vector as would be obtained by the following commands:


 desc-sapply(tree$edge[,2],function(x) node.leaves(tree,x))
 Ndesc-sapply(desc,function(x) length(x))

Note that read.newick() only reads one tree at a time in phylip format 
(and without node labels) - but reading multiple trees or node labels, 
or reading from nexus format, would quite easy to add if you find that 
this is indeed faster.


In addition, I believe that newick2phylog() creates this list in reading 
a Newick tree into memory as a phylog object.  This is stored as 
phy$Aparam$nlea for phylog object phy.  Unfortunately, it seems like 
newick2phylog() is very slow for large trees.


- Liam

--
Liam J. Revell
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://phytools.blogspot.com


On 3/6/2011 12:47 PM, David Bapst wrote:

Hello all,
I'm currently trying to measure a parameter over a large number of
large trees (1700 tips), and part of this calculation requires
knowing the tip taxa descended from each node (I must compare the
difference in tip values among taxa descended from a node). Because I
must do this many times,  I decided to compare the efficiency of
several methods for doing this in various R libraries with
system.time() (as Liam did recently with some BM simulation functions
in some recent blog posts). As I feel that others may benefit from
this comparison of methods, I am posting the results to this list.

Geiger's node.leaves() is much faster than the alternatives, although
at ~13s a run, it is still not a particularly speedy process. I didn't
need the actual tip.labels, so I took node.leaves() and made it as
lean as possible, to produce node.tips(), below. That cut the run time
down to ~9 sec.

node.tips-function (phy, node){
n- length(phy$tip.label)
if (node= n){node}else{
l- numeric()
d- phy$edge[which(phy$edge[,1]==node),2]
for(j in d){if(j= n){l- c(l, j)}else{l-c(l, 
node.tips(phy,j))}}
l}}

If anyone knows of another alternative that might be faster, I would
greatly appreciate any suggestions.
-Dave Bapst, UChicago Geosci

Using the modified node.leaves() function from geiger, node.tips()


system.time(desc-sapply(edge_end,function(x) node.tips(res_tree,x)))

user  system elapsed
9.390.009.44

Using node.leaves() in geiger


system.time(desc-sapply(edge_end,function(x) node.leaves(res_tree,x)))

user  system elapsed
   13.290.02   13.34

Using Descendants() in phangorn


system.time(desc-sapply(edge_end,function(x) Descendants(res_tree,x)))

user  system elapsed
   75.600.10   76.83

Using listTips() in adephylo


system.time(desc_list-c(as.list(1:Ntip(res_tree)),listTips(res_tree)))

user  system elapsed
   75.272.93   87.28

Using descendants() in phylobase


system.time(desc-sapply(edge_end,function(x) 
descendants(res_tree4,x,type=tips)))

user  system elapsed
  149.560.67  155.15

Using nodeDecendants() in maticce  (note that translating the tree
into ouchtree format was prohibitively very lengthy)


system.time(res_tree_ou-ape2ouch(res_tree))

user  system elapsed
   84.010.13   85.34

system.time(desc-sapply(edge_end,function(x) nodeDescendents(res_tree_ou,x)))

user  system elapsed
   65.910.23   68.47



___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] The Speed of Different Methods For Obtaining the Descendant Tip Taxa per Node

2011-03-06 Thread David Bapst
Klaus-
Oh, that worked rather splendid! Thanks for letting me know.

 system.time(desc-Descendants(res_tree,edge_end))
   user  system elapsed
   1.560.001.56

-Dave

On Sun, Mar 6, 2011 at 3:22 PM, Klaus Schliep klaus.schl...@gmail.com wrote:
 Hi David,

 you can supply Descendents (from phangorn) with a vector instead of
 using sapply. I should have mentioned this in the help file.
 If your vector (edge_end) is long enough (I don't have a exact number
 here) this can be much faster than using sapply as some results are
 reused.
 Here are some timings:

 set.seed(123)
 res_tree = rtree(1700)
 edge_end = unique(res_tree$edge[,1])
 system.time(desc1-sapply(edge_end,function(x) Descendants(res_tree,x)))
   user  system elapsed
  54.51    0.00   54.87
 system.time(desc2-Descendants(res_tree,edge_end))
   user  system elapsed
   0.75    0.00    0.75
 system.time(desc3-sapply(edge_end,function(x) node.tips(res_tree,x)))
   user  system elapsed
   4.07    0.00    4.07


 Cheers,
 Klaus



 On 3/6/11, David Bapst dwba...@uchicago.edu wrote:
 Hello all,
 I'm currently trying to measure a parameter over a large number of
 large trees (1700 tips), and part of this calculation requires
 knowing the tip taxa descended from each node (I must compare the
 difference in tip values among taxa descended from a node). Because I
 must do this many times,  I decided to compare the efficiency of
 several methods for doing this in various R libraries with
 system.time() (as Liam did recently with some BM simulation functions
 in some recent blog posts). As I feel that others may benefit from
 this comparison of methods, I am posting the results to this list.

 Geiger's node.leaves() is much faster than the alternatives, although
 at ~13s a run, it is still not a particularly speedy process. I didn't
 need the actual tip.labels, so I took node.leaves() and made it as
 lean as possible, to produce node.tips(), below. That cut the run time
 down to ~9 sec.

 node.tips-function (phy, node){
       n - length(phy$tip.label)
       if (node = n){node}else{
               l- numeric()
               d- phy$edge[which(phy$edge[,1]==node),2]
               for(j in d){if(j = n){l - c(l, j)}else{l-c(l, 
 node.tips(phy,j))}}
        l}}

 If anyone knows of another alternative that might be faster, I would
 greatly appreciate any suggestions.
 -Dave Bapst, UChicago Geosci

 Using the modified node.leaves() function from geiger, node.tips()

 system.time(desc-sapply(edge_end,function(x) node.tips(res_tree,x)))
    user  system elapsed
    9.39    0.00    9.44

 Using node.leaves() in geiger

 system.time(desc-sapply(edge_end,function(x) node.leaves(res_tree,x)))
    user  system elapsed
   13.29    0.02   13.34

 Using Descendants() in phangorn

 system.time(desc-sapply(edge_end,function(x) Descendants(res_tree,x)))
    user  system elapsed
   75.60    0.10   76.83

 Using listTips() in adephylo

 system.time(desc_list-c(as.list(1:Ntip(res_tree)),listTips(res_tree)))
    user  system elapsed
   75.27    2.93   87.28

 Using descendants() in phylobase

 system.time(desc-sapply(edge_end,function(x)
 descendants(res_tree4,x,type=tips)))
    user  system elapsed
  149.56    0.67  155.15

 Using nodeDecendants() in maticce  (note that translating the tree
 into ouchtree format was prohibitively very lengthy)

 system.time(res_tree_ou-ape2ouch(res_tree))
    user  system elapsed
   84.01    0.13   85.34
 system.time(desc-sapply(edge_end,function(x)
 nodeDescendents(res_tree_ou,x)))
    user  system elapsed
   65.91    0.23   68.47

 --
 David Bapst
 Dept of Geophysical Sciences
 University of Chicago
 5734 S. Ellis
 Chicago, IL 60637
 http://home.uchicago.edu/~dwbapst/

 ___
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



 --
 Klaus Schliep
 Université Paris 6 (Pierre et Marie Curie)
 9, Quai Saint-Bernard, 75005 Paris




-- 
David Bapst
Dept of Geophysical Sciences
University of Chicago
5734 S. Ellis
Chicago, IL 60637
http://home.uchicago.edu/~dwbapst/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Dealing with Bounded Trait Measures

2011-03-06 Thread Marguerite Butler
Hi David,

I am not sure what you mean. If you add a tiny value to all of your data, then 
all you've done is shifted the mean ever so slightly. You then log-transform to 
put the variables on a scale unbounded by 0, and then you measure phylogenetic 
signal. The BM process models evolutionary changes as random draws from a 
normal distribution. So this would be the log-transformed values here.  Adding 
a tiny value to all data doesn't change things very much. Log transformation 
changes things a lot more. So you're modeling a BM process on the 
log-transformed values. But there is nothing illegal about that. You do have 
to think about it, however. 

This makes a lot of sense to me when I'm modeling body size evolution, because 
the model would imply that when you're small, evolutionary changes tend to be 
smaller than when you're big (evolutionary changes on larger animals would tend 
to have bigger jumps on the original scale). Ignoring any clade-specific 
biases, evolutionary changes in the body size of mice would tend to be a lot 
smaller than evolutionary changes in the size of horses, for example. We can 
imagine that some simple mechanistic reasons such as their subunits are smaller 
(they have smaller cells, smaller vertebrae, etc., so adding on or taking away 
would have bigger size effects). This seems biologically reasonable. I am not 
sure if it would apply to your spine size data, but maybe it does. Of course if 
your size variation is not that great, then it doesn't much matter (because 
log-trasformation will then not have much effect on compressing the scale of 
the values).

Anyway, this may be a longer-winded comment than you were asking for. I can't 
think of any reason why adding tiny values would do anything to the 
evolutionary simulation.

Marguerite


On Mar 6, 2011, at 9:53 AM, David Bapst wrote:

 Marguerite and others-
 I had missed that Scales et al paper. Thank for pointing it out. I
 have also had to deal with variables that were bounded by 0 and 1
 previously, and the use of the logit is something I'll have to try
 when I go back to those analyses.
 
 Although I am interested in transformations of such traits for
 comparative analyses in general (measuring correlated trait evolution,
 model fitting of OU, measurements of rate, etc.), I am mainly
 interested in measurements of phylogenetic signal. Is there any reason
 to think the kludge (adding an arbitrarily small value to trait
 values) would bias a test for phylogenetic signal (i.e. do the trait
 show a pattern of inheritance)? I'm worried that measurements of
 signal might be prone to odd results with the kludge, because the data
 structure would still be quite different from BM expectation. Perhaps,
 the threshold model might be of a lot of use here; I'm going to have
 to look into that much more.
 
 Thanks for you advice, all!
 -Dave
 
 
 On Sun, Mar 6, 2011 at 12:50 AM, Marguerite Butler mbut...@hawaii.edu wrote:
 Hi David, Liam and everyone,
 
 Reflecting traits at boundaries or absorbing them is something that can be 
 done, but I guess I'd like to encourage everyone to think carefully about 
 the interpretation of such simulations. What are you trying to model and 
 what does it mean at the end? Doing these boundary manipulations produces 
 odd shaped-distributions of traits at the end of the simulated process. But 
 probably more importantly, does this a good model for the biological process 
 which might be occurring? If so, then great. But I would guess not with 
 traits such as spine length.
 
 The zero issue is a hard problem. Is it a combination of a discrete trait 
 (presence/absence) and a continuous trait (length)? Or perhaps a better 
 model would be a threshold trait with underlying continuous variation but 
 below some boundary it will not be expressed? Has anyone developed a model 
 that can be used for such a scenario? In the absence of such a model, I've 
 done things like what Enrico suggests, with a kludge solution like adding a 
 very small value to all values to avoid the zero. This doesn't make much 
 difference to the analysis, and avoids the singularity at log(0). If you 
 have a trait that is a frequency, you can use the logit() function instead 
 of log(), as we did in Scales et al 2009.
 
 Scales J.A., King A.A., and Butler M.A. (2009) Running for your life or 
 running for your dinner: What drives fiber-type evolution in lizard 
 locomotor muscles? Am. Nat.173:543-553.
 
 Marguerite
 
 On Mar 5, 2011, at 11:06 AM, Liam J. Revell wrote:
 
 I'm not sure that this is what Dave has in mind, but if anyone is 
 interested in simulating bounded evolution in R, I just added it to my 
 fastBM() function (code here: 
 http://anolis.oeb.harvard.edu/~liam/R-phylogenetics/fastBM/v0.3/fastBM.R).
 
 In the process of evolving traits up the tree, I just bounce back any 
 phenotypes that exceed the lower or upper boundary conditions specified by 
 the user (by default they are -Inf and Inf).  I think I 

Re: [R-sig-phylo] The Speed of Different Methods For Obtaining the Descendant Tip Taxa per Node

2011-03-06 Thread Emmanuel Paradis

Hi David and others,

prop.part() with a single tree does what you want:


set.seed(123)
res_tree - rtree(1700)
system.time(desc2 - prop.part(res_tree))

   user  system elapsed
  0.050   0.000   0.053

edge_end - unique(res_tree$edge[,1])
system.time(desc1 - Descendants(res_tree,edge_end))

   user  system elapsed
  0.400   0.000   0.406

This gives the same answer than Descendants:


for (i in seq_along(desc1))

+ stopifnot(all(sort(desc1[[i]]) == sort(desc2[[i]])))




Klaus makes an excellent reminder about R programming: use vectorisation
whenever possible (you'll almost always win to do it).

Your examples are very instructive. One lesson is that it is quite easy
to write R code that tends to be slow. Another one is that it is also
easy to write different functions doing the same thing. I'm quite
impressed to see that several packages have re-implemented something
that has been in ape for some time.

I wish developpers (and users as well) could use these tools from ape
which have been developed specifically for this kind of objectives 
(fast, efficient, and reliable). Is it an issue of 'visibility' of these 
functions? This could be a project idea for the next GSoC.


Cheers,

Emmanuel

PS: if you want to speed-up your computations and you have a multi-core 
processor on your machine (which are common on laptops nowadays), you 
could try the multicore package (phangorn already supports it):


 library(multicore)
 job - parallel(prop.part(res_tree))
 system.time(out - collect(job))
   user  system elapsed
  0.080   0.010   0.002
 out - out[[1]]
 identical(out, desc2)
[1] TRUE

This might be even more interesting with lists of trees.


David Bapst wrote on 07/03/2011 04:32:

Klaus-
Oh, that worked rather splendid! Thanks for letting me know.


system.time(desc-Descendants(res_tree,edge_end))

   user  system elapsed
   1.560.001.56

-Dave

On Sun, Mar 6, 2011 at 3:22 PM, Klaus Schliep klaus.schl...@gmail.com wrote:

Hi David,

you can supply Descendents (from phangorn) with a vector instead of
using sapply. I should have mentioned this in the help file.
If your vector (edge_end) is long enough (I don't have a exact number
here) this can be much faster than using sapply as some results are
reused.
Here are some timings:


set.seed(123)
res_tree = rtree(1700)
edge_end = unique(res_tree$edge[,1])
system.time(desc1-sapply(edge_end,function(x) Descendants(res_tree,x)))

  user  system elapsed
 54.510.00   54.87

system.time(desc2-Descendants(res_tree,edge_end))

  user  system elapsed
  0.750.000.75

system.time(desc3-sapply(edge_end,function(x) node.tips(res_tree,x)))

  user  system elapsed
  4.070.004.07


Cheers,
Klaus



On 3/6/11, David Bapst dwba...@uchicago.edu wrote:

Hello all,
I'm currently trying to measure a parameter over a large number of
large trees (1700 tips), and part of this calculation requires
knowing the tip taxa descended from each node (I must compare the
difference in tip values among taxa descended from a node). Because I
must do this many times,  I decided to compare the efficiency of
several methods for doing this in various R libraries with
system.time() (as Liam did recently with some BM simulation functions
in some recent blog posts). As I feel that others may benefit from
this comparison of methods, I am posting the results to this list.

Geiger's node.leaves() is much faster than the alternatives, although
at ~13s a run, it is still not a particularly speedy process. I didn't
need the actual tip.labels, so I took node.leaves() and made it as
lean as possible, to produce node.tips(), below. That cut the run time
down to ~9 sec.

node.tips-function (phy, node){
  n - length(phy$tip.label)
  if (node = n){node}else{
  l- numeric()
  d- phy$edge[which(phy$edge[,1]==node),2]
  for(j in d){if(j = n){l - c(l, j)}else{l-c(l, 
node.tips(phy,j))}}
   l}}

If anyone knows of another alternative that might be faster, I would
greatly appreciate any suggestions.
-Dave Bapst, UChicago Geosci

Using the modified node.leaves() function from geiger, node.tips()


system.time(desc-sapply(edge_end,function(x) node.tips(res_tree,x)))

   user  system elapsed
   9.390.009.44

Using node.leaves() in geiger


system.time(desc-sapply(edge_end,function(x) node.leaves(res_tree,x)))

   user  system elapsed
  13.290.02   13.34

Using Descendants() in phangorn


system.time(desc-sapply(edge_end,function(x) Descendants(res_tree,x)))

   user  system elapsed
  75.600.10   76.83

Using listTips() in adephylo


system.time(desc_list-c(as.list(1:Ntip(res_tree)),listTips(res_tree)))

   user  system elapsed
  75.272.93   87.28

Using descendants() in phylobase


system.time(desc-sapply(edge_end,function(x)
descendants(res_tree4,x,type=tips)))

   user  system elapsed
 149.560.67  155.15

Using nodeDecendants() in maticce  (note that translating the tree
into ouchtree format was prohibitively very lengthy)