Re: [R-sig-phylo] Why does ace set a random number generator seed?
Hi Dave, The seed is created every time you generate random data (see details in ?.Random.seed): R> exists(".Random.seed") [1] FALSE R> sample(1) [1] 1 R> exists(".Random.seed") [1] TRUE So in your code below the seed is created by calling rtree(). Indeed, with ape 4.1: R> rm(".Random.seed") R> data(bird.orders) R> x <- 1:23 R> exists(".Random.seed") [1] FALSE R> o <- ace(x, bird.orders) R> exists(".Random.seed") [1] FALSE However, with the new "soon-to-be-on-CRAN" version, a seed is created by the same code. The "culprit" is actually Rcpp: the new ape has C++ code linked to Rcpp. For a reason I ignore, it seems that a random seed is initialized when calling a function linked to Rcpp. Here is an example: R> rm(".Random.seed") R> library(RcppEigen) R> exists(".Random.seed") [1] FALSE R> o <- fastLm(1, 1) R> exists(".Random.seed") [1] TRUE The new C++ code in ape is called by reorder(phy, "postorder") which is used by many functions in ape (ace, pic, plot.phylo, vcv, ...). Best, Emmanuel Le 19/10/2017 à 21:00, David Bapst a écrit : Emmanuel, all- I noticed today that a workspace I was working with had a random number seed set in it, but didn't remember setting one. Finally, I discovered the culprit was ace. Here's a reproducible example, demonstrating that a seed exists after running ace: library(ape) tree<-rtree(10) ace(1:10,tree) .Random.seed I am using ape 4.1, and it doesn't seem to be addressed in the forthcoming version, given my reading of the changes log (but I might have missed it). What's going? Why is a random number seed being set? Cheers, -Dave ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
[R-sig-phylo] Brownian Evolution
Hi all, I've been investigating the use of Pagel's lambda and run into a problem comparing trees that have a low speciation rate and no extinction with trees that have a high speciation rate and lots of extinction. I'm comparing phylogenetic signal in two characters. In the first, I simulate a character under Brownian evolution over a 200 taxon tree and then prune it down to 100 taxa. The second character was evolved over the pruned-down tree. I then estimated Pagel's lambda for both characters on the pruned-down tree. Doing this 500 times, I found that the error rate in the estimate of lambda was not the same for the two methods of generating the data. The difference was small but statistically significant. To my understanding though, there shouldn't be a difference. The rate of evolution is the same in both cases and the tree topology and branch lengths are identical. If the characters are evolving in a Brownian way, shouldn't there be, on average, no difference between the characters with respect to phylogenetic signal? Could this be an artifact of using fastBM to simulate the characters? Is there another function that I should be using to simulate a character evolving under Brownian motion? Here's the script I'm using: library(phytools) library(TreeSim) library(adephylo) age<-2 lambda <- 2.0 mu <- 0.5 frac <-0.6 sample_size<-100 smalltrees<-list() Bigtrees<-sim.bd.taxa.age(200,500,lambda,mu,frac,age) ## using sim.bd.taxa.age to get an ultrametric tree with known age subtrees<-list() sig<-vector() sig_sub<-vector() for (i in 1:500){ var<-fastBM(Bigtrees[[i]],nsim=1) tips_to_drop<-sample(1:200, 200-sample_size, replace=FALSE) subtrees[[i]]<-drop.tip(Bigtrees[[i]], tips_to_drop) var1<-fastBM(subtrees[[i]],nsim=1) sig_sub[i]<-phylosig(subtrees[[i]],var,method="lambda")$lambda sig[i]<-phylosig(subtrees[[i]],var1,method="lambda")$lambda} lamb<-(2-sig) ## reflected the data around lambda=1 so that standard deviation has a symmetric distribution lamb<-c(lamb,sig) lamb_sub<-(2-sig_sub) lamb_sub<-c(lamb_sub,sig_sub) var.test(lamb,lamb_sub) Best regards, - Will Gelnaw [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Brownian Evolution
Hi William. Why would you suspect phytools? ;) I didn't figure this out for you, but I did eliminate some possibilities. It does not seem to be due to phytools::fastBM. You can replace fastBM with phytools:sim.corrs or geiger:sim.char as follows: var<-sim.char(Bigtrees[[i]],par=1,root=0)[,,1] ## or var<-sim.corrs(Bigtrees[[i]],matrix(1,1,1)) ## and var1<-sim.char(subtrees[[i]],par=1,root=0)[,,1] ## or var1<-sim.corrs(subtrees[[i]],matrix(1,1,1)) and the result is the same. (Note that all three use different methods for simulation.) Neither does it seem to be due to phytools::phylosig. You can replace phylosig with geiger::fitContinuous as follows: maxlambda<-phytools:::maxLambda(subtrees[[i]]) sig_sub[i]<-fitContinuous(subtrees[[i]],var[subtrees[[i]]$tip.label], model="lambda",bounds=list(lambda=c(0,maxlambda)))$opt$lambda sig[i]<-fitContinuous(subtrees[[i]],var1,model="lambda", bounds=list(lambda=c(0,maxlambda)))$opt$lambda and the result is unchanged. (Note that we need to compute the maximum value of lambda for which the likelihood function is defined otherwise fitContinuous by default bounds optimization on [0,1].) What I also observed is that the variances are unequal, as you report, but not in a particular direction. That is, sometimes sig_sub has a higher variance, and sometimes sig does. If I had to guess, I would hazard that perhaps the distribution of lambda (even after you make it symmetric) is leptokurtic which is why var.test is significant due to idiosyncratic differences in the variance of lambda, even though the values were obtained by the same process. (Invariably lamb_sub and lamb fail a normality test, so this seems like a distinct possibility.) All the best, Liam Liam J. Revell, Associate Professor of Biology University of Massachusetts Boston web: http://faculty.umb.edu/liam.revell/ email: liam.rev...@umb.edu blog: http://blog.phytools.org On 10/20/2017 1:32 AM, William Gelnaw wrote: Hi all, I've been investigating the use of Pagel's lambda and run into a problem comparing trees that have a low speciation rate and no extinction with trees that have a high speciation rate and lots of extinction. I'm comparing phylogenetic signal in two characters. In the first, I simulate a character under Brownian evolution over a 200 taxon tree and then prune it down to 100 taxa. The second character was evolved over the pruned-down tree. I then estimated Pagel's lambda for both characters on the pruned-down tree. Doing this 500 times, I found that the error rate in the estimate of lambda was not the same for the two methods of generating the data. The difference was small but statistically significant. To my understanding though, there shouldn't be a difference. The rate of evolution is the same in both cases and the tree topology and branch lengths are identical. If the characters are evolving in a Brownian way, shouldn't there be, on average, no difference between the characters with respect to phylogenetic signal? Could this be an artifact of using fastBM to simulate the characters? Is there another function that I should be using to simulate a character evolving under Brownian motion? Here's the script I'm using: library(phytools) library(TreeSim) library(adephylo) age<-2 lambda <- 2.0 mu <- 0.5 frac <-0.6 sample_size<-100 smalltrees<-list() Bigtrees<-sim.bd.taxa.age(200,500,lambda,mu,frac,age) ## using sim.bd.taxa.age to get an ultrametric tree with known age subtrees<-list() sig<-vector() sig_sub<-vector() for (i in 1:500){ var<-fastBM(Bigtrees[[i]],nsim=1) tips_to_drop<-sample(1:200, 200-sample_size, replace=FALSE) subtrees[[i]]<-drop.tip(Bigtrees[[i]], tips_to_drop) var1<-fastBM(subtrees[[i]],nsim=1) sig_sub[i]<-phylosig(subtrees[[i]],var,method="lambda")$lambda sig[i]<-phylosig(subtrees[[i]],var1,method="lambda")$lambda} lamb<-(2-sig) ## reflected the data around lambda=1 so that standard deviation has a symmetric distribution lamb<-c(lamb,sig) lamb_sub<-(2-sig_sub) lamb_sub<-c(lamb_sub,sig_sub) var.test(lamb,lamb_sub) Best regards, - Will Gelnaw [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Why does ace set a random number generator seed?
Thanks, Emmanuel. I was not familiar with .Random.seed objects until R CHECK started complaining about such elements in saved workspaces within .rdata files within packages (requiring rm to remove them), and then was mainly familiar with them regarding set.seed() (and so I thought ace must be setting a seed, not just initializing one). It makes sense that any random number generation would incur a seed, and apparently fresh R sessions do not immediately initialize a seed. Thanks to everyone who helped me see my confusion! Just me missing a basic comp sci concept. -Dave On Fri, Oct 20, 2017 at 4:26 AM, Emmanuel Paradiswrote: > Hi Dave, > > The seed is created every time you generate random data (see details in > ?.Random.seed): > > R> exists(".Random.seed") > [1] FALSE > R> sample(1) > [1] 1 > R> exists(".Random.seed") > [1] TRUE > > So in your code below the seed is created by calling rtree(). Indeed, with > ape 4.1: > > R> rm(".Random.seed") > R> data(bird.orders) > R> x <- 1:23 > R> exists(".Random.seed") > [1] FALSE > R> o <- ace(x, bird.orders) > R> exists(".Random.seed") > [1] FALSE > > However, with the new "soon-to-be-on-CRAN" version, a seed is created by the > same code. The "culprit" is actually Rcpp: the new ape has C++ code linked > to Rcpp. For a reason I ignore, it seems that a random seed is initialized > when calling a function linked to Rcpp. Here is an example: > > R> rm(".Random.seed") > R> library(RcppEigen) > R> exists(".Random.seed") > [1] FALSE > R> o <- fastLm(1, 1) > R> exists(".Random.seed") > [1] TRUE > > The new C++ code in ape is called by reorder(phy, "postorder") which is used > by many functions in ape (ace, pic, plot.phylo, vcv, ...). > > Best, > > Emmanuel > > > Le 19/10/2017 à 21:00, David Bapst a écrit : >> >> Emmanuel, all- >> >> I noticed today that a workspace I was working with had a random >> number seed set in it, but didn't remember setting one. Finally, I >> discovered the culprit was ace. Here's a reproducible example, >> demonstrating that a seed exists after running ace: >> >> library(ape) >> tree<-rtree(10) >> ace(1:10,tree) >> .Random.seed >> >> I am using ape 4.1, and it doesn't seem to be addressed in the >> forthcoming version, given my reading of the changes log (but I might >> have missed it). What's going? Why is a random number seed being set? >> >> Cheers, >> -Dave >> > -- David W. Bapst, PhD Postdoc, Ecology & Evolutionary Biology, Univ of Tenn Knoxville Lecturer, Geology & Geophysics, Texas A & M University https://github.com/dwbapst/paleotree Google Calender: https://goo.gl/EpiM4J ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/