[R-sig-phylo] diversitree with multicore
Dear all,

I have been trying to come up with easy ways to split diversitree runs among processors when using multiple trees. The multicore package seems like a good way of doing it, but I have failed to use multicore with diversitree functions. Essentially, I cannot figure out how to tell multicore that the function takes two sets of objects, one containing the likelihood functions and another containing the parameters.

Does anybody have experience and/or advice doing this? Any guidance would be most welcome. Other alternatives for splitting diversitree jobs among processors would also be highly appreciated.

Thanks in advance,
Rafa

--
National Evolutionary Synthesis Center
NESCent http://www.nescent.org/
2024 W. Main Street, Suite A200
Durham, NC 27705
r...@nescent.org
919.668.9107

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] diversitree with multicore
In reply to Rafael Rubio de Casas (Wed, Mar 21, 2012 at 10:58 AM):

Hey Rafael,

I'm using mclapply right now! What I'm doing with it is a trick I often use in lapply and sapply statements. If I have, say, a list 'A' and a vector 'B' of the same length, and I want to analyze each corresponding pair, I do the following:

    mclapply(1:length(A), function(x) myfunctionhere(A[[x]], B[x]))

So you could put each likelihood function in a list, put the corresponding starting parameters in another list of the same length, and then use the code above (indexing the parameter list with [[x]] rather than [x], since it is a list and not a vector).

You can also do more complex things, say if you want to end up with a nested list where each element of A is analyzed with every element of B:

    mclapply(1:length(A), function(x)
        lapply(1:length(B), function(y) myfunctionhere(A[[x]], B[y])))

-Dave

--
David Bapst
Dept of Geophysical Sciences
University of Chicago
5734 S. Ellis
Chicago, IL 60637
http://home.uchicago.edu/~dwbapst/
http://cran.r-project.org/web/packages/paleotree/index.html
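The pairing trick in the message above can be tried without diversitree. The following is only a minimal sketch: `make_lik`, `liks`, and `starts` are hypothetical stand-ins (toy normal log-likelihoods in place of per-tree likelihood functions), and it uses the `parallel` package, which ships with R and provides `mclapply` and `mcmapply`.

```r
library(parallel)

# Hypothetical stand-ins for per-tree likelihood functions: each closure
# carries its own data, much as a likelihood function built per tree would.
make_lik <- function(x) {
  function(pars) sum(dnorm(x, mean = pars[1], sd = pars[2], log = TRUE))
}
set.seed(1)
liks <- lapply(1:4, function(i) make_lik(rnorm(50, mean = i)))

# One starting-parameter vector per likelihood function.
starts <- lapply(1:4, function(i) c(i, 1))

# Forking is not available on Windows, so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Index-based pairing: element i of one list goes with element i of the other.
res_index <- mclapply(seq_along(liks),
                      function(i) liks[[i]](starts[[i]]),
                      mc.cores = n_cores)

# The same pairing with mcmapply, which walks both lists in step.
res_mapply <- mcmapply(function(lik, p) lik(p), liks, starts,
                       SIMPLIFY = FALSE, mc.cores = n_cores)
```

Both calls return a list with one log-likelihood value per pair. `seq_along(liks)` is used instead of `1:length(liks)` so that an empty list produces no iterations.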
Re: [R-sig-phylo] diversitree with multicore
In reply to David Bapst (Mar 21, 2012, at 12:38 PM):

Hi Rafael,

I recently ran a diversitree analysis on a cluster. The only way I got it to work was with the Rmpi package. I couldn't get multicore to detect cores on multiple nodes, so it was limited to the number of cores in a single node (8 in my case); with Rmpi there was no such limit.

To pass multiple arguments, I created a wrapper function that contained the parameter definitions, the likelihood function, and, in my case, the MCMC run. Then I just iterated this wrapper function over my trees, in parallel, using an apply-like function from the Rmpi package.

If you want to see what my script looks like, just let me know and I'd gladly share it with you.

Cheers,
Rafael Maia

---
webpage: http://gozips.uakron.edu/~rm72
A little learning is a dangerous thing; drink deep, or taste not the Pierian spring. (A. Pope)
Graduate Student - Integrated Bioscience
University of Akron
http://gozips.uakron.edu/~shawkey/
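The wrapper-function idea described above might look roughly like the sketch below. It is not the script offered in the message: Rmpi is swapped for a local socket cluster from the `parallel` package so the example runs on one machine, `fit_one` and `datasets` are hypothetical names, and `optim` stands in for the MCMC run.

```r
library(parallel)

# Hypothetical wrapper: everything one tree needs happens inside it, so the
# parallel apply only has to hand over a single argument per task.
fit_one <- function(dat) {
  # Negative log-likelihood for this data set (stand-in for a tree likelihood).
  lik <- function(pars) {
    -sum(dnorm(dat, mean = pars[1], sd = exp(pars[2]), log = TRUE))
  }
  start <- c(mean = 0, log_sd = 0)   # starting parameters
  fit <- optim(start, lik)           # stand-in for the MCMC run
  fit$par
}

set.seed(42)
# Stand-ins for the trees: four simulated data sets.
datasets <- lapply(1:4, function(i) rnorm(100, mean = i))

cl <- makeCluster(2)                 # two local worker processes
fits <- parLapply(cl, datasets, fit_one)
stopCluster(cl)
```

Because the wrapper is self-contained, nothing has to be exported to the workers separately. With diversitree, each worker would also need the package loaded, for example with `clusterEvalQ(cl, library(diversitree))` before the `parLapply` call.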
Re: [R-sig-phylo] diversitree with multicore
Hi Rafael,

I'd agree with RM: take a look at some of the explicit parallelization packages if you're planning to run over multiple trees (I find snowfall takes less typing than Rmpi does). These will let you run on multiple cores or on clusters. Here's a quick example based on my understanding of your question:
https://github.com/cboettig/sandbox/blob/master/r-sig-phylo/parallel.md

-Carl

--
Carl Boettiger
UC Davis
http://www.carlboettiger.info/

On Wed, Mar 21, 2012 at 9:06 AM, Brian O'Meara bome...@utk.edu wrote:

The multicore package has been rolled into the built-in R package parallel in R >= 2.14.0, with slight changes. There's a task view on parallelization that might help: http://cran.r-project.org/web/views/HighPerformanceComputing.html

I've found the foreach package helpful for parallelization, too. It's especially good if you still think in terms of for loops (as I do, sadly, though I'm working on it).

Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info
Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara
Re: [R-sig-phylo] diversitree with multicore
In reply to Carl Boettiger (Wed, Mar 21, 2012 at 10:10 AM):

Hi all,

I am not sure what type of diversitree analysis you are running, but there may be memory issues if you use mclapply or the like (this is especially true with the MCMC version of QuaSSE). An alternative, albeit slightly more complicated, is to write a bash script that uses a PBS job array. Instead of parallelizing the entire script, you can then send each tree to a different node in the cluster and combine the Markov chains afterwards. I am not sure whether this is efficient for your specific problem, but it appears in general to be a good strategy for memory-intensive computations run across a set of trees.

matt
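For the PBS array approach above, the R side of each array job could be sketched as follows. This is an illustration under several assumptions: the scheduler is taken to expose the array index as `PBS_ARRAYID` (the Torque name; PBS Pro uses `PBS_ARRAY_INDEX`), `run_one` and `datasets` are hypothetical stand-ins for the per-tree analysis and the trees, and results are written under `tempdir()` only so the sketch runs anywhere. On a real cluster the output directory would need to be on shared storage.

```r
# Array index set by the scheduler. Defaults to 1 so the script can also be
# tried outside the scheduler. The variable name is an assumption (Torque).
idx <- as.integer(Sys.getenv("PBS_ARRAYID", unset = "1"))

# Hypothetical stand-in for the set of trees.
set.seed(7)
datasets <- lapply(1:4, function(i) rnorm(100, mean = i))

# Stand-in for the per-tree analysis (for example, one MCMC chain).
run_one <- function(dat) c(mean = mean(dat), sd = sd(dat))

# Each job handles only its own element and saves its own result file.
out_dir <- file.path(tempdir(), "chains")  # use a shared directory on a cluster
dir.create(out_dir, showWarnings = FALSE)
out_file <- file.path(out_dir, sprintf("chain_%03d.rds", idx))
saveRDS(run_one(datasets[[idx]]), out_file)

# Afterwards, once all array jobs have finished: gather the saved results.
files <- list.files(out_dir, pattern = "^chain_[0-9]+\\.rds$",
                    full.names = TRUE)
combined <- do.call(rbind, lapply(files, readRDS))
```

Each job loads only the data it needs and exits when done, which is what keeps memory use per node low compared with forking many workers from one large R session.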