Re: [R] Distributed computing with R
No -- the point is that they are mostly orthogonal solutions, not mutually exclusive. "Mostly" means that sometimes 1 + 1 = 0.5, i.e. negative interactions can happen if you do not think through what each layer is doing for scheduling, job transfer, and migration.

For example, you can use SGE, openMosix, and snow-on-PVM (or another message-passing library) all together. SGE and openMosix might not be too happy, since they are trying to do the same thing at different levels, but it would work (perhaps inefficiently).

best,
-tony

Paul Gilbert writes:

Tony, thanks, this categorization has cleared up a few things I had found confusing. But should I read this to mean that snow would not run on a system- or kernel-level parallel setup?

--
http://www.analytics.washington.edu/
Biomedical and Health Informatics, University of Washington
Biostatistics, SCHARP/HVTN, Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC (M/W): 206-667-7025 FAX=206-667-4812 | use Email

Mailing list: https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Distributed computing with R
Tony,

Thanks, this categorization has cleared up a few things I had found confusing. But should I read this to mean that snow would not run on a system- or kernel-level parallel setup?

Thanks,
Paul Gilbert

A.J. Rossini wrote:

Also see snow (which simplifies parallel programming and sits on top of rpvm, Rmpi, or a socket-based system).

It depends on the level at which you want parallelism:

1. User level -- libraries such as PVM, LAM/MPI, etc. will help, and there are various packages which provide an API to those.
2. System level -- Condor, Sun Grid Engine / Maui scheduler, and similar queueing/batching/allocation daemons will help (computational grid software is usually a generalization of this which adds authentication and resource allocation).
3. Kernel level -- openMosix, BProc, etc. will help.

They are mostly orthogonal. Mostly... :-).

best,
-tony
Re: [R] Distributed computing with R
snow works well on an openMosix system, and is actually quite convenient since you don't have to worry about which process is going to which computer. The kernel migrates the processes automatically (usually).

-roger

Paul Gilbert wrote:

Tony, thanks, this categorization has cleared up a few things I had found confusing. But should I read this to mean that snow would not run on a system- or kernel-level parallel setup?
RE: [R] Distributed computing with R
Saroj,

I haven't had any experience with R in a distributed computing environment, but I have used Condor (http://www.cs.wisc.edu/condor/) with several applications. It is free, is distributed under a public license, and is easy to use, but it is not open source. Since R code can be run in batch mode, this might be a reasonable option.

HTH
steve

-----Original Message-----
On Behalf Of: Saroj Mohapatra
Sent: Wednesday, June 02, 2004 7:12 PM
Subject: [R] Distributed computing with R

Dear all,

We started using R for data analysis a few months ago and find it useful. We are planning to acquire a high-end dedicated system for microarray data analysis and are thinking of a distributed environment. I would appreciate it if someone could send some pointers on how to choose a proper hardware configuration, software (R or other software, esp. MATLAB), issues in setting up the cluster, etc.

Does anyone here have experience with R on a cluster? Does it provide significant benefits in processing time? Is setting up the cluster more difficult than using R on it?

Thanks.

Saroj K Mohapatra, MD
Research Associate
Tainsky Lab
Karmanos Cancer Institute
Wayne State University School of Medicine
110 E. Warren, Room 311
Detroit MI 48201
313-833-0715 x2424
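Steve's remark that R code can be batched up for a queueing system can be sketched roughly as follows. This is only an illustration, not anything Condor itself prescribes: the script name `chunk_job.R`, the number of chunks, the gene identifiers, and the placeholder computation are all hypothetical. The idea is that the scheduler starts the same script once per chunk and passes the chunk number on the command line.

```r
# chunk_job.R -- hypothetical batch script; every name here is an assumption.
# A queueing system would launch it once per chunk, e.g. "Rscript chunk_job.R 3".

# Read the chunk number from the command line; default to 1 when run by hand.
args <- commandArgs(trailingOnly = TRUE)
chunk_id <- if (length(args) >= 1) as.integer(args[1]) else 1L

n_chunks <- 4L    # assumed number of independent jobs
gene_ids <- 1:100 # stand-in for the real list of items to analyse

# Select every n_chunks-th item, starting at position chunk_id.
chunk_of <- function(ids, chunk_id, n_chunks) {
  ids[(seq_along(ids) - 1) %% n_chunks == chunk_id - 1]
}

my_ids <- chunk_of(gene_ids, chunk_id, n_chunks)

# Placeholder computation standing in for the real per-gene analysis.
result <- sapply(my_ids, function(g) g^2)

# Each job writes its own file, so jobs never share state while running.
saveRDS(result, sprintf("result_%03d.rds", chunk_id))
```

Because each job touches only its own slice and its own output file, the pieces can run in any order on any machine, and a final short script can read the `result_*.rds` files back with `readRDS` and combine them.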
Re: [R] Distributed computing with R
If you do some programming, you might want to look at MPI. R extensions for MPI exist (Rmpi). It all depends a lot on what kind of usage you envisage for your cluster. OpenPBS is also a good batch system. Maybe you also want to look at Mosix, which is a modified Linux system. Depending on what your ultimate computing resources are, maybe also look at the Globus Toolkit.

Parallel programming is fun. The world is inherently parallel!

Ciao,
-Armin.

Armin Roehrl, http://www.approximity.com
We manage risk
Re: [R] Distributed computing with R
Also see snow (which simplifies parallel programming and sits on top of rpvm, Rmpi, or a socket-based system).

It depends on the level at which you want parallelism:

1. User level -- libraries such as PVM, LAM/MPI, etc. will help, and there are various packages which provide an API to those.
2. System level -- Condor, Sun Grid Engine / Maui scheduler, and similar queueing/batching/allocation daemons will help (computational grid software is usually a generalization of this which adds authentication and resource allocation).
3. Kernel level -- openMosix, BProc, etc. will help.

They are mostly orthogonal. Mostly... :-).

best,
-tony

Armin Roehrl writes:

If you do some programming, you might want to look at MPI. R extensions for MPI exist (Rmpi). It all depends a lot on what kind of usage you envisage for your cluster. OpenPBS is also a good batch system. Maybe you also want to look at Mosix, which is a modified Linux system. Depending on what your ultimate computing resources are, maybe also look at the Globus Toolkit.

Parallel programming is fun. The world is inherently parallel!
Re: [R] Distributed computing with R
I would suggest installing PVM or LAM/MPI and using the R packages `snow' and `rpvm' (or `Rmpi'). I've found the `snow' package very simple to use and useful for quick and dirty solutions. I've used `snow' with an openMosix setup and on a simple cluster of workstations without any scheduler. openMosix is nice because you don't have to worry about which process goes where, but that's not to say it doesn't have its own difficulties.

Overall, my experience with parallel computing in R has been a little clunky, but that's mostly because the problems I work on don't benefit much from such a setup.

-roger

Saroj Mohapatra wrote:

We started using R for data analysis a few months ago and find it useful. We are planning to acquire a high-end dedicated system for microarray data analysis and are thinking of a distributed environment. I would appreciate it if someone could send some pointers on how to choose a proper hardware configuration, software (R or other software, esp. MATLAB), issues in setting up the cluster, etc.

Does anyone here have experience with R on a cluster? Does it provide significant benefits in processing time? Is setting up the cluster more difficult than using R on it?
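The kind of quick `snow' usage Roger describes might look roughly like the sketch below. It assumes the `snow' package is installed and uses its socket transport, so neither PVM nor LAM/MPI is needed to try it; the two "localhost" workers and the `square` function are placeholders, not anything from the thread.

```r
# Minimal sketch: a two-worker socket cluster on the local machine.
# Assumes the snow package is installed; host names are placeholders
# and would be replaced by the names of real cluster nodes.
library(snow)

cl <- makeCluster(c("localhost", "localhost"), type = "SOCK")

# Hypothetical per-item computation standing in for real analysis code.
square <- function(i) i^2

# parLapply splits the input across the workers and returns a list
# in the same order as the input.
squares <- parLapply(cl, 1:4, square)

# Shut the worker processes down when finished.
stopCluster(cl)

print(unlist(squares))
```

With rpvm or Rmpi installed, the same calls should work after changing `type` to "PVM" or "MPI" and giving a number of workers instead of host names, which is what makes `snow' convenient as a layer on top of those libraries.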