Moreno,

Because much of my processor time is spent on basic linear algebra operations (matrix inversion, quadratic programming, etc.), I recently recompiled R against a BLAS implementation (ATLAS) tuned for parallel processing. The speed improvement for linear algebra operations was significant on multiprocessor machines.
For example, using:

    system.time(x <- replicate(10, matrix(rnorm(N^2), N, N) %*% matrix(rnorm(N^2), N, N)))

I benchmarked speed improvements of 10-20% where N is small (10-100) and speed improvements of up to 6x (e.g. 8 seconds vs 48 seconds) when N is large (1000+). So for users with lots of linear algebra calculations who are interested in parallel processing, I'd recommend always starting by (re-)compiling R against a customized BLAS, if they have not done so already. ATLAS and GOTO are the two most common BLAS implementations that I know of.

As for true parallel processing, I have not yet tried the aforementioned R packages, but I did code up an internal package for running very large simulations in parallel, in which a simple script is re-run on multiple data sets. In this example I stored each data set in a different numbered directory. The R script would go through each directory, in order, looking for a flag.txt file. If no such file exists, the worker creates a flag.txt in that directory, indicating that the directory is in use, and starts processing the data. This allows multiple processors/computers to work on very large simulations in parallel without duplicating work. At one point I was able to muster 15-20 CPUs from spare Windows and Linux boxes to reduce the simulation time from days to hours. Such a system would also be easy to re-create without setting up MPI/PVM if your simulation / project can be divided up in a similar way.

Cheers,

Robert

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Martin Morgan
Sent: Thursday, May 25, 2006 1:17 PM
To: [EMAIL PROTECTED]
Cc: r-help
Subject: Re: [R] parallel computing

Hi Moreno --

snow provides an easy interface to simple parallel types of calculations (e.g., lapply in parallel). I quickly wanted to have more direct control over how parallel computations were calculated, and have been using Rmpi.
Though in principle snow and Rmpi are 'easy' to use, I found that they actually require a certain amount of understanding about R objects and evaluation, and the underlying communication library (MPI, or PVM).

Hope that helps,

Martin

"[EMAIL PROTECTED]" <[EMAIL PROTECTED]> writes:

> Dear R users,
>
> I have access to a Sun cluster with multiple processors, a lot of
> RAM and with RedHat installed. I want to take advantage of its
> power for a very time-consuming R routine.
>
> Which package do I have to use? I know there are snow, snowFT and
> other packages. Which is the best for my purpose? Does anyone have
> experience with this?
>
> Thanks in advance.
>
> Moreno
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
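
For readers who want to see what the "lapply in parallel" style mentioned in the thread looks like, here is a minimal sketch. It is an illustration only, not code from either poster. It assumes the base `parallel` package (which provides `makeCluster`, `parLapply` and `stopCluster` in the style of snow) rather than snow itself, and `slow_task`, `run_parallel` and the worker count of 2 are hypothetical names and values chosen for the example.

```r
library(parallel)

# Hypothetical stand-in for one expensive, independent unit of work.
slow_task <- function(n) {
  m <- matrix(rnorm(n^2), n, n)
  sum(diag(m %*% m))
}

# Apply slow_task to each element of `sizes` on a small local cluster.
# The number of workers is an assumption; adjust it to the machine.
run_parallel <- function(sizes, workers = 2) {
  cl <- makeCluster(workers)
  on.exit(stopCluster(cl))   # always shut the workers down
  parLapply(cl, sizes, slow_task)
}

results <- run_parallel(c(50, 100, 150))
```

Each task here is independent and uses only base R functions, so nothing has to be exported to the workers. Tasks that depend on other objects or packages need those made available on each worker first, which is part of the "understanding about R objects and evaluation" noted in the thread.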

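
The flag-file scheme described in the thread (numbered directories, each claimed by writing a flag.txt) could be sketched roughly as below. This is a reconstruction under stated assumptions, not the internal package itself: `claim_directory`, `run_worker` and `process_fun` are hypothetical names, and directories are assumed to have purely numeric names under a single root.

```r
# Try to claim one data-set directory by writing a flag file.
# Returns TRUE if this worker claimed it, FALSE if it was already claimed.
# Caveat: checking and then creating the file is not atomic, so two workers
# polling at the same instant could in principle both claim a directory.
claim_directory <- function(dir, flag_name = "flag.txt") {
  flag <- file.path(dir, flag_name)
  if (file.exists(flag)) return(FALSE)
  writeLines(paste("claimed by pid", Sys.getpid()), flag)
  TRUE
}

# Walk the numbered directories in order and process each unclaimed one.
# `process_fun` is a placeholder for the real simulation script; it is
# called with the path of the claimed directory.
run_worker <- function(root, process_fun) {
  dirs <- list.dirs(root, recursive = FALSE)
  dirs <- dirs[grepl("^[0-9]+$", basename(dirs))]   # numbered directories only
  dirs <- dirs[order(as.integer(basename(dirs)))]
  done <- character(0)
  for (d in dirs) {
    if (claim_directory(d)) {
      process_fun(d)
      done <- c(done, basename(d))
    }
  }
  done   # names of the directories this worker processed
}
```

Starting the same script on each spare machine, pointed at a shared `root`, would reproduce the behaviour described: each worker skips directories that already contain a flag and works through the rest.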