On Apr 7, 2007, at 10:56 AM, Ramon Diaz-Uriarte wrote:

> Dear All,
>
> The "clients.txt" file of the latest Rserve package, by Simon
> Urbanek, says, regarding its R client,
>
> "(...) a simple R client, i.e. it allows you to connect to Rserve
> from R itself. It is very simple and limited, because Rserve was
> not primarily meant for R-to-R communication (there are better ways
> to do that), but it is useful for quick interactive connection to
> an Rserve farm."
>
> Which are those better ways to do it? I am thinking about using
> Rserve to have an R process send jobs to a bunch of Rserves in
> different machines. It is like what we could do with Rmpi (or pvm),
> but without the MPI layer. Therefore, presumably it'd be easier to
> deal with network problems, machine's failures, using checkpoints,
> etc. (i.e., to try to get better fault tolerance).
>
> It seems that Rserve would provide the basic infrastructure for
> doing that and saves me from reinventing the wheel of using
> sockets, etc, directly from R.
>
> However, Simon's comment about better ways of R-to-R communication
> made me wonder if this idea really makes sense. What is the catch?
> Have other people tried similar approaches?
>

I was referring to direct R-to-R communication using sockets plus
'serialize' in R, or to the 'snow' package for parallel processing.
The latter could be useful for what you have in mind, because it
includes a socket-based implementation that lets you spawn multiple
children (across multiple machines) and collect their results. It uses
regular rsh or ssh to start the jobs, so if you can use those, it
should work for you. 'snow' also has PVM and MPI implementations; the
PVM one is really easy to set up (on unix), and that is what I was
using for parallel computing in R on a cluster.

Rserve is sort of comparable, but in addition it provides the spawning
infrastructure due to its client/server concept. What it doesn't have
are the convenience functions that snow provides, such as
clusterApply. Thinking about it, it would actually be possible to add
them, although I admit that the original goal of Rserve was not
parallel computing :). The idea was to have one Rserve server and
multiple clients, whereas in 'snow' you sort of have one client and
multiple servers. You could spawn multiple Rserves on multiple
machines, but Rserve itself doesn't provide any load balancing out of
the box, so you'd have to do that yourself.

I don't know if that helps... :)

Cheers,
Simon
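
For illustration, here is a minimal sketch of the two approaches
mentioned above: turning an R object into bytes with 'serialize' (the
form in which it could travel over a socket), and the "one client,
multiple workers" pattern with clusterApply. It is an assumption of
this sketch, not something from the message, that the 'parallel'
package bundled with current R is used in place of 'snow' (its socket
clusters derive from snow's). The two local workers, the 'job' list
and the squaring function are all made up for the example.

```r
# Sketch only: 'parallel' stands in for 'snow' here (an assumption);
# the worker count, the 'job' object and the function are illustrative.
library(parallel)

# 1. serialize() turns an R object into a raw vector that could be
#    written to a socket connection; unserialize() restores it.
job <- list(id = 1L, data = c(2, 4, 6))
payload <- serialize(job, connection = NULL)
restored <- unserialize(payload)

# 2. One client, several workers: start two socket workers on the
#    local machine, hand each element of 1:4 to a worker, and collect
#    the results as a list in the same order as the input.
cl <- makePSOCKcluster(2)
squares <- clusterApply(cl, 1:4, function(x) x^2)
stopCluster(cl)

print(identical(job, restored))
print(unlist(squares))
```

To use several machines, makePSOCKcluster can be given a character
vector of host names instead of a worker count, which would normally
require being able to start R on those hosts (for example via ssh).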
