On 4 October 2007 at 06:37, Luke Tierney wrote:
| > Yes, my bad. But it also hangs with argument count=3 (which I had tried, but
| > my mail was wrong.)
|
| Any chance the snow workers are picking up another version of Rmpi, eg
| a LAM one? Might happen if you have R_SNOW_LIB set and a Rmpi
| installed there. Otherwise starting with outfile=something may help.
| Let me know what you find out -- I'd like to make the snow
| configuration process more bullet-proof.

I generally don't have any environment variables set, so I'm not sure. I'll
try to see what I can find.

| > | count=mpi.comm.size(0)-1 is used. If you start R alone, this will return
| > | count=0 since there is only one member (master). I do not know why snow
| > | did not use count=mpi.universe.size()-1 to find total nodes available.
| >
| > How would it know total nodes? See below re hostfile.
| >
| > | Anyway after using
| > |    cl=makeMPIcluster(count=3),
| > | I was able to run parApply function.
| > |
| > | I tried
| > |    R -> library(Rmpi) -> library(snow) -> c1=makeMPIcluster(3)
| > |
| > | Also
| > |    mpirun -host hostfile -np 1 R --no-save
| > |    library(Rmpi) -> library(snow) -> c1=makeMPIcluster(3)
| > |
| > | Hao
| > |
| > | PS: hostfile contains all nodes info so in R mpi.universe.size() returns
| > | right number and will spawn to remote nodes.
| >
| > So we depend on a correct hostfile? As I understand Open MPI, this is
| > deprecated:
| >
| > # This is the default hostfile for Open MPI. Notice that it does not
| > # contain any hosts (not even localhost). This file should only
| > # contain hosts if a system administrator wants users to always have
| > # the same set of default hosts, and is not using a batch scheduler
| > # (such as SLURM, PBS, etc.).
| >
| > I am _very_ interested in running Open MPI and Rmpi under slurm (which we
| > added to Debian as source package slurm-llnl) so it would be nice if this
| > could be rewritten to not require a hostfile as this seems to be how
| > upstream is going.
|
| To work better with batch scheduling environments where spawning might
| be technically or politically problematic I have been trying to improve
| the RMPISNOW script that can be used with LAM as
|
|    mpirun -np 3 RMPISNOW
|
| and then either
|
|    cl <- makeCluster() # no argument
|
| or
|
|    cl <- makeCluster(2) # mpi rank - 1 (or less I believe)
|
| (the default type for makeCluster becomes MPI in this case).
| This seems to work reasonably well in LAM and I think I can get it to work
| similarly in OpenMPI -- will try in the next day or so. Both LAM and
| OpenMPI provide environment variables so shell scripts can determine
| the mpirank, which is useful for getting --slave and output redirect
| to the workers. I haven't figured out anything analogous for
| MPIC/MPICH2 yet.

Yes, out on a run I also realized that I can't just ask Rmpi to work without
a hostfile -- the info must come from somewhere.

That said, it still fails with a minimal slurm example using srun. I.e.

   [EMAIL PROTECTED]:~> cat /tmp/rmpi.r
   #!/usr/bin/env r

   library(Rmpi)
   library(snow)
   cl <- makeMPIcluster(count=1)
   print("Hello\n")

does not make it through makeMPIcluster either and just hangs if I do:

   [EMAIL PROTECTED]:~> srun -N 1 /tmp/rmpi.r

Dirk

-- 
Three out of two people have difficulties with fractions.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
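
Luke's remark above, that LAM and OpenMPI export environment variables from which a shell script can work out the MPI rank, can be illustrated with a small sketch. This is not the RMPISNOW script itself. The helper names `mpi_rank` and `mpi_role` are hypothetical. `LAMRANK` is the variable set by LAM's mpirun, and `OMPI_COMM_WORLD_RANK` is the one set by Open MPI 1.3 and later, which postdates this thread, so an Open MPI of this vintage may need a different variable. A wrapper along these lines could then start R normally on rank 0 and with --slave and redirected output on the other ranks.

```shell
#!/bin/sh
# Sketch only: the helper names are hypothetical, and the variable names are
# assumptions about the launcher (LAMRANK for LAM/MPI, OMPI_COMM_WORLD_RANK
# for Open MPI 1.3 and later).

# Print the MPI rank found in the environment; return 1 if none is set.
mpi_rank() {
    if [ -n "${OMPI_COMM_WORLD_RANK:-}" ]; then
        echo "$OMPI_COMM_WORLD_RANK"
    elif [ -n "${LAMRANK:-}" ]; then
        echo "$LAMRANK"
    else
        return 1
    fi
}

# Print "master" for rank 0, "worker" for any other rank, and "unknown"
# (with a non-zero status) when no rank variable is available.
mpi_role() {
    rank=$(mpi_rank) || { echo "unknown"; return 1; }
    if [ "$rank" -eq 0 ]; then
        echo "master"
    else
        echo "worker"
    fi
}
```

When the script is started directly, or by a launcher that sets neither variable, `mpi_role` reports "unknown", which gives the wrapper a place to fail with a clear message rather than hang.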