Re: [R] snow's makeCluster hanging (using Rmpi)
On Tuesday 07 November 2006 19:28, Randall C Johnson [Contr.] wrote:
> On 11/7/06 11:28 AM, Ramon Diaz-Uriarte [EMAIL PROTECTED] wrote:
> > On Tuesday 07 November 2006 15:56, Randall C Johnson [Contr.] wrote:
> > > Hello everyone,
> > >
> > > I've been fiddling around with the snow and Rmpi packages on my new Intel Mac, and have run into a few problems. When I make a cluster on my machine, both slaves start up just fine, and everything works as expected. When I try to make a cluster including another networked machine it hangs.
> > >
> > > I've followed the suggestions at
> > > http://finzi.psych.upenn.edu/R/Rhelp02a/archive/83086.html
> > > and
> > > http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html
> > > but to no avail. Everything seems to start up fine using lamboot, but then hangs when making the cluster in R. Making a cluster with 2 slaves seems to work fine, but if I increase the number (to use the networked machines) it hangs again. I've tried networking to another Mac, and also to a machine running Red Hat Linux. Both machines can set up their own local clusters.
> > >
> > > Does anyone have any ideas?
> >
> > Dear Randy,
> >
> > A few suggestions:
> >
> > a) make sure there are no firewalls; I assume this is actually the case, but anyway;
>
> I don't think I have any firewalls running. I checked and they all seem to be disabled...

You can use (under GNU/Linux at least) the command (as root)

    iptables -L

If there is no iptables-based firewall you should see something like:

    Chain INPUT (policy ACCEPT)
    target     prot opt source               destination

    Chain FORWARD (policy ACCEPT)
    target     prot opt source               destination

    Chain OUTPUT (policy ACCEPT)
    target     prot opt source               destination

Make sure this is OK on all the machines.

> > b) what happens if you lamboot outside R (and create a universe with a local and a networked machine) and then you do: lamexec -np 6 hostname?
>
> This prints out the host names of each machine as expected.

OK, so it's not lam itself (so a) is probably unneeded).

> > c) are the Rmpi and snow installed in the same directories in the different machines? are there version differences in Rmpi (or Snow) between machines?
>
> I've installed the same versions, but they are in different directories...

I think I remember that having Rmpi and Snow in different directories tended to cause problems. Now, I always place them in the same directory. I think that some Rmpi sh script looks for other scripts, and if they are not where it expects them, it fails.

> I also tried an example per Luke Tierney's suggestion using only Rmpi, and I get the following error when trying to spawn the Rslaves after starting up with lamboot (outside of R). I tried to use laminfo, but I'm not sure what I'm looking for or how to use the information given...
>
>     library(Rmpi)
>     mpi.spawn.Rslaves()
>     -----------------------------------------------------------------
>     It seems that [at least] one of the child processes that was
>     started by MPI_Comm_spawn* chose a different RPI than the parent
>     MPI application. For example, one (of the) child process(es)
>     that differed from the parent is shown below:
>
>         Parent application: MPI_Comm_spawn
>         Child MPI_COMM_WORLD rank usysv (v7.1.0): 0
>
>     All MPI processes must choose the same RPI module and version
>     when they start. Check your SSI settings and/or the local
>     environment variables on each node.
>     -----------------------------------------------------------------
>     R(26444) malloc: *** Deallocation of a pointer not malloced: 0x16379a0; This could be a double free(), or free() called with the middle of an allocated block; Try setting environment variable MallocHelp to see tools to help debug
>     Error in mpi.comm.spawn(slave = system.file("Rslaves.sh", package = "Rmpi"), :
>         MPI_Error_string: unclassified

Now that is way over my head. A few things I'd check:

Are you mixing 32-bit with 64-bit machines? (I've done that in the past, x86 and x86_64, without apparent problems, but I've never used Macs for this.) Can you try using two different machines with the same architecture?

What about gcc compilers: are you using very different versions on each machine?

Best,

R.

> > HTH,
> >
> > R.
> >
> > > Thanks,
> > > Randy
> > >
> > > sessionInfo()
> > > R version 2.4.0 Patched (2006-10-03 r39576)
> > > i386-apple-darwin8.8.2
> > >
> > > locale:
> > > C
> > >
> > > attached base packages:
> > > [1] methods   stats     graphics  grDevices utils     datasets
> > > [7] base
> > >
> > > other attached packages:
> > >    Rmpi    snow
> > >   0.5-3   0.2-2

--
Ramón Díaz-Uriarte
Bioinformatics
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900
http://ligarto.org/rdiaz
PGP KeyID: 0xE89B3462
(http://ligarto.org/rdiaz/0xE89B3462.asc)
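The firewall check in a) above can be made mechanical when it has to be repeated on several machines. What follows is only a sketch: `check_iptables_open` is a hypothetical helper name, not part of iptables, and it assumes the `iptables -L` layout quoted in this message (a `Chain ... (policy ...)` line, a `target ...` header, then zero or more rule lines). It is deliberately conservative and reports "filtered" for any listed rule, even a harmless one.

```shell
#!/bin/sh
# check_iptables_open (hypothetical helper): reads `iptables -L` output on
# stdin and prints "open" when every chain has policy ACCEPT and lists no
# rules at all, "filtered" otherwise.
check_iptables_open() {
    awk '
        /^Chain / {
            # Built-in chains show "(policy X)"; anything but ACCEPT filters.
            if ($0 ~ /\(policy / && $0 !~ /\(policy ACCEPT/) bad = 1
            next
        }
        /^target[ \t]/ { next }   # column header line
        /^[ \t]*$/     { next }   # blank separator between chains
        { bad = 1 }               # any other line is treated as a rule
        END { print (bad ? "filtered" : "open") }
    '
}

# Usage (as root, on each machine in the cluster):
#   iptables -L | check_iptables_open
```

This only covers iptables-based firewalls under GNU/Linux; the Macs in this thread would need their own check.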
[R] snow's makeCluster hanging (using Rmpi)
Hello everyone,

I've been fiddling around with the snow and Rmpi packages on my new Intel Mac, and have run into a few problems. When I make a cluster on my machine, both slaves start up just fine, and everything works as expected. When I try to make a cluster including another networked machine it hangs.

I've followed the suggestions at
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/83086.html
and
http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html
but to no avail. Everything seems to start up fine using lamboot, but then hangs when making the cluster in R. Making a cluster with 2 slaves seems to work fine, but if I increase the number (to use the networked machines) it hangs again. I've tried networking to another Mac, and also to a machine running Red Hat Linux. Both machines can set up their own local clusters.

Does anyone have any ideas?

Thanks,
Randy

sessionInfo()
R version 2.4.0 Patched (2006-10-03 r39576)
i386-apple-darwin8.8.2

locale:
C

attached base packages:
[1] methods   stats     graphics  grDevices utils     datasets
[7] base

other attached packages:
   Rmpi    snow
  0.5-3   0.2-2

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Randall C Johnson
Bioinformatics Analyst
SAIC-Frederick, Inc (Contractor)
Laboratory of Genomic Diversity
NCI-Frederick, P.O. Box B
Bldg 560, Rm 11-85
Frederick, MD 21702
Phone: (301) 846-1304
Fax: (301) 846-1686
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] snow's makeCluster hanging (using Rmpi)
On Tuesday 07 November 2006 15:56, Randall C Johnson [Contr.] wrote:
> Hello everyone,
>
> I've been fiddling around with the snow and Rmpi packages on my new Intel Mac, and have run into a few problems. When I make a cluster on my machine, both slaves start up just fine, and everything works as expected. When I try to make a cluster including another networked machine it hangs.
>
> I've followed the suggestions at
> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/83086.html
> and
> http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html
> but to no avail. Everything seems to start up fine using lamboot, but then hangs when making the cluster in R. Making a cluster with 2 slaves seems to work fine, but if I increase the number (to use the networked machines) it hangs again. I've tried networking to another Mac, and also to a machine running Red Hat Linux. Both machines can set up their own local clusters.
>
> Does anyone have any ideas?

Dear Randy,

A few suggestions:

a) make sure there are no firewalls; I assume this is actually the case, but anyway;

b) what happens if you lamboot outside R (and create a universe with a local and a networked machine) and then you do: lamexec -np 6 hostname?

c) are the Rmpi and snow installed in the same directories in the different machines? are there version differences in Rmpi (or Snow) between machines?

HTH,

R.

> Thanks,
> Randy
>
> sessionInfo()
> R version 2.4.0 Patched (2006-10-03 r39576)
> i386-apple-darwin8.8.2
>
> locale:
> C
>
> attached base packages:
> [1] methods   stats     graphics  grDevices utils     datasets
> [7] base
>
> other attached packages:
>    Rmpi    snow
>   0.5-3   0.2-2

--
Ramón Díaz-Uriarte
Bioinformatics
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900
http://ligarto.org/rdiaz
PGP KeyID: 0xE89B3462
(http://ligarto.org/rdiaz/0xE89B3462.asc)
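Suggestion b) above (boot LAM outside R, then run `hostname` through `lamexec`) might be scripted roughly as follows. This is a sketch under assumptions: `distinct_hosts` and `lam_smoke_test` are hypothetical helper names, the boot schema file passed as the first argument is one you write yourself (one host per line), and the script relies only on the `lamboot`, `lamexec -np N` and `lamhalt` commands from LAM/MPI.

```shell
#!/bin/sh
# distinct_hosts (hypothetical helper): reads one hostname per line on stdin
# and prints how many different hosts answered.
distinct_hosts() {
    sort -u | awk 'NF { n++ } END { print n + 0 }'
}

# lam_smoke_test HOSTFILE NPROCS (hypothetical helper): boot a LAM universe
# from the given boot schema, run `hostname` on NPROCS processes, print the
# number of distinct hosts that replied, then shut the universe down.
lam_smoke_test() {
    lamboot -v "$1" || return 1
    lamexec -np "$2" hostname | distinct_hosts
    lamhalt
}

# Usage, with a boot schema listing the local and the networked machine:
#   lam_smoke_test ./lamhosts 6
```

If the printed count is lower than the number of machines in the boot schema, the problem lies with LAM or the network rather than with Rmpi or snow.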
Re: [R] snow's makeCluster hanging (using Rmpi)
The most likely culprit is firewall settings. Something like tcpdump may help to confirm that. Working with a stand-alone example from Rmpi may also help.

Best,

luke

On Tue, 7 Nov 2006, Randall C Johnson [Contr.] wrote:
> Hello everyone,
>
> I've been fiddling around with the snow and Rmpi packages on my new Intel Mac, and have run into a few problems. When I make a cluster on my machine, both slaves start up just fine, and everything works as expected. When I try to make a cluster including another networked machine it hangs.
>
> I've followed the suggestions at
> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/83086.html
> and
> http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html
> but to no avail. Everything seems to start up fine using lamboot, but then hangs when making the cluster in R. Making a cluster with 2 slaves seems to work fine, but if I increase the number (to use the networked machines) it hangs again. I've tried networking to another Mac, and also to a machine running Red Hat Linux. Both machines can set up their own local clusters.
>
> Does anyone have any ideas?
>
> Thanks,
> Randy
>
> sessionInfo()
> R version 2.4.0 Patched (2006-10-03 r39576)
> i386-apple-darwin8.8.2
>
> locale:
> C
>
> attached base packages:
> [1] methods   stats     graphics  grDevices utils     datasets
> [7] base
>
> other attached packages:
>    Rmpi    snow
>   0.5-3   0.2-2

--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  [EMAIL PROTECTED]
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
Re: [R] snow's makeCluster hanging (using Rmpi)
On 11/7/06 11:28 AM, Ramon Diaz-Uriarte [EMAIL PROTECTED] wrote:
> On Tuesday 07 November 2006 15:56, Randall C Johnson [Contr.] wrote:
> > Hello everyone,
> >
> > I've been fiddling around with the snow and Rmpi packages on my new Intel Mac, and have run into a few problems. When I make a cluster on my machine, both slaves start up just fine, and everything works as expected. When I try to make a cluster including another networked machine it hangs.
> >
> > I've followed the suggestions at
> > http://finzi.psych.upenn.edu/R/Rhelp02a/archive/83086.html
> > and
> > http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html
> > but to no avail. Everything seems to start up fine using lamboot, but then hangs when making the cluster in R. Making a cluster with 2 slaves seems to work fine, but if I increase the number (to use the networked machines) it hangs again. I've tried networking to another Mac, and also to a machine running Red Hat Linux. Both machines can set up their own local clusters.
> >
> > Does anyone have any ideas?
>
> Dear Randy,
>
> A few suggestions:
>
> a) make sure there are no firewalls; I assume this is actually the case, but anyway;

I don't think I have any firewalls running. I checked and they all seem to be disabled...

> b) what happens if you lamboot outside R (and create a universe with a local and a networked machine) and then you do: lamexec -np 6 hostname?

This prints out the host names of each machine as expected.

> c) are the Rmpi and snow installed in the same directories in the different machines? are there version differences in Rmpi (or Snow) between machines?

I've installed the same versions, but they are in different directories...

I also tried an example per Luke Tierney's suggestion using only Rmpi, and I get the following error when trying to spawn the Rslaves after starting up with lamboot (outside of R). I tried to use laminfo, but I'm not sure what I'm looking for or how to use the information given...

    library(Rmpi)
    mpi.spawn.Rslaves()

    It seems that [at least] one of the child processes that was
    started by MPI_Comm_spawn* chose a different RPI than the parent
    MPI application. For example, one (of the) child process(es)
    that differed from the parent is shown below:

        Parent application: MPI_Comm_spawn
        Child MPI_COMM_WORLD rank usysv (v7.1.0): 0

    All MPI processes must choose the same RPI module and version
    when they start. Check your SSI settings and/or the local
    environment variables on each node.

    R(26444) malloc: *** Deallocation of a pointer not malloced: 0x16379a0; This could be a double free(), or free() called with the middle of an allocated block; Try setting environment variable MallocHelp to see tools to help debug
    Error in mpi.comm.spawn(slave = system.file("Rslaves.sh", package = "Rmpi"), :
        MPI_Error_string: unclassified

> HTH,
>
> R.
>
> > Thanks,
> > Randy
> >
> > sessionInfo()
> > R version 2.4.0 Patched (2006-10-03 r39576)
> > i386-apple-darwin8.8.2
> >
> > locale:
> > C
> >
> > attached base packages:
> > [1] methods   stats     graphics  grDevices utils     datasets
> > [7] base
> >
> > other attached packages:
> >    Rmpi    snow
> >   0.5-3   0.2-2
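On the question of what to look for in laminfo: the error above complains that parent and child picked different RPI modules, so one thing worth comparing across machines is the list of RPI modules each LAM installation offers. The sketch below assumes that `laminfo` prints one line per module in the form `SSI rpi: <name> (...)`; that layout and the helper name `rpi_modules` are assumptions, so check them against the actual output on your machines.

```shell
#!/bin/sh
# rpi_modules (hypothetical helper): reads `laminfo` output on stdin and
# prints the names of the RPI modules it lists, one per line, sorted.
# Assumes lines of the form "SSI rpi: <name> (...)".
rpi_modules() {
    awk '$1 == "SSI" && $2 == "rpi:" { print $3 }' | sort
}

# Usage, run on every machine and compare the results:
#   laminfo | rpi_modules
#
# If the lists differ, one option (an assumption to verify in the LAM/MPI
# User's Guide) is to pin the same module everywhere before lamboot, e.g.:
#   LAM_MPI_SSI_rpi=tcp; export LAM_MPI_SSI_rpi
```

A module that appears on one machine but not on the other (or with a different version in the full laminfo line) would be consistent with the "chose a different RPI" message.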