Re: [R] Could concurrent R sessions mix up variables?

2010-12-11 Thread Tal Galili
Hello Anthony,
Since you are working on Windows, also consider using the "doSMP" package
together with the "foreach" package.
Although it is not fully GPL, as many of us would have preferred (for
example, doSMP is not on CRAN), it is still freely available to download,
source code included, even without "REvolution R".
See my post on how to use them for parallel multicore processing:
http://www.r-statistics.com/2010/04/parallel-multicore-processing-with-r-on-windows/

My previous attempts at using the "snowfall" package with multicore have
failed.  But it could be that I missed something, or that the package
has been updated in the meantime, so I would suggest trying both.
Note that the multicore package is not available for Windows.

Cheers,
Tal




Contact Details:
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Could concurrent R sessions mix up variables?

2010-12-10 Thread Phil Spector

Anthony -
   I would advise you to use the multicore or snowfall packages
to utilize multiple CPUs.  As an example using multicore:


library(multicore)
sim = function(mu) max(replicate(10, max(rnorm(100, mu))))
unlist(mclapply(c(1,5,10,20),sim))

[1]  6.569332 10.268091 15.335847 25.291502

Using snowfall:


library(snowfall)
sim = function(mu) max(replicate(10, max(rnorm(100, mu))))
sfInit(cpus=4,type='SOCK',parallel=TRUE)
sfSapply(c(1,5,10,20),sim)

[1]  6.200161 10.307807 15.271581 25.055950
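
For readers on a newer R installation, the same socket-cluster idea can be sketched with the "parallel" package. This is an assumption on top of the thread: "parallel" is not mentioned in the original messages, and the worker count and seed below are arbitrary illustrative choices.

```r
library(parallel)

# Same toy simulation as above: largest value seen across 10 batches of 100 draws
sim <- function(mu) max(replicate(10, max(rnorm(100, mu))))

cl <- makeCluster(2)              # socket workers, which also work on Windows
clusterSetRNGStream(cl, 123)      # reproducible, independent random streams per worker
res <- unlist(parLapply(cl, c(1, 5, 10, 20), sim))
stopCluster(cl)                   # always release the workers when done

res
```

Each worker is a separate R process with its own workspace, so the value of mu passed to one call cannot leak into another.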

Hope this helps.

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu








Re: [R] Could concurrent R sessions mix up variables?

2010-12-10 Thread Duncan Murdoch




If you are running something that takes 18 hours to complete, a common 
practice is to save intermediate results to disk occasionally.  Have you 
(or whoever wrote the simulation) done this and forgotten about it?  If 
all 4 processes are saving to the same place, then reading results back, 
you'd see something like you describe.


If all calculations are held in memory, you shouldn't.
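
As a rough sketch of how to rule this out, each session could write its checkpoints to its own file and read them back into a separate environment. The helper name checkpoint_file, the file naming scheme, and the state object are hypothetical and not taken from the actual simulation.

```r
# Hypothetical helper: build a checkpoint path that is unique per session,
# using both the value of X and the process id of the R session.
checkpoint_file <- function(X, dir = tempdir()) {
  file.path(dir, sprintf("sim_checkpoint_X%s_pid%d.RData", X, Sys.getpid()))
}

X <- 5
state <- list(iteration = 100)    # stand-in for intermediate results

f <- checkpoint_file(X)
save(X, state, file = f)

# load() into the global workspace would silently overwrite X with whatever
# the file contains; loading into a fresh environment keeps the two apart.
chk <- new.env()
load(f, envir = chk)
stopifnot(identical(chk$X, X))
```

If four sessions instead shared one file name and called load() directly, the last session to save would decide the value of X in all of them.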

A simple approach that might help debug this is to create a new variable,
initX, set equal to X.  Then sprinkle statements


stopifnot(X == initX)

through your simulation code.  That should quit when the change happens, 
and you can try to figure out why it happened.
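
A minimal, self-contained illustration of that idea follows. The assignment X <- 20 is only a stand-in for whatever unknown statement changes X in the real simulation, and tryCatch is used here just so the example runs to the end instead of halting.

```r
X <- 10
initX <- X                        # snapshot taken once, right after X is set

# ... first chunk of simulation code ...
stopifnot(X == initX)             # passes: X is untouched so far

X <- 20                           # stand-in for the unexpected change

# ... next chunk of simulation code ...
caught <- tryCatch(
  {
    stopifnot(X == initX)         # fails here, close to where X changed
    "X unchanged"
  },
  error = function(e) conditionMessage(e)
)
print(caught)
```

In the real script the stopifnot() calls would be left unwrapped, so the run stops at the first check after the change and the surrounding code can be inspected.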


Duncan Murdoch



[R] Could concurrent R sessions mix up variables?

2010-12-10 Thread Anthony Damico
Hi, I'm working in R 2.11.1 x64 on Windows x86_64-pc-mingw32.

I'm experiencing a strange problem in R that I'm not even sure how to
begin to fix.

I've got a huge (forty pages when printed) simulation written in R that I'd
like to run multiple times.  When I open up R and run it on its own,
it works fine.  At the beginning of the program, there's a variable X
that I set to 1, 5, 10, 20, depending on how sensitive I want the
simulation to be to a certain parameter.  When I just run one instance
of R, the X variable stays the same throughout the program.

I have a quad-core machine, so I'd like to take advantage of all four
processors.

If I open up four sessions and set X to 1, 5, 10, and 20 in those
different sessions, then run all four simulations all the way through
(about eighteen hours of processing time) at the same time, the
variable X ends up being 20 at the end of all four sessions.  It's as
if R mixed up the variable setting between the four concurrent
sessions.  I can't figure out why else my variable X would ever get
changed to 20 in the three simulations that I set it to 1, 5, and 10,
respectively (it doesn't get updated anywhere during the simulation).

When I have all four of these simulations running concurrently, I am
absolutely maxing out my computer.  All four processors are at 100%,
and my Windows Task Manager says I'm using almost 100% of my 16 GB of
RAM.  Is it possible that intense resource use would cause a variable
conflict like this?  I have no idea where to start troubleshooting
this error, so any advice would be appreciated.

Thanks!

Anthony Damico
Kaiser Family Foundation
