[R] random number generator in batch jobs
Dear sir, I want to submit R batch jobs (e.g. 5) under the linux cluster by the script file do_mul. The script file do_mul #!/bin/bash export var for var in $(seq 1 5) do qsub -v var do_test done exit 0 Through do_mul, 5 do_test script files are submitted to the cluster. The script file do_test: #!/bin/bash -l #PBS -l ncpus=1 #PBS -l walltime=0:05:00 cd $PBS_O_WORKDIR mkdir test$var cd test$var module load R/2.5.0 R --vanilla test exit 0 The content in R file test is : rm(list=ls(all=TRUE)) sample(10) I expect to have different samples each time. However, for these 5 replications, the first 3 jobs giving me the same samples and the last 2 are the same. I'm confused because I already used R --vanilla to avoid loading same workspace each time and rm(list=ls(all=TRUE)) to remove the same random seed each time. Why do same samples still happen among 5 replications? Does anybody have some ideas to solve this problem? Looking forward to your reply, thanks. Regards, Jiqiu Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] random number generator in batch jobs
Have you read the help page? Initially, there is no seed; a new one is created from the current time when one is required. Hence, different sessions will give different simulation results, by default. Thus if you choose to launch processes on different machines at the same time you will get the same random number stream. Running random number streams for parallel computation is a (very) specialized topic and you need to be aware of the literature. I will point out packages rsprng and accuracy (function runifS). On Mon, 30 Jul 2007, Jiqiu Cheng wrote: Dear sir, I want to submit R batch jobs (e.g. 5) under the linux cluster by the script file do_mul. The script file do_mul #!/bin/bash export var for var in $(seq 1 5) do qsub -v var do_test done exit 0 Through do_mul, 5 do_test script files are submitted to the cluster. The script file do_test: #!/bin/bash -l #PBS -l ncpus=1 #PBS -l walltime=0:05:00 cd $PBS_O_WORKDIR mkdir test$var cd test$var module load R/2.5.0 R --vanilla test exit 0 The content in R file test is : rm(list=ls(all=TRUE)) sample(10) I expect to have different samples each time. However, for these 5 replications, the first 3 jobs giving me the same samples and the last 2 are the same. I'm confused because I already used R --vanilla to avoid loading same workspace each time and rm(list=ls(all=TRUE)) to remove the same random seed each time. Why do same samples still happen among 5 replications? Does anybody have some ideas to solve this problem? Looking forward to your reply, thanks. Regards, Jiqiu Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Random Number Generator of Park and Miller
On Tue, 24 Apr 2007, gracezhang wrote: Hi, I failed to search for R package providing random number generator of Park and Miller. Anyone know any R package supporting this kind of function? rng.lcg-function(x,p1=16807,p2=0,N=2147483647){(x*p1+p2)%%N} Dave -- Dr. David Forrest [EMAIL PROTECTED](804)684-7900w [EMAIL PROTECTED] (804)642-0662h http://maplepark.com/~drf5n/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Random Number Generator of Park and Miller
Hi, I failed to search for R package providing random number generator of Park and Miller. Anyone know any R package supporting this kind of function? Thanks, Grace -- View this message in context: http://www.nabble.com/Random-Number-Generator-of-Park-and-Miller-tf3642535.html#a10172789 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Random Number Generator of Park and Miller
Hi, I failed to search for R package providing random number generator of Park and Miller. Anyone know any R package supporting this kind of function? Thanks, Grace -- View this message in context: http://www.nabble.com/Random-Number-Generator-of-Park-and-Miller-tf3636501.html#a10154554 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Random Number Generator of Park and Miller
Hi, I failed to search for R package providing random number generator of Park and Miller. Anyone know any R package supporting this kind of function? Thanks, Grace -- View this message in context: http://www.nabble.com/Random-Number-Generator-of-Park-and-Miller-tf3636502.html#a10154558 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Random Number Generator of Park and Miller
gracezhang wrote: Hi, I failed to search for R package providing random number generator of Park and Miller. Anyone know any R package supporting this kind of function? I failed too. However, here is the source code http://www.firstpr.com.au/dsp/rand31/ which can be either easily rewritten to R or compiled to the shared library, providing functions, callable from R -- View this message in context: http://www.nabble.com/Random-Number-Generator-of-Park-and-Miller-tf3636502.html#a10154560 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Random Number Generator of Park and Miller
By the way, AFAIK, R uses the Mersenne-Twister random number generator, which has a much better reputation for producing numbers than any linear congruential PRNG (the same url, http://www.firstpr.com.au/dsp/rand31/) gracezhang wrote: I failed to search for R package providing random number generator of Park and Miller. Anyone know any R package supporting this kind of function? -- View this message in context: http://www.nabble.com/Random-Number-Generator-of-Park-and-Miller-tf3636502.html#a10154562 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Random number generator in R compared
I am trying to generate survival data using R .Im trying to randomly generate a column of 1s and 0 and another column randomly generated using an exponential distribution but l cant seem to get the random function. how do l go about it thanks in advance rt chiruka - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Random number generator in R compared
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of raymond chiruka Sent: Thursday, April 19, 2007 5:23 AM To: R-help@stat.math.ethz.ch Subject: Re: [R] Random number generator in R compared I am trying to generate survival data using R .Im trying to randomly generate a column of 1s and 0 and another column randomly generated using an exponential distribution but l cant seem to get the random function. how do l go about it thanks in advance rt chiruka To get your random column of 0's and 1's try ?sample. y - sample(c(0,1), size=100, replace=TRUE) For your random sample from an exponential distribution try ?rexp x - rexp(100) Hope this is helpful, Dan Dan Nordlund Bothell, WA __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] random number generator: same seed used in different sessions
I did several simulation sessions and the result turned out to be a surprise. After some investigation, I found that different R sessions of the program used the same seed. Simply, in R210, if I start R and type rnorm(1), I always get the same random number. This is contradictary to what is in the R document Initially, there is no seed; a new one is created from the current time when one is required. Hence, different sessions will give different simulation results, by default. I just installed the development version R220. Different sessions of R do use different seeds as expected. Jason Liao, http://www.geocities.com/jg_liao Dept. of Biostatistics, http://www2.umdnj.edu/bmtrxweb University of Medicine and Dentistry of New Jersey 683 Hoes Lane West, Piscataway NJ 08854 phone 732-235-5429, School of Public Health office phone 732-235-9824, Cancer Institute of New Jersey office __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] random number generator: same seed used in different sessions
On Wed, 15 Jun 2005, Jason Liao wrote: I did several simulation sessions and the result turned out to be a surprise. After some investigation, I found that different R sessions of the program used the same seed. Simply, in R210, if I start R and type rnorm(1), I always get the same random number. This is contradictary to what is in the R document Initially, there is no seed; a new one is created from the current time when one is required. Hence, different sessions will give different simulation results, by default. That is not a contradiction! Did you ensure that this really was `initially', so there is no saved workspace, no .Rprofile etc? See if starting with R --vanilla makes a difference. I just installed the development version R220. Different sessions of R do use different seeds as expected. There are no R versions R210 or R220: please do read the posting guide as we ask: PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] random number generator: same seed used in different sessions
Thank you, Prof. Ripley. Yes, it turned out that different R sessions (batch mode) loaded the same work space. Inserting rm(list=ls(all=TRUE)) on top of the code solves the problem. --- Prof Brian Ripley [EMAIL PROTECTED] wrote: On Wed, 15 Jun 2005, Jason Liao wrote: I did several simulation sessions and the result turned out to be a surprise. After some investigation, I found that different R sessions of the program used the same seed. Simply, in R210, if I start R and type rnorm(1), I always get the same random number. This is contradictary to what is in the R document Initially, there is no seed; a new one is created from the current time when one is required. Hence, different sessions will give different simulation results, by default. That is not a contradiction! Did you ensure that this really was `initially', so there is no saved workspace, no .Rprofile etc? See if starting with R --vanilla makes a difference. I just installed the development version R220. Different sessions of R do use different seeds as expected. There are no R versions R210 or R220: please do read the posting guide as we ask: PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 Jason Liao, http://www.geocities.com/jg_liao Dept. of Biostatistics, http://www2.umdnj.edu/bmtrxweb University of Medicine and Dentistry of New Jersey 683 Hoes Lane West, Piscataway NJ 08854 phone 732-235-5429, School of Public Health office phone 732-235-9824, Cancer Institute of New Jersey office __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] random number generator?
Well if the base generator is to be changed, it might be a good idea to consider implementing one of Marsaglia's really long period generators. His latest, using titanic primes, has a period of 2^131086. It is extremely fast. Even more could be done, by replacing the two phase normal generator by faster generator using a single phase. Liaw, Andy wrote: Might I suggest taking a poll (even though unscientific) of how many people will be affected by a change in default RNG? My totally arbitrary guess is very few, if any. If I'm not mistaken, Python had only recently changed the default RNG to Mersenne-Twister. If Python can do it, I should think R can, too, without too much pain... Just my $0.02... Andy -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Tuesday, January 28, 2003 3:53 PM To: Charles Annis, P.E. Cc: [EMAIL PROTECTED] Subject: RE: [R] random number generator? Can I suggest RNGkind(Mersenne-Twister, Inversion) and especially the use of Inversion where tail behaviour of the normal is important. Were it not for concerns about reproducibility we would have switched to Inversion a while back. On Tue, 28 Jan 2003, Charles Annis, P.E. wrote: Earlier today I reported finding an unbalanced number of observations in the p=0.0001 tails of rnorm. Many thanks to Peter Dalgaard who suggested changing the normal.kind generator. Using RNGkind(kind = NULL, normal.kind =Box-Muller) seems to have provided the remedy. For example: observed.fraction.below [1] 0.000103 observed.fraction.above [1] 0.000101 Thank you, Peter! Charles Annis, P.E. [EMAIL PROTECTED] phone: 561-352-9699 eFAX: 503-217-5849 http://www.StatisticalEngineering.com -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Peter Dalgaard BSA Sent: Tuesday, January 28, 2003 2:36 PM To: Charles Annis, P.E. Cc: [EMAIL PROTECTED] Subject: Re: [R] random number generator? Charles Annis, P.E. [EMAIL PROTECTED] writes: Dear R-Aficionados: I realize that no random number generator is perfect, so what I report below may be a result of that simple fact. However, if I have made an error in my thinking I would greatly appreciate being corrected. I wish to illustrate the behavior of small samples (n=10) and so generate 100,000 of them. n.samples - 100 sample.size = 10 p - 0.0001 z.normal - qnorm(p) # generate n.samples of sample.size each from a normal(mean=0, sd=1) density # small.sample - matrix(rnorm(n=sample.size*n.samples, mean=0, sd=1), nrow=n.samples, ncol=sample.size) # Verify that from the entire small.sample matrix, p sampled values are below, p above. # observed.fraction.below - sum(small.sample z.normal)/length(small.sample) observed.fraction.above - sum(small.sample -z.normal)/length(small.sample) observed.fraction.below [1] 6.3e-05 observed.fraction.above [1] 0.000142 I've checked the behavior of the entire sample's mean and median and they seem fine. The total fraction in both tails is 0.0002, as it should be. However in every instance about 1/3 are in the lower tail, 2/3 in the upper. I also observe the same 1/3:2/3 ratio for one million samples of ten. Is this simply because random number generators aren't perfect? Or have I stepped in something? Thank you for your kind counsel. You stepped in something, I think, but I probably shouldn't elaborate on the metaphor ... There's an unfortunate interaction between the two methods that are used for generating uniform and normal variables (the latter uses the former). This has been reported a couple of times before and typically gives anomalous tail behaviour. Changing one of the generators (see help(RNGkind)) usually helps. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help -- __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help -- Bob Wheeler --- http://www.bobwheeler.com/ ECHIP, Inc. --- (302) 239-6620, voice FAX 724 Yorklyn Rd., Hockessin, DE 19707 Randomness comes in bunches
RE: [R] random number generator?
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] AL == Andy Liaw Liaw writes: AL Might I suggest taking a poll (even though unscientific) of how many people AL will be affected by a change in default RNG? My totally arbitrary guess is AL very few, if any. But they may tend to be important (i.e. those with large stat software projects/tools) doing heavy regression testing. But if those who want to use better RNG in R can now put RNGkind() in their .Rprofile, surely those who really need the old generator can do similar, if the default is changed? As Prof. Dalgaard said (privately), I don't think is that painful if the current default generator is kept as an option in RNGkind(). (I.e., all I'm asking is to change the default RNG, not getting rid of the current default.) Cheers, Andy AL If I'm not mistaken, Python had only recently changed the default RNG to AL Mersenne-Twister. If Python can do it, I should think R can, too, without AL too much pain... Depends on which L_p norm you measure pain by. best, -tony -- A.J. Rossini Rsrch. Asst. Prof. of Biostatistics U. of Washington Biostatistics [EMAIL PROTECTED] FHCRC/SCHARP/HIV Vaccine Trials Net [EMAIL PROTECTED] -- http://software.biostat.washington.edu/ FHCRC: M: 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email UW: Th: 206-543-1044 (fax=3286)|Change last 4 digits of phone to FAX (my tuesday/wednesday/friday locations are completely unpredictable.) -- __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] random number generator?
With respect to the method used to produce normals, Professor Ripley noted: Were it not for concerns about reproducibility we would have switched to Inversion a while back. Presumably this refers to reproducibility across platforms. I searched the archive and could find nothing on this point. Gentle's text on RNGs provided no answer, either. Might Professor Ripley or someone else briefly explain why the inversion method is not reproducible but Box-Muller is, or perhaps offer a citation? Thanks, Bruce __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] random number generator?
On Wed, 29 Jan 2003, Paul Gilbert wrote: If reworking the RNG mechanism is considered seriously (and I am not advocating that), I suggest: 1/ There should be a simple mechanism for keeping track of and resetting all the information to generate random numbers, that is, seed, uniform generator, and transformations. (I have a package, which I intend to distribute shortly, that does this for normal distributions and might form a basis for this mechanism. It was previously part of my syskern package in dse, and so the mechanism has been fairly well tested over several years.) That's what RNGkind and set.seed do, and have done for a long time. The information is also stored in .Random.seed, but few users would record that (and it does not exist until the first RNG is used). -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] random number generator?
[EMAIL PROTECTED] writes: That's what RNGkind and set.seed do, and have done for a long time. The information is also stored in .Random.seed, but few users would record that (and it does not exist until the first RNG is used). .. but if you don't set the seed, you shouldn't expect reproducible behaviour anyway: [pd@rasch pd]$ echo rnorm(2) | R --vanilla --slave [1] 2.0281025 0.8735026 [pd@rasch pd]$ echo rnorm(2) | R --vanilla --slave [1] -1.2695122 -0.0448524 I mean, if the prototypical answer to people complaining I'm not getting the same results is simply to ask them to put an RNGkind(Marsaglia-Multicarry,Kinderman-Ramage) next to their first set.seed(), how troublesome could it be? -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] random number generator?
Dear R-Aficionados: I realize that no random number generator is perfect, so what I report below may be a result of that simple fact. However, if I have made an error in my thinking I would greatly appreciate being corrected. I wish to illustrate the behavior of small samples (n=10) and so generate 100,000 of them. n.samples - 100 sample.size = 10 p - 0.0001 z.normal - qnorm(p) # generate n.samples of sample.size each from a normal(mean=0, sd=1) density # small.sample - matrix(rnorm(n=sample.size*n.samples, mean=0, sd=1), nrow=n.samples, ncol=sample.size) # Verify that from the entire small.sample matrix, p sampled values are below, p above. # observed.fraction.below - sum(small.sample z.normal)/length(small.sample) observed.fraction.above - sum(small.sample -z.normal)/length(small.sample) observed.fraction.below [1] 6.3e-05 observed.fraction.above [1] 0.000142 I've checked the behavior of the entire sample's mean and median and they seem fine. The total fraction in both tails is 0.0002, as it should be. However in every instance about 1/3 are in the lower tail, 2/3 in the upper. I also observe the same 1/3:2/3 ratio for one million samples of ten. Is this simply because random number generators aren't perfect? Or have I stepped in something? Thank you for your kind counsel. Charles Annis, P.E. [EMAIL PROTECTED] phone: 561-352-9699 eFAX: 503-217-5849 http://www.StatisticalEngineering.com __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] random number generator?
Charles Annis, P.E. [EMAIL PROTECTED] writes: Dear R-Aficionados: I realize that no random number generator is perfect, so what I report below may be a result of that simple fact. However, if I have made an error in my thinking I would greatly appreciate being corrected. I wish to illustrate the behavior of small samples (n=10) and so generate 100,000 of them. n.samples - 100 sample.size = 10 p - 0.0001 z.normal - qnorm(p) # generate n.samples of sample.size each from a normal(mean=0, sd=1) density # small.sample - matrix(rnorm(n=sample.size*n.samples, mean=0, sd=1), nrow=n.samples, ncol=sample.size) # Verify that from the entire small.sample matrix, p sampled values are below, p above. # observed.fraction.below - sum(small.sample z.normal)/length(small.sample) observed.fraction.above - sum(small.sample -z.normal)/length(small.sample) observed.fraction.below [1] 6.3e-05 observed.fraction.above [1] 0.000142 I've checked the behavior of the entire sample's mean and median and they seem fine. The total fraction in both tails is 0.0002, as it should be. However in every instance about 1/3 are in the lower tail, 2/3 in the upper. I also observe the same 1/3:2/3 ratio for one million samples of ten. Is this simply because random number generators aren't perfect? Or have I stepped in something? Thank you for your kind counsel. You stepped in something, I think, but I probably shouldn't elaborate on the metaphor ... There's an unfortunate interaction between the two methods that are used for generating uniform and normal variables (the latter uses the former). This has been reported a couple of times before and typically gives anomalous tail behaviour. Changing one of the generators (see help(RNGkind)) usually helps. -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] random number generator?
Peter - Your message is cryptic. I've just re-read help(RNGkind) in R 1.6.1 (linux Redhat rpm) and it doesn't say anything I can see about an unfortuante interaction between the two methods that are used for generating uniform and normal variables. Are you referring to a thread in R-help that starts with a message from Robin Hankin, Tuesday Nov 26 2002 ? On Jonathan Baron's site, that would be http://finzi.psych.upenn.edu/R/Rhelp02/archive/9058.html but even that thread is very vague. (I don't understand a reference to PR#1664 in Thomas Lumleys's email (archive/9064.html).) Maybe it's high time to put a paragraph describing this issue into the help files for either rnorm() or RNGkind(). - tom blackwell - u michigan medical school - ann arbor - On 28 Jan 2003, Peter Dalgaard BSA wrote: Charles Annis, P.E. [EMAIL PROTECTED] writes: Dear R-Aficionados: I realize that no random number generator is perfect, so what I report below may be a result of that simple fact. However, if I have made an error in my thinking I would greatly appreciate being corrected. I wish to illustrate the behavior of small samples (n=10) and so generate 100,000 of them. n.samples - 100 sample.size = 10 p - 0.0001 z.normal - qnorm(p) # generate n.samples of sample.size each from a normal(mean=0, sd=1) density # small.sample - matrix(rnorm(n=sample.size*n.samples, mean=0, sd=1), nrow=n.samples, ncol=sample.size) # Verify that from the entire small.sample matrix, p sampled values are below, p above. # observed.fraction.below - sum(small.sample z.normal)/length(small.sample) observed.fraction.above - sum(small.sample -z.normal)/length(small.sample) observed.fraction.below [1] 6.3e-05 observed.fraction.above [1] 0.000142 I've checked the behavior of the entire sample's mean and median and they seem fine. The total fraction in both tails is 0.0002, as it should be. However in every instance about 1/3 are in the lower tail, 2/3 in the upper. I also observe the same 1/3:2/3 ratio for one million samples of ten. Is this simply because random number generators aren't perfect? Or have I stepped in something? Thank you for your kind counsel. You stepped in something, I think, but I probably shouldn't elaborate on the metaphor ... There's an unfortunate interaction between the two methods that are used for generating uniform and normal variables (the latter uses the former). This has been reported a couple of times before and typically gives anomalous tail behaviour. Changing one of the generators (see help(RNGkind)) usually helps. -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] random number generator?
Can I suggest RNGkind(Mersenne-Twister, Inversion) and especially the use of Inversion where tail behaviour of the normal is important. Were it not for concerns about reproducibility we would have switched to Inversion a while back. On Tue, 28 Jan 2003, Charles Annis, P.E. wrote: Earlier today I reported finding an unbalanced number of observations in the p=0.0001 tails of rnorm. Many thanks to Peter Dalgaard who suggested changing the normal.kind generator. Using RNGkind(kind = NULL, normal.kind =Box-Muller) seems to have provided the remedy. For example: observed.fraction.below [1] 0.000103 observed.fraction.above [1] 0.000101 Thank you, Peter! Charles Annis, P.E. [EMAIL PROTECTED] phone: 561-352-9699 eFAX: 503-217-5849 http://www.StatisticalEngineering.com -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Peter Dalgaard BSA Sent: Tuesday, January 28, 2003 2:36 PM To: Charles Annis, P.E. Cc: [EMAIL PROTECTED] Subject: Re: [R] random number generator? Charles Annis, P.E. [EMAIL PROTECTED] writes: Dear R-Aficionados: I realize that no random number generator is perfect, so what I report below may be a result of that simple fact. However, if I have made an error in my thinking I would greatly appreciate being corrected. I wish to illustrate the behavior of small samples (n=10) and so generate 100,000 of them. n.samples - 100 sample.size = 10 p - 0.0001 z.normal - qnorm(p) # generate n.samples of sample.size each from a normal(mean=0, sd=1) density # small.sample - matrix(rnorm(n=sample.size*n.samples, mean=0, sd=1), nrow=n.samples, ncol=sample.size) # Verify that from the entire small.sample matrix, p sampled values are below, p above. # observed.fraction.below - sum(small.sample z.normal)/length(small.sample) observed.fraction.above - sum(small.sample -z.normal)/length(small.sample) observed.fraction.below [1] 6.3e-05 observed.fraction.above [1] 0.000142 I've checked the behavior of the entire sample's mean and median and they seem fine. The total fraction in both tails is 0.0002, as it should be. However in every instance about 1/3 are in the lower tail, 2/3 in the upper. I also observe the same 1/3:2/3 ratio for one million samples of ten. Is this simply because random number generators aren't perfect? Or have I stepped in something? Thank you for your kind counsel. You stepped in something, I think, but I probably shouldn't elaborate on the metaphor ... There's an unfortunate interaction between the two methods that are used for generating uniform and normal variables (the latter uses the former). This has been reported a couple of times before and typically gives anomalous tail behaviour. Changing one of the generators (see help(RNGkind)) usually helps. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] random number generator?
Might I suggest taking a poll (even though unscientific) of how many people will be affected by a change in default RNG? My totally arbitrary guess is very few, if any. If I'm not mistaken, Python had only recently changed the default RNG to Mersenne-Twister. If Python can do it, I should think R can, too, without too much pain... Just my $0.02... Andy -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Tuesday, January 28, 2003 3:53 PM To: Charles Annis, P.E. Cc: [EMAIL PROTECTED] Subject: RE: [R] random number generator? Can I suggest RNGkind(Mersenne-Twister, Inversion) and especially the use of Inversion where tail behaviour of the normal is important. Were it not for concerns about reproducibility we would have switched to Inversion a while back. On Tue, 28 Jan 2003, Charles Annis, P.E. wrote: Earlier today I reported finding an unbalanced number of observations in the p=0.0001 tails of rnorm. Many thanks to Peter Dalgaard who suggested changing the normal.kind generator. Using RNGkind(kind = NULL, normal.kind =Box-Muller) seems to have provided the remedy. For example: observed.fraction.below [1] 0.000103 observed.fraction.above [1] 0.000101 Thank you, Peter! Charles Annis, P.E. [EMAIL PROTECTED] phone: 561-352-9699 eFAX: 503-217-5849 http://www.StatisticalEngineering.com -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Peter Dalgaard BSA Sent: Tuesday, January 28, 2003 2:36 PM To: Charles Annis, P.E. Cc: [EMAIL PROTECTED] Subject: Re: [R] random number generator? Charles Annis, P.E. [EMAIL PROTECTED] writes: Dear R-Aficionados: I realize that no random number generator is perfect, so what I report below may be a result of that simple fact. However, if I have made an error in my thinking I would greatly appreciate being corrected. I wish to illustrate the behavior of small samples (n=10) and so generate 100,000 of them. n.samples - 100 sample.size = 10 p - 0.0001 z.normal - qnorm(p) # generate n.samples of sample.size each from a normal(mean=0, sd=1) density # small.sample - matrix(rnorm(n=sample.size*n.samples, mean=0, sd=1), nrow=n.samples, ncol=sample.size) # Verify that from the entire small.sample matrix, p sampled values are below, p above. # observed.fraction.below - sum(small.sample z.normal)/length(small.sample) observed.fraction.above - sum(small.sample -z.normal)/length(small.sample) observed.fraction.below [1] 6.3e-05 observed.fraction.above [1] 0.000142 I've checked the behavior of the entire sample's mean and median and they seem fine. The total fraction in both tails is 0.0002, as it should be. However in every instance about 1/3 are in the lower tail, 2/3 in the upper. I also observe the same 1/3:2/3 ratio for one million samples of ten. Is this simply because random number generators aren't perfect? Or have I stepped in something? Thank you for your kind counsel. You stepped in something, I think, but I probably shouldn't elaborate on the metaphor ... There's an unfortunate interaction between the two methods that are used for generating uniform and normal variables (the latter uses the former). This has been reported a couple of times before and typically gives anomalous tail behaviour. Changing one of the generators (see help(RNGkind)) usually helps. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help -- __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] Random number generator in R compared to S
I'm doing some simulations for which i need to use both S-plus and R. I generate in S+ some random normal distributions to define one dataset by iteration. I need to use the same dataset generated in S-plus in R. I was first thinking to generate in R the same dataset by using the same random number generator with a fixed seed. But It seems that S-plus and R don't use the same random number generator. So my question is: Is there a way to pass dynamically dataset generated in S-plus to R, run a command, extract results, and pass them back to S-plus? Or is it more efficient to generate all the datasets in S-plus and export them to R? Or anything else? Any hints will be greatly appreciated. O__ Alain Le Tertre c/ /'_ --- Institut de Veille Sanitaire (InVS) (*) \(*) -- 12 rue du val d'Osne ~~ - 94415 Saint Maurice cedex FRANCE Voice: 33 1 41 79 67 50 Fax: 33 1 41 79 67 68 email: [EMAIL PROTECTED] [[alternate HTML version deleted]] __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help