[R] random number generator in batch jobs

2007-07-30 Thread Jiqiu Cheng
Dear sir,
   I want to submit R batch jobs (e.g. 5) under the linux cluster by  
the script file do_mul.
The script file do_mul

#!/bin/bash
export var
for var in $(seq 1 5)
do
   qsub -v var do_test
done
exit 0

Through do_mul, 5 do_test script files are submitted to the cluster.
The script file do_test:

#!/bin/bash -l
#PBS -l ncpus=1
#PBS -l walltime=0:05:00
cd $PBS_O_WORKDIR
mkdir test$var
cd test$var
module load R/2.5.0
R --vanilla test
exit 0

The content in R file test is :
rm(list=ls(all=TRUE))
sample(10)

I expect to have different samples each time. However, for these 5  
replications, the first 3 jobs giving me the same samples and the last  
2 are the same. I'm confused because I already used R --vanilla to  
avoid loading same workspace each time and rm(list=ls(all=TRUE)) to  
remove the same random seed each time. Why do same samples still  
happen among 5 replications? Does anybody have some ideas to solve  
this problem? Looking forward to your reply, thanks.

Regards,
Jiqiu

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] random number generator in batch jobs

2007-07-30 Thread Prof Brian Ripley
Have you read the help page?

  Initially, there is no seed;  a new one is created from the
  current time when one is required.  Hence, different sessions will
  give different simulation results, by default.

Thus if you choose to launch processes on different machines at the 
same time you will get the same random number stream.

Running random number streams for parallel computation is a (very) 
specialized topic and you need to be aware of the literature.  I will 
point out packages rsprng and accuracy (function runifS).

On Mon, 30 Jul 2007, Jiqiu Cheng wrote:

 Dear sir,
   I want to submit R batch jobs (e.g. 5) under the linux cluster by
 the script file do_mul.
 The script file do_mul
 
 #!/bin/bash
 export var
 for var in $(seq 1 5)
 do
   qsub -v var do_test
 done
 exit 0
 
 Through do_mul, 5 do_test script files are submitted to the cluster.
 The script file do_test:
 
 #!/bin/bash -l
 #PBS -l ncpus=1
 #PBS -l walltime=0:05:00
 cd $PBS_O_WORKDIR
 mkdir test$var
 cd test$var
 module load R/2.5.0
 R --vanilla test
 exit 0
 
 The content in R file test is :
 rm(list=ls(all=TRUE))
 sample(10)
 
 I expect to have different samples each time. However, for these 5
 replications, the first 3 jobs giving me the same samples and the last
 2 are the same. I'm confused because I already used R --vanilla to
 avoid loading same workspace each time and rm(list=ls(all=TRUE)) to
 remove the same random seed each time. Why do same samples still
 happen among 5 replications? Does anybody have some ideas to solve
 this problem? Looking forward to your reply, thanks.

 Regards,
 Jiqiu

 Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Number Generator of Park and Miller

2007-04-25 Thread David Forrest
On Tue, 24 Apr 2007, gracezhang wrote:


 Hi,

 I failed to search for R package providing random number generator of Park
 and Miller.
 Anyone know any R package supporting this kind of function?

rng.lcg-function(x,p1=16807,p2=0,N=2147483647){(x*p1+p2)%%N}

Dave
-- 
  Dr. David Forrest
  [EMAIL PROTECTED](804)684-7900w
  [EMAIL PROTECTED] (804)642-0662h
http://maplepark.com/~drf5n/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Random Number Generator of Park and Miller

2007-04-24 Thread gracezhang

Hi,

I failed to search for R package providing random number generator of Park
and Miller. 
Anyone know any R package supporting this kind of function?

Thanks,
Grace
-- 
View this message in context: 
http://www.nabble.com/Random-Number-Generator-of-Park-and-Miller-tf3642535.html#a10172789
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Random Number Generator of Park and Miller

2007-04-23 Thread gracezhang

Hi,

I failed to search for R package providing random number generator of Park
and Miller. 
Anyone know any R package supporting this kind of function?

Thanks,
Grace
-- 
View this message in context: 
http://www.nabble.com/Random-Number-Generator-of-Park-and-Miller-tf3636501.html#a10154554
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Random Number Generator of Park and Miller

2007-04-23 Thread gracezhang

Hi,

I failed to search for R package providing random number generator of Park
and Miller. 
Anyone know any R package supporting this kind of function?

Thanks,
Grace
-- 
View this message in context: 
http://www.nabble.com/Random-Number-Generator-of-Park-and-Miller-tf3636502.html#a10154558
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Number Generator of Park and Miller

2007-04-23 Thread Vladimir Eremeev


gracezhang wrote:
 
 Hi,
 
 I failed to search for R package providing random number generator of
 Park and Miller. 
 Anyone know any R package supporting this kind of function?
 

I failed too.
However, here is the source code http://www.firstpr.com.au/dsp/rand31/
which can be either easily rewritten to R or compiled to the shared library,
providing functions, callable from R

-- 
View this message in context: 
http://www.nabble.com/Random-Number-Generator-of-Park-and-Miller-tf3636502.html#a10154560
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Number Generator of Park and Miller

2007-04-23 Thread Vladimir Eremeev

By the way, AFAIK, R uses the Mersenne-Twister random number generator, which
has a much better reputation for producing numbers than any linear
congruential PRNG (the same url, http://www.firstpr.com.au/dsp/rand31/)


gracezhang wrote:
 
 I failed to search for R package providing random number generator of
 Park and Miller. 
 Anyone know any R package supporting this kind of function?
 

-- 
View this message in context: 
http://www.nabble.com/Random-Number-Generator-of-Park-and-Miller-tf3636502.html#a10154562
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random number generator in R compared

2007-04-20 Thread raymond chiruka
I am trying to generate survival data using R .Im trying to randomly  generate 
a  column of 1s and 0 and another column randomly  generated using an 
exponential distribution but l cant seem to get the  random function. how do l 
go about it 
   
  thanks in advance
  
  rt chiruka
  
  
   
-


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random number generator in R compared

2007-04-20 Thread Daniel Nordlund
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 On Behalf Of raymond chiruka
 Sent: Thursday, April 19, 2007 5:23 AM
 To: R-help@stat.math.ethz.ch
 Subject: Re: [R] Random number generator in R compared
 
 I am trying to generate survival data using R .Im trying to randomly  
 generate a
 column of 1s and 0 and another column randomly  generated using an exponential
 distribution but l cant seem to get the  random function. how do l go about it
 
   thanks in advance
 
   rt chiruka
 

To get your random column of 0's and 1's try
 ?sample.

  y - sample(c(0,1), size=100, replace=TRUE)

For your random sample from an exponential distribution try
?rexp

 x - rexp(100)

Hope this is helpful,

Dan

Dan Nordlund
Bothell, WA

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] random number generator: same seed used in different sessions

2005-06-15 Thread Jason Liao
I did several simulation sessions and the result turned out to be a
surprise. After some investigation, I found that different R sessions
of the program used the same seed. Simply, in R210, if I start R and
type rnorm(1), I always get the same random number. This is
contradictary to what is in the R document

 Initially, there is no seed;  a new one is created from the
 current time when one is required.  Hence, different sessions will
 give different simulation results, by default.

I just installed the development version R220. Different sessions of R
do use different seeds as expected.





Jason Liao, http://www.geocities.com/jg_liao
Dept. of Biostatistics, http://www2.umdnj.edu/bmtrxweb
University of Medicine and Dentistry of New Jersey
683 Hoes Lane West, Piscataway‚ NJ 08854
phone 732-235-5429, School of Public Health office
phone 732-235-9824, Cancer Institute of New Jersey office

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] random number generator: same seed used in different sessions

2005-06-15 Thread Prof Brian Ripley
On Wed, 15 Jun 2005, Jason Liao wrote:

 I did several simulation sessions and the result turned out to be a
 surprise. After some investigation, I found that different R sessions
 of the program used the same seed. Simply, in R210, if I start R and
 type rnorm(1), I always get the same random number. This is
 contradictary to what is in the R document

 Initially, there is no seed;  a new one is created from the
 current time when one is required.  Hence, different sessions will
 give different simulation results, by default.

That is not a contradiction!  Did you ensure that this really was 
`initially', so there is no saved workspace, no .Rprofile etc?
See if starting with R --vanilla makes a difference.

 I just installed the development version R220. Different sessions of R
 do use different seeds as expected.

There are no R versions R210 or R220: please do read the posting guide
as we ask:

 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] random number generator: same seed used in different sessions

2005-06-15 Thread Jason Liao
Thank you, Prof. Ripley. Yes, it turned out that different R sessions
(batch mode) loaded the same work space. Inserting
rm(list=ls(all=TRUE))
on top of the code solves the problem.


--- Prof Brian Ripley [EMAIL PROTECTED] wrote:

 On Wed, 15 Jun 2005, Jason Liao wrote:
 
  I did several simulation sessions and the result turned out to be a
  surprise. After some investigation, I found that different R
 sessions
  of the program used the same seed. Simply, in R210, if I start R
 and
  type rnorm(1), I always get the same random number. This is
  contradictary to what is in the R document
 
  Initially, there is no seed;  a new one is created from the
  current time when one is required.  Hence, different sessions
 will
  give different simulation results, by default.
 
 That is not a contradiction!  Did you ensure that this really was 
 `initially', so there is no saved workspace, no .Rprofile etc?
 See if starting with R --vanilla makes a difference.
 
  I just installed the development version R220. Different sessions
 of R
  do use different seeds as expected.
 
 There are no R versions R210 or R220: please do read the posting
 guide
 as we ask:
 
  PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595
 


Jason Liao, http://www.geocities.com/jg_liao
Dept. of Biostatistics, http://www2.umdnj.edu/bmtrxweb
University of Medicine and Dentistry of New Jersey
683 Hoes Lane West, Piscataway‚ NJ 08854
phone 732-235-5429, School of Public Health office
phone 732-235-9824, Cancer Institute of New Jersey office

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] random number generator?

2003-01-29 Thread Bob Wheeler
Well if the base generator is to be changed, it might be a
good idea to consider implementing one of Marsaglia's really
long period generators. His latest, using titanic primes,
has a period of 2^131086. It is extremely fast. Even more
could be done, by replacing the two phase normal generator
by faster generator using a single phase. 

Liaw, Andy wrote:
 
 Might I suggest taking a poll (even though unscientific) of how many people
 will be affected by a change in default RNG?  My totally arbitrary guess is
 very few, if any.
 
 If I'm not mistaken, Python had only recently changed the default RNG to
 Mersenne-Twister.  If Python can do it, I should think R can, too, without
 too much pain...
 
 Just my $0.02...
 
 Andy
 
  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
  Sent: Tuesday, January 28, 2003 3:53 PM
  To: Charles Annis, P.E.
  Cc: [EMAIL PROTECTED]
  Subject: RE: [R] random number generator?
 
 
  Can I suggest
 
  RNGkind(Mersenne-Twister, Inversion)
 
  and especially the use of Inversion where tail behaviour of
  the normal is
  important.
 
  Were it not for concerns about reproducibility we would have
  switched to
  Inversion a while back.
 
  On Tue, 28 Jan 2003, Charles Annis, P.E. wrote:
 
  
   Earlier today I reported finding an unbalanced number of
  observations in
   the p=0.0001 tails of rnorm.
  
   Many thanks to Peter Dalgaard who suggested changing the normal.kind
   generator.
  
   Using  RNGkind(kind = NULL, normal.kind =Box-Muller)
   seems to have provided the remedy.  For example:
  
observed.fraction.below
   [1] 0.000103
observed.fraction.above
   [1] 0.000101
   
  
   Thank you, Peter!
  
  
   Charles Annis, P.E.
  
   [EMAIL PROTECTED]
   phone: 561-352-9699
   eFAX: 503-217-5849
   http://www.StatisticalEngineering.com
  
  
   -Original Message-
   From: [EMAIL PROTECTED]
   [mailto:[EMAIL PROTECTED]] On Behalf Of Peter
  Dalgaard BSA
   Sent: Tuesday, January 28, 2003 2:36 PM
   To: Charles Annis, P.E.
   Cc: [EMAIL PROTECTED]
   Subject: Re: [R] random number generator?
  
   Charles Annis, P.E. [EMAIL PROTECTED] writes:
  
Dear R-Aficionados:
   
I realize that no random number generator is perfect, so
  what I report
below may be a result of that simple fact.  However, if I
  have made an
error in my thinking I would greatly appreciate being corrected.
   
I wish to illustrate the behavior of small samples (n=10) and so
generate 100,000 of them.
   
n.samples - 100
sample.size = 10
p - 0.0001
z.normal - qnorm(p)
# generate n.samples of sample.size each from a
  normal(mean=0, sd=1)
density
#
small.sample - matrix(rnorm(n=sample.size*n.samples,
  mean=0, sd=1),
nrow=n.samples, ncol=sample.size)
# Verify that from the entire small.sample matrix, p
  sampled values
   are
below, p above.
#
observed.fraction.below - sum(small.sample 
z.normal)/length(small.sample)
observed.fraction.above - sum(small.sample 
-z.normal)/length(small.sample)
   
 observed.fraction.below
[1] 6.3e-05
 observed.fraction.above
[1] 0.000142

   
I've checked the behavior of the entire sample's mean and
  median and
they seem fine.  The total fraction in both tails is 0.0002, as it
should be.  However in every instance about 1/3 are in
  the lower tail,
2/3 in the upper.  I also observe the same 1/3:2/3 ratio for one
   million
samples of ten.
   
Is this simply because random number generators aren't
  perfect?  Or
   have
I stepped in something?
   
Thank you for your kind counsel.
  
   You stepped in something, I think, but I probably shouldn't
  elaborate
   on the metaphor ... There's an unfortunate interaction
  between the two
   methods that are used for generating uniform and normal
  variables (the
   latter uses the former). This has been reported a couple of times
   before and typically gives anomalous tail behaviour. Changing one of
   the generators (see help(RNGkind)) usually helps.
  
  
 
  --
  Brian D. Ripley,  [EMAIL PROTECTED]
  Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
  University of Oxford, Tel:  +44 1865 272861 (self)
  1 South Parks Road, +44 1865 272866 (PA)
  Oxford OX1 3TG, UKFax:  +44 1865 272595
 
  __
  [EMAIL PROTECTED] mailing list
  http://www.stat.math.ethz.ch/mailman/listinfo/r-help
 
 
 --
 
 __
 [EMAIL PROTECTED] mailing list
 http://www.stat.math.ethz.ch/mailman/listinfo/r-help

-- 
Bob Wheeler --- http://www.bobwheeler.com/
ECHIP, Inc. --- (302) 239-6620, voice FAX
   724 Yorklyn Rd., Hockessin, DE 19707
  Randomness comes in bunches

RE: [R] random number generator?

2003-01-29 Thread Liaw, Andy
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
 
  AL == Andy Liaw Liaw writes:
 
 AL Might I suggest taking a poll (even though 
 unscientific) of how many people
 AL will be affected by a change in default RNG?  My 
 totally arbitrary guess is
 AL very few, if any.
 
 But they may tend to be important (i.e. those with large stat software
 projects/tools) doing heavy regression testing.  
 

But if those who want to use better RNG in R can now put RNGkind() in their
.Rprofile,  surely those who really need the old generator can do similar,
if the default is changed?  As Prof. Dalgaard said (privately), I don't
think is that painful if the current default generator is kept as an option
in RNGkind().  (I.e., all I'm asking is to change the default RNG, not
getting rid of the current default.)

Cheers,
Andy

 AL If I'm not mistaken, Python had only recently changed 
 the default RNG to
 AL Mersenne-Twister.  If Python can do it, I should 
 think R can, too, without
 AL too much pain...
 
 Depends on which L_p norm you measure pain by.
 
 best,
 -tony
 
 -- 
 A.J. Rossini  Rsrch. Asst. Prof. of 
 Biostatistics
 U. of Washington Biostatistics
 [EMAIL PROTECTED]  
 FHCRC/SCHARP/HIV Vaccine Trials Net   [EMAIL PROTECTED]
 -- http://software.biostat.washington.edu/ 
 
 FHCRC: M: 206-667-7025 (fax=4812)|Voicemail is pretty 
 sketchy/use Email
 UW:   Th: 206-543-1044 (fax=3286)|Change last 4 digits of phone to FAX
 (my tuesday/wednesday/friday locations are completely unpredictable.)
 


--

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



RE: [R] random number generator?

2003-01-29 Thread Bruce D McCullough
With respect to the method used to produce
normals, Professor Ripley noted:


Were it not for concerns about reproducibility we would have switched to 
Inversion a while back.

Presumably this refers to reproducibility across platforms.  

I searched the archive and could find nothing on this point.
Gentle's text on RNGs provided no answer, either.

Might Professor Ripley or someone else briefly explain why the
inversion method is not reproducible but Box-Muller is, or 
perhaps offer a citation?

Thanks,

Bruce

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



Re: [R] random number generator?

2003-01-29 Thread ripley
On Wed, 29 Jan 2003, Paul Gilbert wrote:

 If reworking the RNG mechanism is considered seriously (and I am not advocating
 that), I suggest:
 
 1/ There should be a simple mechanism for keeping track of and resetting all the
 information to generate random numbers, that is, seed, uniform generator, and
 transformations. (I have a package, which I intend to distribute shortly, that
 does this for normal distributions and might form a basis for this mechanism. It
 was previously part of my syskern package in dse, and so the mechanism has been
 fairly well tested over several years.)

That's what RNGkind and set.seed do, and have done for a long time.
The information is also stored in .Random.seed, but few users would record 
that (and it does not exist until the first RNG is used).

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



Re: [R] random number generator?

2003-01-29 Thread Peter Dalgaard BSA
[EMAIL PROTECTED] writes:

 That's what RNGkind and set.seed do, and have done for a long time.
 The information is also stored in .Random.seed, but few users would record 
 that (and it does not exist until the first RNG is used).

.. but if you don't set the seed, you shouldn't expect reproducible
behaviour anyway:

[pd@rasch pd]$ echo rnorm(2) | R --vanilla --slave
[1] 2.0281025 0.8735026

[pd@rasch pd]$ echo rnorm(2) | R --vanilla --slave
[1] -1.2695122 -0.0448524

I mean, if the prototypical answer to people complaining I'm not
getting the same results is simply to ask them to put an 
RNGkind(Marsaglia-Multicarry,Kinderman-Ramage) next to their first
set.seed(), how troublesome could it be?
-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



[R] random number generator?

2003-01-28 Thread Charles Annis, P.E.
Dear R-Aficionados:

I realize that no random number generator is perfect, so what I report
below may be a result of that simple fact.  However, if I have made an
error in my thinking I would greatly appreciate being corrected.

I wish to illustrate the behavior of small samples (n=10) and so
generate 100,000 of them.

n.samples - 100
sample.size = 10
p - 0.0001
z.normal - qnorm(p)
# generate n.samples of sample.size each from a normal(mean=0, sd=1)
density
#
small.sample - matrix(rnorm(n=sample.size*n.samples, mean=0, sd=1),
nrow=n.samples, ncol=sample.size)
# Verify that from the entire small.sample matrix, p sampled values are
below, p above.
#
observed.fraction.below - sum(small.sample 
z.normal)/length(small.sample)
observed.fraction.above - sum(small.sample 
-z.normal)/length(small.sample)

 observed.fraction.below 
[1] 6.3e-05
 observed.fraction.above 
[1] 0.000142
 

I've checked the behavior of the entire sample's mean and median and
they seem fine.  The total fraction in both tails is 0.0002, as it
should be.  However in every instance about 1/3 are in the lower tail,
2/3 in the upper.  I also observe the same 1/3:2/3 ratio for one million
samples of ten.

Is this simply because random number generators aren't perfect?  Or have
I stepped in something?

Thank you for your kind counsel.


Charles Annis, P.E.

[EMAIL PROTECTED]
phone: 561-352-9699
eFAX: 503-217-5849
http://www.StatisticalEngineering.com

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



Re: [R] random number generator?

2003-01-28 Thread Peter Dalgaard BSA
Charles Annis, P.E. [EMAIL PROTECTED] writes:

 Dear R-Aficionados:
 
 I realize that no random number generator is perfect, so what I report
 below may be a result of that simple fact.  However, if I have made an
 error in my thinking I would greatly appreciate being corrected.
 
 I wish to illustrate the behavior of small samples (n=10) and so
 generate 100,000 of them.
 
 n.samples - 100
 sample.size = 10
 p - 0.0001
 z.normal - qnorm(p)
 # generate n.samples of sample.size each from a normal(mean=0, sd=1)
 density
 #
 small.sample - matrix(rnorm(n=sample.size*n.samples, mean=0, sd=1),
 nrow=n.samples, ncol=sample.size)
 # Verify that from the entire small.sample matrix, p sampled values are
 below, p above.
 #
 observed.fraction.below - sum(small.sample 
 z.normal)/length(small.sample)
 observed.fraction.above - sum(small.sample 
 -z.normal)/length(small.sample)
 
  observed.fraction.below 
 [1] 6.3e-05
  observed.fraction.above 
 [1] 0.000142
  
 
 I've checked the behavior of the entire sample's mean and median and
 they seem fine.  The total fraction in both tails is 0.0002, as it
 should be.  However in every instance about 1/3 are in the lower tail,
 2/3 in the upper.  I also observe the same 1/3:2/3 ratio for one million
 samples of ten.
 
 Is this simply because random number generators aren't perfect?  Or have
 I stepped in something?
 
 Thank you for your kind counsel.

You stepped in something, I think, but I probably shouldn't elaborate
on the metaphor ... There's an unfortunate interaction between the two
methods that are used for generating uniform and normal variables (the
latter uses the former). This has been reported a couple of times
before and typically gives anomalous tail behaviour. Changing one of
the generators (see help(RNGkind)) usually helps.

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



Re: [R] random number generator?

2003-01-28 Thread Thomas W Blackwell
Peter  -

Your message is cryptic.  I've just re-read  help(RNGkind)  in
R 1.6.1 (linux Redhat rpm) and it doesn't say anything I can see
about an unfortuante interaction between the two methods that
are used for generating uniform and normal variables.

Are you referring to a thread in R-help that starts with a message
from Robin Hankin, Tuesday Nov 26 2002 ?  On Jonathan Baron's site,
that would be

http://finzi.psych.upenn.edu/R/Rhelp02/archive/9058.html

but even that thread is very vague.  (I don't understand a reference
to PR#1664 in Thomas Lumleys's email (archive/9064.html).)

Maybe it's high time to put a paragraph describing this issue into
the help files for either rnorm() or RNGkind().

-  tom blackwell  -  u michigan medical school  -  ann arbor  -

On 28 Jan 2003, Peter Dalgaard BSA wrote:

 Charles Annis, P.E. [EMAIL PROTECTED] writes:

  Dear R-Aficionados:
 
  I realize that no random number generator is perfect, so what I report
  below may be a result of that simple fact.  However, if I have made an
  error in my thinking I would greatly appreciate being corrected.
 
  I wish to illustrate the behavior of small samples (n=10) and so
  generate 100,000 of them.
 
  n.samples - 100
  sample.size = 10
  p - 0.0001
  z.normal - qnorm(p)
  # generate n.samples of sample.size each from a normal(mean=0, sd=1)
  density
  #
  small.sample - matrix(rnorm(n=sample.size*n.samples, mean=0, sd=1),
  nrow=n.samples, ncol=sample.size)
  # Verify that from the entire small.sample matrix, p sampled values are
  below, p above.
  #
  observed.fraction.below - sum(small.sample 
  z.normal)/length(small.sample)
  observed.fraction.above - sum(small.sample 
  -z.normal)/length(small.sample)
 
   observed.fraction.below
  [1] 6.3e-05
   observed.fraction.above
  [1] 0.000142
  
 
  I've checked the behavior of the entire sample's mean and median and
  they seem fine.  The total fraction in both tails is 0.0002, as it
  should be.  However in every instance about 1/3 are in the lower tail,
  2/3 in the upper.  I also observe the same 1/3:2/3 ratio for one million
  samples of ten.
 
  Is this simply because random number generators aren't perfect?  Or have
  I stepped in something?
 
  Thank you for your kind counsel.

 You stepped in something, I think, but I probably shouldn't elaborate
 on the metaphor ... There's an unfortunate interaction between the two
 methods that are used for generating uniform and normal variables (the
 latter uses the former). This has been reported a couple of times
 before and typically gives anomalous tail behaviour. Changing one of
 the generators (see help(RNGkind)) usually helps.

 --
O__   Peter Dalgaard Blegdamsvej 3
   c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
  (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

 __
 [EMAIL PROTECTED] mailing list
 http://www.stat.math.ethz.ch/mailman/listinfo/r-help


__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



RE: [R] random number generator?

2003-01-28 Thread ripley
Can I suggest

RNGkind(Mersenne-Twister, Inversion)

and especially the use of Inversion where tail behaviour of the normal is
important.

Were it not for concerns about reproducibility we would have switched to 
Inversion a while back.

On Tue, 28 Jan 2003, Charles Annis, P.E. wrote:

 
 Earlier today I reported finding an unbalanced number of observations in
 the p=0.0001 tails of rnorm.
 
 Many thanks to Peter Dalgaard who suggested changing the normal.kind
 generator.  
 
 Using  RNGkind(kind = NULL, normal.kind =Box-Muller)
 seems to have provided the remedy.  For example:
 
  observed.fraction.below 
 [1] 0.000103
  observed.fraction.above 
 [1] 0.000101
   
 
 Thank you, Peter!
 
 
 Charles Annis, P.E.
 
 [EMAIL PROTECTED]
 phone: 561-352-9699
 eFAX: 503-217-5849
 http://www.StatisticalEngineering.com
 
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]] On Behalf Of Peter Dalgaard BSA
 Sent: Tuesday, January 28, 2003 2:36 PM
 To: Charles Annis, P.E.
 Cc: [EMAIL PROTECTED]
 Subject: Re: [R] random number generator?
 
 Charles Annis, P.E. [EMAIL PROTECTED] writes:
 
  Dear R-Aficionados:
  
  I realize that no random number generator is perfect, so what I report
  below may be a result of that simple fact.  However, if I have made an
  error in my thinking I would greatly appreciate being corrected.
  
  I wish to illustrate the behavior of small samples (n=10) and so
  generate 100,000 of them.
  
  n.samples - 100
  sample.size = 10
  p - 0.0001
  z.normal - qnorm(p)
  # generate n.samples of sample.size each from a normal(mean=0, sd=1)
  density
  #
  small.sample - matrix(rnorm(n=sample.size*n.samples, mean=0, sd=1),
  nrow=n.samples, ncol=sample.size)
  # Verify that from the entire small.sample matrix, p sampled values
 are
  below, p above.
  #
  observed.fraction.below - sum(small.sample 
  z.normal)/length(small.sample)
  observed.fraction.above - sum(small.sample 
  -z.normal)/length(small.sample)
  
   observed.fraction.below 
  [1] 6.3e-05
   observed.fraction.above 
  [1] 0.000142
   
  
  I've checked the behavior of the entire sample's mean and median and
  they seem fine.  The total fraction in both tails is 0.0002, as it
  should be.  However in every instance about 1/3 are in the lower tail,
  2/3 in the upper.  I also observe the same 1/3:2/3 ratio for one
 million
  samples of ten.
  
  Is this simply because random number generators aren't perfect?  Or
 have
  I stepped in something?
  
  Thank you for your kind counsel.
 
 You stepped in something, I think, but I probably shouldn't elaborate
 on the metaphor ... There's an unfortunate interaction between the two
 methods that are used for generating uniform and normal variables (the
 latter uses the former). This has been reported a couple of times
 before and typically gives anomalous tail behaviour. Changing one of
 the generators (see help(RNGkind)) usually helps.
 
 

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



RE: [R] random number generator?

2003-01-28 Thread Liaw, Andy
Might I suggest taking a poll (even though unscientific) of how many people
will be affected by a change in default RNG?  My totally arbitrary guess is
very few, if any.

If I'm not mistaken, Python had only recently changed the default RNG to
Mersenne-Twister.  If Python can do it, I should think R can, too, without
too much pain...

Just my $0.02...

Andy

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
 Sent: Tuesday, January 28, 2003 3:53 PM
 To: Charles Annis, P.E.
 Cc: [EMAIL PROTECTED]
 Subject: RE: [R] random number generator?
 
 
 Can I suggest
 
 RNGkind(Mersenne-Twister, Inversion)
 
 and especially the use of Inversion where tail behaviour of 
 the normal is
 important.
 
 Were it not for concerns about reproducibility we would have 
 switched to 
 Inversion a while back.
 
 On Tue, 28 Jan 2003, Charles Annis, P.E. wrote:
 
  
  Earlier today I reported finding an unbalanced number of 
 observations in
  the p=0.0001 tails of rnorm.
  
  Many thanks to Peter Dalgaard who suggested changing the normal.kind
  generator.  
  
  Using  RNGkind(kind = NULL, normal.kind =Box-Muller)
  seems to have provided the remedy.  For example:
  
   observed.fraction.below 
  [1] 0.000103
   observed.fraction.above 
  [1] 0.000101

  
  Thank you, Peter!
  
  
  Charles Annis, P.E.
  
  [EMAIL PROTECTED]
  phone: 561-352-9699
  eFAX: 503-217-5849
  http://www.StatisticalEngineering.com
  
  
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED]] On Behalf Of Peter 
 Dalgaard BSA
  Sent: Tuesday, January 28, 2003 2:36 PM
  To: Charles Annis, P.E.
  Cc: [EMAIL PROTECTED]
  Subject: Re: [R] random number generator?
  
  Charles Annis, P.E. [EMAIL PROTECTED] writes:
  
   Dear R-Aficionados:
   
   I realize that no random number generator is perfect, so 
 what I report
   below may be a result of that simple fact.  However, if I 
 have made an
   error in my thinking I would greatly appreciate being corrected.
   
   I wish to illustrate the behavior of small samples (n=10) and so
   generate 100,000 of them.
   
   n.samples - 100
   sample.size = 10
   p - 0.0001
   z.normal - qnorm(p)
   # generate n.samples of sample.size each from a 
 normal(mean=0, sd=1)
   density
   #
   small.sample - matrix(rnorm(n=sample.size*n.samples, 
 mean=0, sd=1),
   nrow=n.samples, ncol=sample.size)
   # Verify that from the entire small.sample matrix, p 
 sampled values
  are
   below, p above.
   #
   observed.fraction.below - sum(small.sample 
   z.normal)/length(small.sample)
   observed.fraction.above - sum(small.sample 
   -z.normal)/length(small.sample)
   
observed.fraction.below 
   [1] 6.3e-05
observed.fraction.above 
   [1] 0.000142

   
   I've checked the behavior of the entire sample's mean and 
 median and
   they seem fine.  The total fraction in both tails is 0.0002, as it
   should be.  However in every instance about 1/3 are in 
 the lower tail,
   2/3 in the upper.  I also observe the same 1/3:2/3 ratio for one
  million
   samples of ten.
   
   Is this simply because random number generators aren't 
 perfect?  Or
  have
   I stepped in something?
   
   Thank you for your kind counsel.
  
  You stepped in something, I think, but I probably shouldn't 
 elaborate
  on the metaphor ... There's an unfortunate interaction 
 between the two
  methods that are used for generating uniform and normal 
 variables (the
  latter uses the former). This has been reported a couple of times
  before and typically gives anomalous tail behaviour. Changing one of
  the generators (see help(RNGkind)) usually helps.
  
  
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595
 
 __
 [EMAIL PROTECTED] mailing list
 http://www.stat.math.ethz.ch/mailman/listinfo/r-help
 


--

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



[R] Random number generator in R compared to S

2003-01-14 Thread Alain LE TERTRE
I'm doing some simulations for which i need to use both S-plus and R.
I generate in S+ some random normal distributions to define one dataset by
iteration.  I need to use the same dataset generated in S-plus in R.
I was first thinking to generate in R the same dataset by using the same
random number generator with a fixed seed. But It seems that S-plus and R
don't use the same random number generator.
So my question is: Is there a way to pass dynamically dataset generated in
S-plus to R, run a command, extract results, and pass them back to S-plus?
Or is it more efficient to generate all the datasets in S-plus and export
them to R?
Or anything else?

Any hints will be greatly appreciated.
O__  Alain Le Tertre
 c/ /'_ --- Institut de Veille Sanitaire (InVS)
(*) \(*) -- 12 rue du val d'Osne
~~ - 94415 Saint Maurice cedex FRANCE
Voice: 33 1 41 79 67 50 Fax: 33 1 41 79 67 68
email: [EMAIL PROTECTED]


[[alternate HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help