[EMAIL PROTECTED] wrote:
  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

---559023410-851401618-1218751024=:15885
Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE


I didn't describe the problem clearly. It's about the number of distinct=20
values. So just ignore cycle issue.

My tests were:

RNGkind(kind=3D"Knuth-TAOCP");
sum(duplicated(runif(1e7))); #return 46552

RNGkind(kind=3D"Knuth-TAOCP-2002");
sum(duplicated(runif(1e7))); #return 46415

#These collision frequency suggested there were 2^30 distinct values by=20
birthday problem.

The birthday problem distribution applies to independent draws, but they are only pseudo-independent. I think the only ways to know for sure if there are 2^30 values are to look at the code, or run through a complete cycle. And you need to determine the cycle by looking at .Random.seed, not at the returned value.

RNGkind(kind=3D"Marsaglia-Multicarry");
sum(duplicated(runif(1e7))); #return 11682

RNGkind(kind=3D"Super-Duper");
sum(duplicated(runif(1e7))); #return 11542

RNGkind(kind=3D"Mersenne-Twister");
sum(duplicated(runif(1e7))); #return 11656

#These indicated there were 2^32 distinct values, which agrees with the=20
help info.


If there are 2^30 distinct values for the two generators above, that also agrees with the documentation.

RNGkind(kind=3D"Wichmann-Hill");
sum(duplicated(runif(1e7))); #return 0

#So for this method, there should be more than 2^32 distinct values.

You may not get the exact numbers, but they should be close. So how to=20
explain above problem?

You haven't demonstrated what you claim, but if you look at the source, you'll see that in fact the man page is wrong: Wichmann-Hill is based on 3 integer values, which each take on approximately 15 bits of different values. So Wichmann-Hill could take nearly 2^45 different values (actually 30269*30307*30323).

The source is in https://svn.r-project.org/R/trunk/src/main/RNG.c if you want to check the others.
I need generate a large sample without any ties, it seems to me=20
"Wichmann-Hill" is only choice right now.

An alternative would be to construct a new value from two (or more) runif() values, but be careful that you don't mess up the distribution when you do that.

Duncan Murdoch
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
Shengqiao Li

The Department of Statistics
PO Box 6330
West Virginia University
Morgantown, WV 26506-6330
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D

On Thu, 14 Aug 2008, Peter Dalgaard wrote:

Shengqiao Li wrote:
Hello all,
=20
I am generating large samples of random numbers. The RNG help page says:=
=20
"All the supplied uniform generators return 32-bit integer values that a=
re=20
converted to doubles, so they take at most 2^32 distinct values and long=
=20
runs will return duplicated values." But I find that the cycles are not =
the=20
same as the 32-bit integer.
=20
My test indicated that the cycles for Knuth's methods were 2^30 while=20
Wichmann-Hill's cycle was larger than 2^32! No numbers were duplicated i=
n=20
10M numbers generated by runif using Wichmann-Hill. The other three meth=
ods=20
had cycle length of 2^32.
=20
So, anybody can explain this? And any improvement to the implementation =
can=20
be made to increase the cycle length like the Wichmann-Hill method?
=20
What test? These are not simple linear congruential generators. Just beca=
use=20
you get the same value twice, it doesn't mean that the sequence is repeat=
ing.=20
Perhaps you should read the entire help page rather than just the note.

--=20
 O__  ---- Peter Dalgaard             =D8ster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])              FAX: (+45) 35327907


---559023410-851401618-1218751024=:15885--

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to