Hi Duncan,

Thanks for that. In the light of what you've suggested, I'm now using the following:

# generate a random integer from 0 to t (inclusive)
if (t < 10000000) { # to avoid memory problems...
  # sample(n, 1) draws from 1:n, so shift down by one to cover 0..t
  M <- sample(t + 1, 1) - 1
} else {
  # rejection loop: redraw if rounding ever pushes the result up to t + 1
  repeat {
    M <- as.integer(runif(1, min = 0, max = t + 1 - .Machine$double.eps))
    if (M <= t) break
  }
}

Cheers and thanks,
Sean

On 18/09/06, Duncan Murdoch <[EMAIL PROTECTED]> wrote:
> On 9/18/2006 3:37 AM, Sean O'Riordain wrote:
> > Good morning,
> >
> > I'm trying to concisely generate a single integer from 0 to n
> > inclusive, where n might be of the order of hundreds of millions.
> > This will however be used many times during the general procedure, so
> > it must be "reasonably efficient" in both memory and time... (at some
> > later stage in the development I hope to go vectorized)
> >
> > The examples I've found through searching RSiteSearch() relating to
> > generating random integers say to use : sample(0:n, 1)
> > However, when n is "large" this first generates a large sequence 0:n
> > before taking a sample of one... this computer doesn't have the memory
> > for that!
>
> You don't need to give the whole vector: just give n, and you'll get
> draws from 1:n. The man page is clear on this.
>
> So what you want is sample(n+1, 1) - 1. (Use "replace=TRUE" if you want
> a sample bigger than 1, or you'll get sampling without replacement.)
> >
> > When I look at the documentation for runif(n, min, max) it states that
> > the generated numbers will be min <= x <= max. Note the "<= max"...
>
> Actually it says that's the range for the uniform density. It's silent
> on the range of the output. But it's good defensive programming to
> assume that it's possible to get the endpoints.
>
> >
> > How do I generate an x such that the probability of being (the
> > integer) max is the same as any other integer from min (an integer) to
> > max-1 (an integer) inclusive... My attempt is:
> >
> > urand.int <- function(n,t) {
> >   as.integer(runif(n,min=0, max=t+1-.Machine$double.eps))
> > }
> > # where I've included the parameter n to help testing...
>
> Because of rounding error, t+1-.Machine$double.eps might be exactly
> equal to t+1. I'd suggest using a rejection method if you need to use
> this approach: but sample() is better in the cases where as.integer()
> will work.
>
> Duncan Murdoch
> >
> > is floor() "better" than as.integer?
> >
> > Is this correct? Is the probability of the integer t the same as the
> > integer 1 or 0 etc... I have done some rudimentary testing and this
> > appears to work, but power being what it is, I can't see how to
> > realistically test this hypothesis.
> >
> > Or is there a better way of doing this?
> >
> > I'm trying to implement an algorithm which samples into an array,
> > hence the need for an integer - and yes I know about sample() thanks!
> > :-)
> >
> > { incidentally, I was surprised to note that the maximum value
> > returned by summary(integer_vector) is "pretty" and appears to be
> > rounded up to a "nice round number", and is not necessarily the same
> > as max(integer_vector) where the value is large, i.e. of the order of
> > say 50 million }
> >
> > Is version etc relevant? (I'll want to be portable)
> > > version
> >                _
> > platform       i386-pc-mingw32
> > arch           i386
> > os             mingw32
> > system         i386, mingw32
> > status
> > major          2
> > minor          3.1
> > year           2006
> > month          06
> > day            01
> > svn rev        38247
> > language       R
> > version.string Version 2.3.1 (2006-06-01)
> >
> > Many thanks in advance for your help.
> > Sean O'Riordain
> > affiliation <- NULL
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
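
For anyone reading this in the archive, the rounding problem Duncan describes and the rejection method he suggests can be sketched roughly as below. This is only a minimal sketch under stated assumptions: rand_int_reject is a hypothetical helper name (not from the thread), t is assumed to be a non-negative whole number below 2^53, and floor() is used rather than as.integer() so the result stays a double and does not depend on the integer range.

```r
# Rounding check: at t = 1e8 adjacent doubles are far more than
# .Machine$double.eps apart, so the subtraction is lost entirely.
t <- 1e8
lost <- (t + 1 - .Machine$double.eps) == (t + 1)

# Hypothetical helper (an assumption, not code from the thread):
# one uniform integer in 0..t by rejection.
rand_int_reject <- function(t) {
  stopifnot(is.numeric(t), length(t) == 1, t >= 0, t < 2^53)
  repeat {
    m <- floor(runif(1, min = 0, max = t + 1))
    # Defensive: redraw if the draw ever lands exactly on t + 1.
    if (m <= t) return(m)
  }
}

# Rough uniformity check on a small range, where it is cheap to tabulate.
draws <- replicate(6000, rand_int_reject(5))
counts <- table(factor(draws, levels = 0:5))
check <- chisq.test(counts)  # a large p-value is consistent with uniformity
```

The chi-squared check only makes sense for small t, where every value can be counted; it says nothing directly about very large t. Where t + 1 fits in an integer, sample(t + 1, 1) - 1 remains the simpler choice, as Duncan notes.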