On Sat, Jul 17, 2010 at 18:01, Steven D'Aprano <st...@pearwood.info> wrote:
> Do you care about speed? If this is a script that just needs to run
> once, it seems to me that the simplest, easiest to read solution is:
>
> import random
> def random_digit():
>     return "0123456789"[random.randrange(10)]
>
> f = open('rand_digits.txt', 'w')
> for i in xrange(10**9):
>     f.write(random_digit())
>
> f.close()
>
>
> This is, of course, horribly inefficient -- it generates digits one at a
> time, and worse, it writes them one at a time. I got bored waiting for
> it to finish after 20 minutes (at which time it was about 10% of the
> way through), but you could let it run in the background for as long as
> it takes.
>
> If speed does matter, the first improvement is to generate larger
> streams of random digits at once. An even bigger improvement is to cut
> down on the number of disk-writes -- hard drives are a thousand times
> slower than RAM, so the more often you write to the disk, the worse off
> you are.
>
>
> import random
> def random_digits(n):
>     "Return n random digits with one call to random."
>     return "%0*d" % (n, random.randrange(10**n))
>
> f = open('rand_digits.txt', 'w')
> for i in xrange(1000):
>     buffer = [random_digits(10) for j in xrange(100000)]
>     f.write(''.join(buffer))
>
> f.close()
>
> On my not-even-close-to-high-end PC, this generates one billion digits
> in 22 minutes:
>
> [st...@sylar python]$ time python randdigits.py
>
> real    22m31.205s
> user    20m18.546s
> sys     0m7.675s
> [st...@sylar python]$ ls -l rand_digits.txt
> -rw-rw-r-- 1 steve steve 1000000000 2010-07-18 11:00 rand_digits.txt
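
For anyone reading along on Python 3, the chunked approach quoted above might look roughly like the sketch below. It keeps the quoted random_digits() idea (many digits per randrange call, many chunks per write), but write_random_digits and its parameters are names made up for illustration, and no timing is claimed for it.

```python
import random


def random_digits(n):
    """Return n random digits as a zero-padded string, using one randrange call."""
    return "%0*d" % (n, random.randrange(10**n))


def write_random_digits(path, total, chunk=10, chunks_per_write=100000):
    """Write `total` random digits to `path`, batching many chunks per write.

    Hypothetical helper: the name and parameters are assumptions, not
    something taken from the thread.
    """
    remaining = total
    with open(path, "w") as f:
        while remaining > 0:
            # Number of digits in this batch, capped by what is still needed.
            batch = min(remaining, chunk * chunks_per_write)
            full, rest = divmod(batch, chunk)
            parts = [random_digits(chunk) for _ in range(full)]
            if rest:
                # Final short chunk when the batch is not a multiple of `chunk`.
                parts.append(random_digits(rest))
            f.write("".join(parts))
            remaining -= batch
```

Calling write_random_digits('rand_digits.txt', 10**9) would aim for the same file as the quoted script; a much smaller total is a sensible first try.
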
My <http://tutoree7.pastebin.com/9BMYZ08z> took 218 secs.

> Having generated the digits, it might be useful to look for deviations
> from randomness. There should be approximately equal numbers of each
> digit (100,000,000 each of 0, 1, 2, ..., 9), of each digraph
> (10,000,000 each of 00, 01, 02, ..., 98, 99), trigraphs (1,000,000 each
> of 000, ..., 999) and so forth.

Yes. I'll do some checking. Thanks for the tips.

>
> The interesting question is, if you measure a deviation from the
> equality (and you will), is it statistically significant? If so, it is
> because of a problem with the random number generator, or with my
> algorithm for generating the sample digits?

Ah. Can't wait to see what turns up.

Thanks, Steve.

Dick
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
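
A rough Python 3 sketch of the frequency check discussed above, in case it is useful as a starting point. The helper names ngram_counts and chi_square are made up for illustration. The statistic is a plain Pearson chi-square against the assumption that every n-gram is equally likely. The n-grams are counted with overlap, so for digraphs and longer the counts are not independent and the statistic is only a rough indicator.

```python
from collections import Counter


def ngram_counts(digits, n):
    """Count overlapping n-grams in a string of digits (hypothetical helper)."""
    return Counter(digits[i:i + n] for i in range(len(digits) - n + 1))


def chi_square(counts, n=1):
    """Pearson chi-square statistic against a uniform expectation.

    `counts` maps n-gram strings to observed counts; all 10**n possible
    n-grams are included, with missing ones counted as zero.
    """
    total = sum(counts.values())
    expected = total / 10**n
    stat = 0.0
    for k in range(10**n):
        key = "%0*d" % (n, k)  # zero-padded n-gram, e.g. "07" for n=2
        observed = counts.get(key, 0)
        stat += (observed - expected) ** 2 / expected
    return stat
```

For single digits there are 9 degrees of freedom, so a statistic above roughly 16.92 would count as significant at the 5% level. For the full billion-digit file it would make more sense to read the file in blocks and update one Counter than to load everything into a single string.
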