Re: [R] can you tell what .Random.seed was?

Warren Young Fri, 15 May 2009 11:41:32 -0700

G. Jay Kerns wrote:


I want it to be *difficult* for students to figure out the seed and
automatically generate solutions on their own.

Hmmm.... Would it really be a bad thing if someone reverse engineeredthis to generate answers given the problem set? If it's hard enough todo that, it'd be more worth solving than the given problem set. I callthat "extra credit".

a brute force search of set.seed() is really
pretty easy and fast... even for students at this level.

Either you're misunderstanding Stavros' benchmark results, or I am.Could easily be the latter...I'm an R newbie.

As far as I can tell, the inner part of the loop does very little. Ifthat's right, Stavros is saying it will take 18 hours to try everypossible seed when the algorithm based on that seed takes almost no timeto run. But, if generating each problem set takes, say, a minute, itwill take 4.7 million years to generate a complete rainbow table whenthere are 2^32 possible seeds.

what if the Instructor
inserted an *unknown* very large number of calls to the RNG near the
beginning of the .Rnw (but after the set.seed)...  and did not
distribute this information to the students...  that would make it
much harder, yes?


There are better ways.

As above, one key to making rainbow tables impractical is making theper-iteration time long enough. Even if it only takes a second togenerate each possible problem set, that's enough when multiplied byhigh enough powers of 2.


The other key is using big enough powers of 2.

I hadn't looked into R's random number generation before, but it appearsquite robust. Seeding it with the current wall clock time (a 32-bitinteger on most systems) is an insult to its capability.

The default pseudo-random number generator (PRNG) in my copy of R is theMersenne Twister, a truly awesome algorithm. It's capable of very highquality results, as long as you give it a good seed. It will take avector of *many* integers as a seed, not just one. It's not clear to mefrom the R docs if you can pass an arbitrary array of integers with anyvalue, or if it needs something special.

Assuming you can give it any old passel of randomness as a seed, youjust have to find a good source of randomness to create that seed. On aLinux box, you could concatenate several dozen bytes read from/dev/random, the current wall clock time in microseconds, the inode ofthe R script being run, the process ID of the R interpreter, and thecurrent mouse cursor position into a single string. Feed all that intoa hash algorithm, and break off pieces of that 4 bytes long, cast themto integers, and send that array of ints to set.seed().

If you use SHA-256 as the hash algorithm, that scheme should give youenough input randomness to get any of the possible 2^256 hash outputs,making that the amount of possible problem sets. That's more than arainbow table buster...there aren't enough atoms in the visible universeto construct a computer big enough to cope with 2^256 possible outputs.

That said, the quality of the PRNG just *allows* you to avoid screwingup. It doesn't make it impossible make a weak algorithm.


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] can you tell what .Random.seed *was*?

Reply via email to

Re: [R] can you tell what .Random.seed was?