On Saturday, 21 December 2013 at 05:12:57 UTC, Chris Cain wrote:
For more information, I've written a document on an implementation of uniform (which should be coming in 2.065, btw) which discusses the issue with just using the modulus operator:
https://dl.dropboxusercontent.com/u/2206555/uniformUpgrade.pdf

Looks like your new implementation uses one modulo operation, compared to the previous one's two divisions. That may be the cause of the speedup.
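To make the single-modulus idea concrete, here is a minimal sketch (not the actual Phobos implementation; the function name and the use of std::mt19937 are just for illustration) of drawing an exactly uniform value in [0, n) with one modulus per accepted draw. The trick is rejection: discard raw outputs falling in the top partial "bucket", so the remaining count is an exact multiple of n and the modulus introduces no bias.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Unbiased draw from [0, n) using rejection plus a single modulus.
// Illustrative sketch only; not the Phobos std.random.uniform code.
uint32_t bounded(std::mt19937 &gen, uint32_t n) {
    // Largest multiple of n that is <= UINT32_MAX; raw values at or
    // above this limit belong to an incomplete bucket and are rejected.
    uint32_t limit = UINT32_MAX - UINT32_MAX % n;
    uint32_t x;
    do {
        x = static_cast<uint32_t>(gen());
    } while (x >= limit);
    return x % n;  // exactly uniform over [0, n)
}
```

Since the rejection region has fewer than n values out of 2^32, the expected number of generator calls per draw is barely above one for small n.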

The previous implementation was, by the looks of it, adapted from C++ Boost, which also uses two divisions. Do you know the reason for that? They seem to be solving the exact same problem (strict uniformity, provided that the underlying RNG is uniform).

I'd like to touch on a relevant point here that matters to me. In a mature randomness library, one important quality is reproducibility: there are applications where you want pseudo-random values, but need to generate the exact same pseudo-random values across different versions, computers, operating systems and language implementations. So far, I have seen very few languages which provide such reproducibility guarantees for their standard library. For example, in the C and C++ standard randomness facilities, the details were implementation-defined all the way until the recent C++11. Python held out for a long time but finally broke reproducibility between 3.1 and 3.2 because of the exact same non-uniformity problem. A positive example in this regard is Java, which has kept the specified behavior of Random fixed since at least version 1.5.

If you break the reproducibility of uniform in dmd 2.065, there should at least be a note about it in the documentation. For a mature library, I think the old implementation should also have remained available somehow. (Well, there is always the option of including an old library version in your project, but...) Perhaps that is not yet the case for D and Phobos, since they are still not stabilized. Especially so for std.random, which is due for more breakage anyway because of the value/reference issues with RNG types.

Regarding that, I have a point on designing a randomness library. Right now, most of what I have seen has at most two layers: the core RNG providing random bits, and the various uses of those bits, like uniform distribution on a segment, random shuffle and so on. It is convenient when the elements of the two layers are independent, so you can compose different first layers (LCG, MT19937, or maybe some interface to /dev/*random) with different second-layer functions (uniform[0,9], random_shuffle, etc.). Still, many of the useful second-layer functions build upon uniform distribution of integers on a segment. Thus I would like to have an explicit intermediate layer consisting of uniform and maybe other distributions, which could also have different (fast vs. exact) implementations to choose from. In the long run, such a design could also solve reproducibility problems: we can make another implementation of uniform the default, while it remains easy to set the previous one as the preferred intermediate layer.
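The three-layer idea above can be sketched roughly as follows (all names here are illustrative, not any existing API): layer 1 is an engine producing random bits, layer 2 is a uniform-on-a-segment policy with interchangeable fast and exact variants, and layer 3 is an algorithm such as a shuffle that takes the layer-2 policy as a parameter, so a caller can pin a particular uniform implementation for reproducibility.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Layer 2, variant A: fast but slightly biased (plain modulus).
struct ModUniform {
    template <class Gen>
    uint32_t operator()(Gen &g, uint32_t n) const {
        return static_cast<uint32_t>(g() % n);
    }
};

// Layer 2, variant B: exact, via rejection of the top partial bucket.
struct ExactUniform {
    template <class Gen>
    uint32_t operator()(Gen &g, uint32_t n) const {
        uint32_t limit = UINT32_MAX - UINT32_MAX % n;
        uint32_t x;
        do { x = static_cast<uint32_t>(g()); } while (x >= limit);
        return x % n;
    }
};

// Layer 3: Fisher-Yates shuffle parameterized on the uniform policy,
// so swapping the policy (not the engine) changes the distribution
// quality while keeping the algorithm itself fixed.
template <class Gen, class Uniform>
void shuffleWith(std::vector<int> &v, Gen &g, Uniform u) {
    for (std::size_t i = v.size(); i > 1; --i)
        std::swap(v[i - 1], v[u(g, static_cast<uint32_t>(i))]);
}
```

With this shape, "restore the old behavior" becomes a one-argument change at the call site (pass ModUniform instead of ExactUniform), rather than pinning an old library version.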

Ivan Kazmenko.
