On 27/08/13 08:16, monarch_dodra wrote:
What bothers me about these is that they have a fixed seed value,
which (IMO) is even worse than default seeding to
"unpredictableSeed"

C's "rand" also "defaults seed". I've "tutored" C++ on codeguru
for years, and year in, year out, I'd get asked "why does my
random program keep creating the same output". This behavior
surprises people.

A PRNG should either error-out on no-seed, or be run-time
unpredictably seeded. The middle ground just gives you the worst
of both...

For what it's worth, I think your experience is a consequence of the particular case of C. The "easy" way to generate a random number in C is some variation on rand() / RAND_MAX, so that's what people do, and so they get hit by the default seed.

On the other hand with D the default random number generator is rndGen, which is unpredictably seeded, and which is used if no other RNG is specified -- so _default_ random behaviour in D mimics that in interpreted languages and is different per program run. (Actually, I sometimes find myself seeding rndGen in order to get _the same_ results, which of course sometimes you do want:-)
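For instance, a minimal sketch (assuming import of std.random and std.stdio):

    import std.random, std.stdio;

    void main()
    {
        // rndGen is unpredictably seeded, so this differs from run to run:
        writeln(uniform(0, 100));

        // ... whereas explicitly seeding rndGen gives repeatable output:
        rndGen.seed(42);
        writeln(uniform(0, 100));
    }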

Personally I think there may be some value in having an accepted default configuration for RNGs, so long as it's correctly signposted what will happen, and so long as the "easy" thing to do is not going to fall into the trap you described.

However, I think initialization is an important issue not just for RNGs but for the variety of other entities that use them. This also bears on the class/struct discussion.

For example, with RNGs per se it's pretty trivial (with a class-based approach) to have a constructor requiring a seed and, in addition, possibly a default constructor that seeds with some default condition (let's leave aside preferences on this for now:-).
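A rough sketch of what that might look like (names invented; not actual Phobos code):

    import std.random : unpredictableSeed;

    final class ClassRng
    {
        private uint _state;

        this(uint seedValue) { seed(seedValue); }  // explicit seed required ...
        this() { this(unpredictableSeed); }        // ... or a default-seed policy here

        void seed(uint s) { _state = s; }

        // front, popFront, empty etc. as for any other RNG
    }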

The latter default-seed approach cannot be implemented with structs -- you can't have a no-parameter constructor -- so struct-based RNGs have to work around this in one of two ways. Either all the internal values of the RNG state data get default settings (which e.g. Xorshift can do because it's a small total number of parameters), or conditionals get triggered when front(), popFront() etc. are called, as in the Mersenne Twister implementation, which calls seed() if the value of the internal parameter mti is equal to size_t.max.

The latter approach is very annoying because it means that e.g. front() cannot be const, which we'd like it to be.
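A stripped-down toy sketch of that second workaround (not the real Mersenne Twister code) shows the problem:

    struct LazySeededRng
    {
        private uint _state = uint.max;   // sentinel meaning "not yet seeded"
        enum uint defaultSeed = 5489;     // arbitrary fixed default

        void seed(uint s) { _state = s; }

        // Cannot be const: the very first call may have to seed the generator.
        @property uint front()
        {
            if (_state == uint.max) seed(defaultSeed);
            return _state;
        }

        void popFront()
        {
            if (_state == uint.max) seed(defaultSeed);
            _state = _state * 1_103_515_245u + 12_345u;   // toy step, illustration only
        }

        enum bool empty = false;
    }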

So, all of that makes final classes a nice approach, although we might compromise for other reasons. But it's not so simple for other random functions. Now consider random sampling, and the following three versions of code (assume import std.random, std.range, std.stdio, with each block inside main):

    {   // 1: construct the sample, print it, then draw a uniform value
        auto gen = Random(unpredictableSeed);
        auto sample = randomSample(iota(100), 10, gen);
        writeln(sample);
        writeln(uniform(0.0, 1.0, gen));
    }

    {   // 2: construct the sample, draw a uniform value, then print the sample
        auto gen = Random(unpredictableSeed);
        auto sample = randomSample(iota(100), 10, gen);
        writeln(uniform(0.0, 1.0, gen));
        writeln(sample);
    }

    {   // 3: draw a uniform value first, then construct and print the sample
        auto gen = Random(unpredictableSeed);
        writeln(uniform(0.0, 1.0, gen));
        auto sample = randomSample(iota(100), 10, gen);
        writeln(sample);
    }

What would you _expect_ to happen to the values of the random number and the random sample in each case?

The most intuitive thing would be that the values of the sample are determined at the point it's created, so they would depend on where the line sample = ... occurs and on nothing else. However, we know that random samples are lazily evaluated.

So, perhaps the second logical alternative is that its values should be determined _when it is read_, i.e. at the point where we writeln(sample).

Now consider -- when is the _first_ "front" value of the sample set? If it's set at construction time (which would be normal for a range), the remaining values in the sample will depend on what calls to the RNG are made between construction and reading. So, in this case, the samples from programs 1 and 2 will have the same _first_ value but different subsequent values.

That, to me, is both unintuitive and undesirable.

Now it gets more complicated. In the case of RandomSample, we have a bunch of different public functions we can call. When front() is called, we need to check whether the sample has been initialized before we return a value. Similarly, popFront() should behave differently depending on whether the first sample value has been determined yet. Then there is a method index() that returns the numerical index of the sampled value (so, if you sample from 0, 1, 2, ... you should always have sample.front == sample.index), and this too clearly depends on the sample being initialized before it can return a value.
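To make the pattern concrete, here's a simplified, hypothetical sketch (invented names; not the actual RandomSample code) of what every public member ends up having to do:

    struct SamplerSketch
    {
        private bool _primed = false;
        private size_t _front, _index;

        // Must be called at the start of *every* public member; forget it in
        // just one of them and that member silently gives wrong results.
        private void primeIfNeeded()
        {
            if (!_primed)
            {
                // ... draw the first sample element here, setting _front and _index ...
                _primed = true;
            }
        }

        @property size_t front() { primeIfNeeded(); return _front; }
        void popFront()           { primeIfNeeded(); /* advance _front and _index */ }
        @property size_t index()  { primeIfNeeded(); return _index; }
    }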

None of this can be solved with a struct vs. class difference, assuming we agree that it's inherently undesirable to initialize at construction time.

This problem is potentially a rich source of bugs, as we saw with RandomCover in our recent pull request discussion. Relying on manual solutions seems undesirable -- you will get people (as I did) fixing front() and popFront() but forgetting about a function like RandomSample.index(). (Fix in the works:-)

So, I'm wondering whether there's potential for syntax sugar like invariant(), but which would not get kicked out when compiled with -release, and which would enable essential checks of the object's internal state whenever a public method is called.

I'd suggest there actually be two: one to check an entry condition, one an exit condition.
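Purely as a speculative illustration (all names invented), the hand-rolled pattern that such sugar might generate automatically could look something like:

    import std.exception : enforce;

    struct Checked
    {
        private bool _initialized = false;

        private void checkEntry()        // entry check: repair lazy state
        {
            if (!_initialized) initialize();
        }

        private void checkExit() const   // exit check: ordinary code, so -release
        {                                // cannot strip it the way it strips invariant()
            enforce(_initialized, "public method left the object uninitialized");
        }

        void initialize() { _initialized = true; }

        // What every public member has to look like when written by hand --
        // exactly the boilerplate the proposed sugar would generate:
        int someOperation()
        {
            checkEntry();
            scope(exit) checkExit();
            return 42;
        }
    }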
