On 09/05/14 02:42, Nick Sabalausky via Digitalmars-d wrote:
There can technically be multiple "instances", but yea, they're all effectively
tied together. However, I'm leaning towards the belief that that's "correct behavior"
for an RNG. It's *definitely* correct for a crypto RNG - you certainly wouldn't
want two crypto RNGs ever generating the same sequence, not even deliberately.
That would defeat the whole point of a crypto RNG.

I'm not sure I follow your point about crypto RNGs here. Assuming it's deterministic at all, if someone else knows the seed state (which would be required to get two instances generating the same sequences), aren't you buggered anyway? And isn't the point of a deterministic crypto RNG that its seed state is kept secret, or that it is seeded with non-deterministic values upon instantiation?

As for ordinary non-crypto RNGs, I honestly can't imagine any purpose for
reliably generating the same values other than "playing back" a previous
sequence. But if that's what you want, then IMO it's better to record the
sequence of emitted values, or even record/replay the higher-level decisions
which the randomness was used to influence.

Even that will not save you in Minecraft, where a given seed produces different results in updated versions of the game that have more random possibilities available (more monsters, more flora and fauna...). ;-)

For randomness, record/replay is just less fiddly and error-prone than reliably
regenerating values from an algorithm that intentionally tries to imitate
non-predictability. One slip-up while regenerating the sequence (example: trying
to replay from the same seed on a RNG that has since had a bug fixed, or on a
different architecture which the RNG wasn't well-tested on) and then the
inaccuracies just cascade. Plus this way you can swap in a non-deterministic RNG
and everything will still work. Or more easily skip back/ahead.
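
For concreteness, a minimal sketch of the record-the-emitted-values idea against today's std.random (RecordedRolls and its roll() helper are made-up names, purely for illustration):

    import std.random : Random, uniform, unpredictableSeed;
    import std.stdio : writeln;

    struct RecordedRolls
    {
        int[] values;   // everything we handed out, in order

        // Draw a fresh value from the live RNG and remember it.
        int roll(ref Random rng, int lo, int hi)
        {
            immutable v = uniform!"[]"(lo, hi, rng);
            values ~= v;
            return v;
        }
    }

    void main()
    {
        auto rng = Random(unpredictableSeed);
        RecordedRolls log;
        foreach (i; 0 .. 5)
            writeln("live: ", log.roll(rng, 1, 6));

        // Playback needs no RNG at all - just walk the saved values, so a
        // fixed RNG bug or a swapped-in generator can't perturb the replay.
        writeln("replay: ", log.values);
    }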

There's a difference between _relying_ on the deterministic sequence from a pseudo-RNG and the design principle that says that if I declare two separate RNG instances,

    SomeRNG rng1;
    SomeRNG rng2;

then the calls to rng1 should not have any effect on the results of the calls to rng2. Or alternatively, if I do,

    SomeRNG rng1;
    SomeRNG rng2;

    rng1.seed(n);
    rng2.seed(n);

    foreach (a, b; lockstep(rng1.take(10), rng2.take(10)))
    {
        writefln("%s\t%s", a, b);
    }

then the values I get should be the same, because rng1 and rng2 are independent instances that have been identically seeded.

I'm just not seeing a legitimate use-case for multiple states of the same RNG
engine (at least on the same thread) that wouldn't be better served by a different
approach.

We could get into broader discussions, but I don't think it needs to be more complicated than: if I create an instance of a variable of type T, then I should expect that it will be independent of any other instance of a variable of type T, unless I have deliberately created that link.

AIUI an RNG is impure by definition. (Well, unless you consider the internal
state to be part of the input, but I'd say *that* view breaks encapsulation.)

Not at all.  In fact a good theoretical model of a pseudo-RNG is

    * a state variable, and

    * a pure function that transforms one state variable into another
      such that the sequence of states is indistinguishable from a random
      sequence.

You can also add other pure transformations to map the state variable to particular random variates (e.g. unsigned integers in a certain range).

You can encapsulate this quite neatly as the state variable being a private member of a struct or class, with the pure transformation function being .popFront and the state-to-variate mapping being handled by .front. Note that .popFront _is_ pure for all the pseudo-RNGs in std.random2 :-)
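
To make that concrete, here is a minimal sketch of such a generator as a D range - the ToyLCG name is made up for illustration, the constants are Knuth's MMIX LCG parameters, and it's a toy rather than a generator anyone should actually use:

    struct ToyLCG
    {
        private ulong state;              // the state variable

        this(ulong seed) { state = seed; }

        enum bool empty = false;          // an RNG never runs dry

        // the state-to-variate mapping
        @property ulong front() const pure nothrow { return state; }

        // the pure transition function: state -> next state
        void popFront() pure nothrow
        {
            state = state * 6364136223846793005UL + 1442695040888963407UL;
        }
    }

    unittest
    {
        import std.algorithm : equal;
        import std.range : take;

        // two separately constructed instances, given the same seed,
        // produce identical sequences
        auto a = ToyLCG(42), b = ToyLCG(42);
        assert(equal(a.take(10), b.take(10)));
    }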

I'm also unconvinced that it breaks encapsulation anyway. If anything, I think
it's better encapsulated since separate instances *won't* behave like they're
under "quantum entanglement". It forces separate instances to stay separate.

Cf. my earlier example with two individual RNG instances initialized with the same seed. This ought to produce two identical sequences; with your proposed static internals, it won't, because calls to rng1 will affect the results of subsequent calls to rng2, and vice versa. This is pretty much the definition of breaking encapsulation, no?
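
For contrast, a sketch of what static internals do to that example (EntangledLCG is a made-up name, reusing the same toy transition function as above): because the state is static, the two "instances" advance one and the same sequence, so interleaved calls change what each one returns.

    struct EntangledLCG
    {
        private static ulong state;   // shared by every instance

        void seed(ulong s) { state = s; }

        enum bool empty = false;
        @property ulong front() const { return state; }

        void popFront()
        {
            state = state * 6364136223846793005UL + 1442695040888963407UL;
        }
    }

    unittest
    {
        EntangledLCG rng1, rng2;
        rng1.seed(42);
        rng2.seed(42);            // re-seeds the one shared state, not its own copy

        rng1.popFront();
        immutable x = rng1.front;
        rng2.popFront();
        immutable y = rng2.front; // already one step further along the same sequence

        assert(x != y);           // "identically seeded", yet the values differ
    }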

It *is* a fairly strange situation though. Unlike most encapsulation units, the
whole point of an RNG is to be *inconsistent* rather than consistent. So that
kinda turns everything on its ear.

The point of a pseudo-RNG is that if you analyse the sequence of variates produced, you can't distinguish it from a genuine random sequence. But it's also intended that different instances seeded with different values ought to produce statistically independent sequences with very high probability.

Your proposal would make it impossible to _have_ multiple effective instances seeded differently (unless in different threads).

Yea, when I wrote it I figured on proposing an optional stream-like interface
for std.random. The Hash_DRBG algorithm is inherently based around the idea that
each generated number can be of arbitrary length (not only that - generating, e.g.,
two 8-byte values with Hash_DRBG is fundamentally different from generating
eight 2-byte values, and produces different results). And the OS-specific
entropy generators are basically streams, too (on both Win and Posix, you
request any number of bytes of randomness).
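
To illustrate, a purely hypothetical sketch of such a stream-like primitive (RandomByteStream and nextULong are invented names, not anything in std.random or std.random2): the fundamental operation is "fill this buffer with random bytes", and fixed-width variates are just a layer on top.

    /// Hypothetical: the generator's one primitive operation is
    /// "fill this buffer with random bytes".
    interface RandomByteStream
    {
        void read(ubyte[] buf);
    }

    /// Fixed-width variates become a thin layer over the byte stream.
    ulong nextULong(RandomByteStream source)
    {
        ubyte[8] buf;
        source.read(buf[]);

        ulong result = 0;
        foreach (i, b; buf)
            result |= cast(ulong) b << (8 * i);   // little-endian assembly of the bytes
        return result;
    }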

So that was just a natural way to implement it. But it did strike me as being
generally useful, so I'm glad that you agree.

You're right though, the current state of Phobos streams is kind of a stumbling
block. Naturally it'd be best to use the new stream interface, but since that
isn't in place, that's an issue. I don't know what the state of the rewrite is.
Been a while since I've heard anything about it.

Yup, me neither.  I will try and quiz Steven on it at some point.

I figure, at the very least, we could just make the stream interfaces for RNGs
private. Then we can open it up whenever we finally do have a new std.stream to
conform to.

Or we could do an OutputRange instead...but the recent discussion on those makes
me a bit hesitant to do so. And I definitely don't wanna perform the contortions
to invert it into a non-coroutine-like InputRange.

Hmm, yea, I'm more and more liking the "keep it private for now, and then see
what happens" approach...unfortunately... :/

Yes, that's probably for the best.
