On 12/05/14 20:17, Nick Sabalausky via Digitalmars-d wrote:
On 5/11/2014 8:16 AM, Joseph Rushton Wakeling via Digitalmars-d wrote:
 > On 11/05/14 05:58, Nick Sabalausky via Digitalmars-d wrote:
 >> The seed doesn't need to be compromised for synchronized RNGs to fubar
 >> security.
 >
 > Before we go onto the detail of discussion, thanks very much for the
 > extensive explanation.  I was slightly worried that my previous email
 > might have come across as dismissive of your (completely understandable)
 > concerns.

Oh, not at all. I've been finding the discussion rather interesting. :)

Me too -- sorry for the late response, things have been busy :-)

 > Obviously one _can_ solve the problem by the internal state variables of
 > the RNG being static, but I'd suggest to you that RNG-as-reference-type
 > (which doesn't necessarily need to mean "class")

Yea, doesn't necessarily mean class, but if it is made a reference type then
class is likely the best option. For example, I'd typically regard struct* in a
D API as a code smell.

Well, I wasn't going to suggest struct* as the type. There have been various proposals here for a RefTypeOf template that stores an internal pointer to a struct instance and exposes its public interface via alias this. Unfortunately this approach is probably problematic because of this issue:
https://issues.dlang.org/show_bug.cgi?id=10996
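
To make the idea concrete, here is a minimal sketch of what such a wrapper might look like. The name RefTypeOf comes from the proposals mentioned above, but everything else (the `payload` field, the `get` property, the constructor constraint) is my own assumption for illustration and not any of the actual prototypes:

```d
import std.random : Mt19937;

// Hypothetical sketch: a reference-type wrapper around a struct RNG.
struct RefTypeOf(T) if (is(T == struct))
{
    private T* payload;   // copies of the wrapper share this pointer

    // Forward constructor arguments to a heap-allocated instance of T.
    this(Args...)(Args args) if (Args.length > 0)
    {
        payload = new T(args);
    }

    // Expose the wrapped struct's public interface via alias this.
    @property ref T get() { return *payload; }
    alias get this;
}

void main()
{
    auto a = RefTypeOf!Mt19937(42u);
    auto c = RefTypeOf!Mt19937(42u);  // separate state, same seed
    auto b = a;                       // copies the pointer, not the state

    immutable first = a.front;
    a.popFront();

    assert(b.front == a.front);  // b observes a's advance
    assert(c.front == first);    // c is unaffected
}
```

Note that a default-initialized wrapper here has a null payload, which is exactly the kind of thing the developer has to remember to handle.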

One could also just have an internal pointer to the RNG's private state variable(s). monarch_dodra and I have both prototyped some designs for this, but I do agree that class is largely preferable, because it avoids the need for the developer to take responsibility for ensuring the reference type semantics.

So I'm fine going the class route (or otherwise reference-based) and making
internal state per-instance. Or even having a "duplicate this RNG with identical
state" function, if people want it.
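
As a rough illustration of the class route with per-instance state plus an explicit duplication function, something like the following would do. The class name and the `dup` method are hypothetical, and the xorshift32 step is only a stand-in generator, not what std.random2 uses:

```d
// Hypothetical class-based generator with per-instance state.
final class Xorshift32Rng
{
    private uint state;

    // xorshift must not start from zero, so fall back to 1.
    this(uint seed) { state = seed ? seed : 1u; }

    enum bool empty = false;                        // infinite range
    @property uint front() const { return state; }

    void popFront()
    {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
    }

    // Explicit "duplicate this RNG with identical state".
    Xorshift32Rng dup() const { return new Xorshift32Rng(state); }
}

void main()
{
    auto a = new Xorshift32Rng(1u);
    auto b = a;        // same object: reference semantics for free
    auto c = a.dup();  // independent copy, made explicitly

    a.popFront();
    assert(b.front == a.front);  // b is the same generator as a
    assert(c.front != a.front);  // c still holds the old state

    c.popFront();
    assert(c.front == a.front);  // ...and replays the same sequence
}
```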

I think we're pretty well agreed on non-crypto RNGs. Your stance is convincing
here.

Excellent. :-)

For crypto-RNGs:

  [ ... snip ... ]

I think my preference would still be to keep the internal state static here
though (again, just speaking for crypto-RNGs only). As I've argued, the
determinism is a non-feature for crypto-RNGs (they deliberately fight it every
step of the way), and the shared state carries a couple entropy-related benefits
(Reseeding one, i.e. accumulating additional fresh entropy, benefits all others.
And all RNG activity within the thread acts as additional entropy). So I only
see upsides, not downsides.

Thanks for the excellent and detailed explanation here. Your case is also pretty convincing. A few remarks, though.

First, one concern I still have with static internals is essentially the same as the issue I have with the reference-types-made-of-structs: it's relying on the programmer to "do the right thing", and you know that someone is going to forget to mark as static a variable that needs it. With any luck that will be an easily-spotted and fixed issue, but using a class avoids the need.

So, I'd still feel more comfortable with the idea of crypto-RNGs being classes and not structs -- you can still have the static internals to deal with your desire for uniqueness, of course.

Second, I think your idea about separating the deterministic part of the algorithm from the source of entropy, and allowing arbitrary sources (or combinations of sources) to be plugged in, is an interesting one and worth pursuing.

The only potential issue I see: Are there going to be people who are looking for
determinism in an RNG and intentionally want to use something like Hash_DRBG for
that? Maybe for the sake of its usage of SHA. They would have to either limit
the number of values they generate (to avoid the algorithm's internal reseeding)
or else provide their own entropy source (not too difficult though, since I've
recently parametrized HashDRBG on that). But I suppose it could be done.

To be honest, I wouldn't worry about anyone wanting to use a crypto RNG algorithm deterministically. Wait for someone to request the feature. :-)

 > The benefit of doing it this way, as opposed to static internal
 > variables, is that you aren't then constrained to have only one single,
 > effectively thread-global instance of _every_ RNG you create.

Non-crypto RNGs: Yea, that's good, I like it.

Crypto RNGs: Hmm, good question. My concern is that it's something the user has
to specifically choose to use. It's not what automatically happens when you ask
for an instance of a specific RNG.

Yea, this is a good point. I think you've convinced me that the natural state of a crypto RNG is that its state should essentially be unique -- not per-instance.

One remark: if you can separate out your algorithm into a deterministic algorithm templated on sources of entropy, then note that each instantiation will be unique _relative to the sources of entropy_ but that one could create multiple independent instances relying on _different_ sources of entropy.

Fair enough, at least for non-crypto RNGs. For crypto RNGs, I'm thinking now
that static state can be avoided as long as the design still does a sufficient
job of steering actual crypto-purpose users away from multiple separate 
instances.

Having got this far through the discussion, I feel that I'm happy with the idea of static state for crypto RNGs, but equally I'll be happy with alternatives. I probably do have a bit of a personal inclination to avoid static if at all possible, but in this case I think you've made a very reasonable argument for it. (Some of the arguments I cited earlier, like the effect on function purity, etc., don't apply here because crypto RNGs' .popFront() is of necessity going to be non-pure.)

 > I wrote a class-based implementation because I wanted to see how that
 > compared to the various struct-based approaches monarch_dodra and I had
 > tried.  I do think that it offers much value in terms of simplicity and
 > elegance of design, but there may be other costs that make it
 > untenable.
 >

Have you found any such costs yet, or anything in particular that suggests there
may be some? Intuitively, I wouldn't think the minor amount of (by default) GC
heap usage would matter (just as one particular aspect of classes).

Well, the main concern would be if using classes made it impossible (or frustratingly difficult) to use the RNG package's full functionality in non-GC-using code.

I don't think there are any issues like speed hits.
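
For what it's worth, one option that may help with the non-GC case is std.typecons.scoped, which places a class instance in the enclosing scope instead of on the GC heap. A small sketch, assuming a hypothetical class-based generator (the class and its LCG step are placeholders of mine):

```d
import std.typecons : scoped;

// Hypothetical class-based generator; the LCG step is just a placeholder.
final class ToyLcg
{
    private uint state;
    this(uint seed) { state = seed; }

    uint next()
    {
        state = state * 1664525u + 1013904223u;  // wraps modulo 2^32
        return state;
    }
}

void main()
{
    // The instance lives in main's scope; no GC heap allocation for it.
    auto onStack = scoped!ToyLcg(1u);

    // Ordinary GC-allocated instance, for comparison.
    auto onHeap = new ToyLcg(1u);

    assert(onStack.next() == onHeap.next());  // same seed, same output
}
```

Whether that covers the package's _full_ functionality is a separate question, since anything that stores a reference to the generator would have to respect the scoped lifetime.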

Again, for non-crypto RNGs, I'm totally with you now.

For crypto-RNGs, I think it's unclear. The question hinges on:

A. "Can we permit deterministic usages without making crypto users more likely
to have multiple independent instances within a thread? (regardless of whether
crypto users do it intentionally or unintentionally)"

Honestly, if an RNG is designed as a crypto RNG, I don't think you need to worry about supporting deterministic usage. The main concern of my arguments was related to non-crypto pseudo-RNG use-cases.

That question, of course, needs to be balanced with:

B. "Is 'non-crypto deterministic' an inappropriate misuse of crypto-RNGs (when we
already have RNGs designed for non-crypto deterministic purposes anyway?)"

Probably. :-)


and C. "How much should we even care about B?"

I'm concerned that the global-singleton-instance pattern may be insufficient for
"A".

I don't think that matters. We don't need to support deterministic usages for crypto RNGs.

What I think I'd like to see is this: For *all* RNGs (crypto and non-crypto), a
reference-based design where obtaining a common thread-global instance is always
the implicit default behavior, and individual or duplicate instances can always
be obtained *explicitly* in a way that cannot be misread/misinterpreted. That
would succeed at "A" and render "B" and "C" non-issues. Plus, I think it would
be generally appropriate, even for deterministic non-crypto RNGs, anyway.
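
To check that I understand the proposal, here is a rough sketch of the kind of API I take it to mean. All names (`threadGlobal`, `independent`, `dup`) are hypothetical, and the LCG step is only a placeholder generator:

```d
// Hypothetical API sketch: implicit thread-global default, explicit opt-out.
final class ToyRng
{
    private uint state;
    private this(uint seed) { state = seed; }

    uint next()
    {
        state = state * 1664525u + 1013904223u;
        return state;
    }

    // Implicit default: one lazily created instance per thread.
    static ToyRng threadGlobal()
    {
        static ToyRng instance;   // static variables are thread-local in D
        if (instance is null) instance = new ToyRng(12345u);
        return instance;
    }

    // Explicit, hard-to-misread ways to get a separate generator.
    static ToyRng independent(uint seed) { return new ToyRng(seed); }
    ToyRng dup() const { return new ToyRng(state); }
}

void main()
{
    auto a = ToyRng.threadGlobal();
    auto b = ToyRng.threadGlobal();
    assert(a is b);                // same instance by default

    auto c = a.dup();              // explicit duplicate of current state
    assert(c !is a);
    assert(c.next() == a.next());  // identical state, separate objects

    auto e = ToyRng.independent(7u);
    auto f = ToyRng.independent(7u);
    assert(e !is f && e.next() == f.next());
}
```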

Well, I guess that what I feel is: the general class-based approach of std.random2 handles most of this. Where crypto RNGs are concerned I'm fine with the idea of the internal state being static if you feel that will maximize effective use of the entropy supplied.

If we combine that with the idea of templating the deterministic parts of crypto RNGs on their sources of entropy, then it should be clear that there _can_ be multiple independent instances of a crypto RNG if that's desired, but the user needs to provide different sources of entropy to each in order to make that happen.

Then, provide a sensible "default" version of each crypto RNG type, with explicitly specified entropy sources, so that the typical version that will be instantiated by users will (i) be suitable for crypto and (ii) assuming it does have static internals, will be unique per thread.
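
A minimal sketch of how these pieces might fit together, under heavy assumptions: the type and function names are hypothetical, the two "entropy sources" are fixed stand-ins, and the mixing step (splitmix64) is NOT cryptographically secure. It only illustrates static state per instantiation:

```d
// Stand-in entropy sources (assumptions; real ones would not be constants).
ulong sourceA() { return 0x1234UL; }
ulong sourceB() { return 0x5678UL; }

// Hypothetical generator templated on its source of entropy.
final class SharedStateRng(alias entropy)
{
    // Static, hence thread-local and shared by all instances of this
    // particular instantiation.
    private static ulong state;
    private static bool seeded;

    this()
    {
        if (!seeded) { state = entropy(); seeded = true; }
    }

    // Fold fresh entropy into the shared state; every instance benefits.
    void reseed() { state ^= entropy(); }

    ulong next()
    {
        state += 0x9E3779B97F4A7C15UL;
        ulong z = state;
        z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9UL;
        z = (z ^ (z >> 27)) * 0x94D049BB133111EBUL;
        return z ^ (z >> 31);
    }
}

alias DefaultRng = SharedStateRng!sourceA;  // the "sensible default"
alias OtherRng = SharedStateRng!sourceB;    // explicitly different source

void main()
{
    auto a = new DefaultRng;
    auto b = new DefaultRng;

    // a and b draw from one shared stream, so b continues where a stopped.
    assert(a.next() != b.next());

    // A different entropy source gives a distinct type with its own state.
    static assert(!is(DefaultRng == OtherRng));
    assert(&DefaultRng.state !is &OtherRng.state);
}
```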

Also, I don't want to forget the issue of stream interfaces. What do you think
about including them, but just with a big red "Subject to change pending a new
std.stream" banner in the docs? I think that's a perfectly pragmatic "best of
both worlds" compromise. Think it would be well/poorly-received?

If the DConf discussions related to an experimental part of the standard library are anything to go by, I think we will have plenty of opportunity to implement functionality that is subject to change, so I don't think we need fear doing that.
