I don't see any downside to this.

Which "this" do you mean? My current approach, or the adding of an extra separate function? :-)

Your current approach.

I have only been thinking about the Ziggurat algorithm, but you are right, it does depend on the details of the technique. For Box-Muller (and other engines that cache samples) it only makes sense to compute the first samples in the opCall. But for the Ziggurat algorithm, tables that must be computed before you can start sampling aren't changed during sampling and computing the tables doesn't require any additional arguments. So it makes the most sense for those tables to be initialized in the struct's constructor in the struct based API.

So we should assume by default then that the struct's constructor should take an RNG as input, to enable it to calculate these first values if it needs to?

No, I think that if the engine defines initialize() function (with no parameters), it should be called in the constructor of Normal. I don't think the constructor of Normal should take an RNG as input. I think what you currently do in normal.d is fine, I would just add something like this:

static if(is(typeof(_engine.initialize())))
    _engine.initialize();

to Normal's constructor. Then ZigguratEngine can define initialize() and do its initialization there (it doesn't need a uniform RNG for initialization) and NormalBoxMullerEngine can remain unchanged.

In my previous post I was talking about initializing static instances of the engine used in the normal() function. The advantage of initializing in a static constructor is that you don't need an additional check every time the normal() function is called. But because we will also have a struct based API, that will not require such checks (at least not for all engines), this isn't really that important. So we can also initialize the global engine instance in a call to
normal(), if this simplifies things.

I guess my feeling here is that the values generated by an RNG should depend on when it is called, and not at all on when it is instantiated.

i.e. if I do something like

    auto nrng = Normal!()(0, 1);
    writeln( uniform(0.0, 1.0) );
    writeln( uniform(0.0, 1.0) );
    writeln( nrng() );
    writeln( nrng() );

I should get the same output as if I do,

    writeln( uniform(0.0, 1.0) );
    writeln( uniform(0.0, 1.0) );
    auto nrng = Normal!()(0, 1);
    writeln( nrng() );
    writeln( nrng() );

You can also think that if I change from e.g.

    auto nrng = Normal!(real, Engine1)(0, 1);
    writeln( uniform(0.0, 1.0) );
    writeln( uniform(0.0, 1.0) );
    writeln( nrng() );
    writeln( nrng() );

to

    auto nrng = Normal!(real, Engine2)(0, 1);
    writeln( uniform(0.0, 1.0) );
    writeln( uniform(0.0, 1.0) );
    writeln( nrng() );
    writeln( nrng() );

... then I would expect to see different results from the normal RNG but identical results from uniform(). If the constructor of the normal engine calls the RNG, the uniform() results will change, no?

I was only talking about the part of initialization that doesn't use a RNG. I agree that everything that uses a RNG should be done in opCall (or inside a normal() function in the function interface). For Box-Muller, I think the approach you currently use in NormalBoxMullerEngine is the most reasonable one. But a Ziggurat engines needs to compute some tables before it can start generating samples. It doesn't need a RNG to do that and the tables do not change after initialization.

I think it's obvious that that initialization that doesn't need a RNG should be done in Normal's constructor for the struct interface. What is not so obvious is where the initialization of static data that doesn't require a RNG should be done for function interface. That initialization can be done in normal() function, or it can be done in a static constructor. If you do it in normal(), you need to do an extra check on each call to normal(). This isn't really a problem as long as the struct's opCall and the version of normal() that takes the engine as a parameter don't do such redundant checks. Then the users that care about the difference in performance can just use one of these interfaces.

Yes, a random.d test suite probably should be another project.

Regardless of tests, let's focus for now on getting the API right for this case of non-uniform-random-number-generator-with-internal-engine, with normal and exponential as the initial cases.

In general I like the API in file normal.d attached to your original post. I think the engines should have an option to do some initialization in Normal's constructor, though. We could achieve that by calling _engine.initialize in Normal's constructors, if such method exists. This method would also need to be called on the static instance of normal engine used in the normal() function. We could add something like this to the first version of normal:

static if(is(typeof(engine.initialize())))
{
    static bool isInitialized;
    if(!isInitialized)
        engine.initialize();
}

Another option would be to do this:

struct GlobalEngine(Engine)
{
    Engine instance;

    static this()
    {
        instance.initialize();
    }
}

And then inside the version of normal that doesn't take an engine as a parameter:

alias GlobalEngine!NormalRandomNumberEngine E;
return normal(mean, sigma, E.instance, urng);

The users would need to construct their own instance of engine to use the function that takes engine as a parameter. So it would make sense to add helper functions for creating engine instances.

There's one change that I think would make the API more convenient. Normal struct and the engine don't store an instance of a RNG , so they don't need to take it as a template parameter. We could make opCall methods templates instead. That way the users would never need to explicitly specify the type of the RNG.

Reply via email to