I don't see any downside to this.
Which "this" do you mean? My current approach, or the adding
of an extra separate function? :-)
Your current approach.
I have only been thinking about the Ziggurat algorithm, but
you are right, it
does depend on the details of the technique. For Box-Muller
(and other engines
that cache samples) it only makes sense to compute the first
samples in the
opCall. But for the Ziggurat algorithm, tables that must be
computed before you
can start sampling aren't changed during sampling and
computing the tables
doesn't require any additional arguments. So it makes the most
sense for those
tables to be initialized in the struct's constructor in the
struct based API.
So we should assume by default then that the struct's
constructor should take an RNG as input, to enable it to
calculate these first values if it needs to?
No, I think that if the engine defines initialize() function
(with no parameters), it should be called in the constructor of
Normal. I don't think the constructor of Normal should take an
RNG as input. I think what you currently do in normal.d is fine,
I would just add something like this:
static if(is(typeof(_engine.initialize())))
_engine.initialize();
to Normal's constructor. Then ZigguratEngine can define
initialize() and do its initialization there (it doesn't need a
uniform RNG for initialization) and NormalBoxMullerEngine can
remain unchanged.
In my previous post I was talking about initializing static
instances of the
engine used in the normal() function. The advantage of
initializing in a static
constructor is that you don't need an additional check every
time the normal()
function is called. But because we will also have a struct
based API, that will
not require such checks (at least not for all engines), this
isn't really that
important. So we can also initialize the global engine
instance in a call to
normal(), if this simplifies things.
I guess my feeling here is that the values generated by an RNG
should depend on when it is called, and not at all on when it
is instantiated.
i.e. if I do something like
auto nrng = Normal!()(0, 1);
writeln( uniform(0.0, 1.0) );
writeln( uniform(0.0, 1.0) );
writeln( nrng() );
writeln( nrng() );
I should get the same output as if I do,
writeln( uniform(0.0, 1.0) );
writeln( uniform(0.0, 1.0) );
auto nrng = Normal!()(0, 1);
writeln( nrng() );
writeln( nrng() );
You can also think that if I change from e.g.
auto nrng = Normal!(real, Engine1)(0, 1);
writeln( uniform(0.0, 1.0) );
writeln( uniform(0.0, 1.0) );
writeln( nrng() );
writeln( nrng() );
to
auto nrng = Normal!(real, Engine2)(0, 1);
writeln( uniform(0.0, 1.0) );
writeln( uniform(0.0, 1.0) );
writeln( nrng() );
writeln( nrng() );
... then I would expect to see different results from the
normal RNG but identical results from uniform(). If the
constructor of the normal engine calls the RNG, the uniform()
results will change, no?
I was only talking about the part of initialization that doesn't
use a RNG. I agree that everything that uses a RNG should be done
in opCall (or inside a normal() function in the function
interface). For Box-Muller, I think the approach you currently
use in NormalBoxMullerEngine is the most reasonable one.
But a Ziggurat engines needs to compute some tables before it can
start generating samples. It doesn't need a RNG to do that and
the tables do not change after initialization.
I think it's obvious that that initialization that doesn't need a
RNG should be done in Normal's constructor for the struct
interface. What is not so obvious is where the initialization of
static data that doesn't require a RNG should be done for
function interface. That initialization can be done in normal()
function, or it can be done in a static constructor. If you do it
in normal(), you need to do an extra check on each call to
normal(). This isn't really a problem as long as the struct's
opCall and the version of normal() that takes the engine as a
parameter don't do such redundant checks. Then the users that
care about the difference in performance can just use one of
these interfaces.
Yes, a random.d test suite probably should be another project.
Regardless of tests, let's focus for now on getting the API
right for this case of
non-uniform-random-number-generator-with-internal-engine, with
normal and exponential as the initial cases.
In general I like the API in file normal.d attached to your
original post. I think the engines should have an option to do
some initialization in Normal's constructor, though. We could
achieve that by calling _engine.initialize in Normal's
constructors, if such method exists. This method would also need
to be called on the static instance of normal engine used in the
normal() function. We could add something like this to the first
version of normal:
static if(is(typeof(engine.initialize())))
{
static bool isInitialized;
if(!isInitialized)
engine.initialize();
}
Another option would be to do this:
struct GlobalEngine(Engine)
{
Engine instance;
static this()
{
instance.initialize();
}
}
And then inside the version of normal that doesn't take an engine
as a parameter:
alias GlobalEngine!NormalRandomNumberEngine E;
return normal(mean, sigma, E.instance, urng);
The users would need to construct their own instance of engine to
use the function that takes engine as a parameter. So it would
make sense to add helper functions for creating engine instances.
There's one change that I think would make the API more
convenient. Normal struct and the engine don't store an instance
of a RNG , so they don't need to take it as a template parameter.
We could make opCall methods templates instead. That way the
users would never need to explicitly specify the type of the RNG.