1st draft of complete class-based std.random successor

Joseph Rushton Wakeling Wed, 19 Mar 2014 16:51:54 -0700

Hello all,

As some of you may already know, monarch_dodra and I have spent quite a lot of time over the last year discussing the state of std.random. To cut a long story short, there are significant problems that arise because the current RNGs are value types rather than reference types. We had quite a lot of back and forth on different design ideas, with a lot of helpful input from others in the community, but at the end of the day there are really only two broad approaches: create structs that implement reference semantics internally, or use classes. So, as an exercise, I decided to create a class-based std.random.
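
To make the value-type problem concrete, here is a minimal sketch using the existing std.random. The helper name drawsRepeat is made up for illustration. Because Mt19937 is a struct, take() stores its own copy of the generator, so two supposedly independent draws come out identical:

```d
import std.algorithm : equal;
import std.random : Mt19937;
import std.range : take;

// Hypothetical helper: returns true if two successive "draws" from the
// same struct-based generator yield exactly the same numbers.
bool drawsRepeat(uint seed)
{
    auto rng = Mt19937(seed);
    auto first = rng.take(5);   // take() copies rng by value ...
    auto second = rng.take(5);  // ... so this starts from the same state
    return equal(first, second);
}

void main()
{
    // The original rng is never advanced, so both draws match.
    assert(drawsRepeat(42));
}
```

With a reference-type generator the second draw would continue where the first one stopped.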

The preliminary (but comprehensive) results of this are now available here:

https://github.com/WebDrake/std.random2

Besides re-implementing random number generators as classes rather than structs, the new code splits std.random2 into a package of several different modules:


   * std.random2.generator, pseudo-random number generators;

   * std.random2.device, non-deterministic random sources;

   * std.random2.distribution, random distributions such as uniform,
     normal, etc.;

   * std.random2.adaptor, random "adaptors" such as randomShuffle,
     randomSample, etc.

   * std.random2.traits, RNG-specific traits such as isUniformRNG
     and isSeedable.

A package.d file groups them together so one can still import all together via "import std.random2". I've also taken the liberty of following the new guideline to place import statements as locally as possible; it was striking how easy and clean this made things, and it should be easy to port that particular change back to std.random.
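
As a small illustration of the local-import guideline (using the existing std.random, and a made-up function name rollDie), the import lives inside the one function that needs it rather than at module scope:

```d
// Only rollDie sees std.random; nothing is imported at module scope.
int rollDie()
{
    import std.random : uniform;
    return uniform(1, 7); // default bounds are "[)", so this gives 1 .. 6
}

void main()
{
    import std.stdio : writeln;
    writeln(rollDie());
}
```

Each function then documents its own dependencies, and unused functions cost nothing in imports.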

The new package implements all of the functions, templates and range objects from std.random except for the old std.random.uniformDistribution, whose name I have cannibalized for better purposes. Some have been updated: the MersenneTwisterEngine has been tweaked to match the corresponding code from Boost.Random, and this in turn has allowed the definition of a 64-bit Mersenne Twister (Mt19937_64) and an alternative 32-bit one (Mt11213b).

There are also a number of entirely new entries. std.random2.distribution contains not just existing functions such as dice and uniform, but also range-based random distribution classes UniformDistribution, NormalDistribution and DiscreteDistribution; the last of these is effectively a range-based version of dice, and is based on Chris Cain's excellent work here: https://github.com/D-Programming-Language/phobos/pull/1702

The principal weak point in terms of functionality is std.random2.device, where the implemented random devices (based on POSIX /dev/random and /dev/urandom) are really very primitive and just there to illustrate the principle. However, since their API is pretty simple (they're just input ranges with min and max defined) there should be plenty of opportunity to improve and extend the internals in future. Advice and patches are welcome for everything, but particularly here :-)
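
To show how small that API surface is, here is a rough POSIX-only sketch of a device in the same spirit. The class name UrandomDevice and its internals are assumptions made for illustration, not the code in the repository:

```d
import std.range : isInputRange;
import std.stdio : File;

// Hypothetical sketch: an infinite input range of uint values read from
// /dev/urandom, with min and max defined.
final class UrandomDevice
{
    enum bool empty = false;      // the device never runs out
    enum uint min = uint.min;
    enum uint max = uint.max;

    private File _source;
    private uint _value;

    this()
    {
        _source = File("/dev/urandom", "rb");
        popFront();               // prime the first value
    }

    @property uint front() const { return _value; }

    void popFront()
    {
        uint[1] buffer;
        auto got = _source.rawRead(buffer[]);
        assert(got.length == 1, "short read from /dev/urandom");
        _value = buffer[0];
    }
}

static assert(isInputRange!UrandomDevice);

void main()
{
    import std.stdio : writeln;
    auto device = new UrandomDevice;
    writeln(device.front);
}
```

A more serious implementation would need error handling, other platforms and probably unbuffered reads; none of that is attempted here.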

What's become quite apparent in the course of writing this package is how much more natural it is for ranges implementing randomness to be class objects. The basic fact that another range can store a copy of an RNG internally without creating a copy-by-value is merely the start: for example, in the case of the class implementation of RandomSample, we no longer need to have complications like,


    @property auto ref front()
    {
        assert(!empty);
        // The first sample point must be determined here to avoid
        // having it always correspond to the first element of the
        // input. The rest of the sample points are determined each
        // time we call popFront().
        if (_skip == Skip.None)
        {
            initializeFront();
        }
        return _input.front;
    }

that were necessary to avoid bugs like https://d.puremagic.com/issues/show_bug.cgi?id=7936; because the class-based implementation copies by reference, we can just initialize everything in the constructor. Similarly, issues like https://d.puremagic.com/issues/show_bug.cgi?id=7067 and https://d.puremagic.com/issues/show_bug.cgi?id=8247 just vanish.
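
The copy-by-reference behaviour can be sketched with a thin class wrapper around an existing std.random engine. SharedEngine and aliasAdvancesOriginal are hypothetical names, not types from std.random2:

```d
import std.random : Mt19937;

// Hypothetical wrapper: the class gives the struct engine reference
// semantics, and everything is initialized in the constructor.
final class SharedEngine
{
    enum bool empty = false;
    private Mt19937 _engine;

    this(uint seed) { _engine = Mt19937(seed); }

    @property uint front() { return _engine.front; }
    void popFront() { _engine.popFront(); }
}

// Returns true if advancing a second handle also advances the first.
bool aliasAdvancesOriginal(uint seed)
{
    auto rng = new SharedEngine(seed);
    auto other = rng;        // copies the reference, not the state
    other.popFront();

    auto expected = Mt19937(seed);
    expected.popFront();     // what the state should be after one step
    return rng.front == expected.front;
}

void main()
{
    assert(aliasAdvancesOriginal(42));
}
```

Any range that stores such an object internally consumes the caller's random stream instead of silently replaying a private copy of it.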

Obvious caveats about the approach include the fact that classes need to be new'd, and questions over whether allocation on the heap might create speed issues. The benchmarks I've run (code available in the repo) seem to suggest that at least the latter is not a worry, but these are obviously things that need to be considered. My own feeling is that ultimately it is a responsibility of the language to offer nice ways to allocate classes without necessarily relying on new or the GC.
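
One option that already exists in Phobos is std.typecons.scoped, which constructs a class instance in place without going through new. The sketch below uses a made-up toy class, Counter, and is only meant to show the mechanism, not to suggest that it settles the allocation question:

```d
import std.typecons : scoped;

// Hypothetical toy "generator" used only to demonstrate scoped!T.
final class Counter
{
    enum bool empty = false;
    private uint _state;

    this(uint seed) { _state = seed; }

    @property uint front() const { return _state; }
    void popFront() { ++_state; }
}

// Builds a Counter without new, advances it once, returns the value.
uint advanceOnStack(uint seed)
{
    auto rng = scoped!Counter(seed); // destroyed when the scope ends
    rng.popFront();
    return rng.front;
}

void main()
{
    assert(advanceOnStack(10) == 11);
}
```

The catch is that a scoped object must not outlive its scope, so it cannot safely be stored inside a range that is returned to the caller.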


A few remarks on design and other factors:

   * The new range objects have been implemented as final classes for
     speed purposes. However, I tried another approach where the RNG
     class templates were abstract classes, and the individual
     parameterizations were final-class subclasses of those rather
     than aliases. This was noticeably slower. My OO-fu is not really
     sufficient to explain this, so if anybody can offer a reason, I'd
     be happy to learn it.

   * A design question I considered but have not yet pursued: since at
     least two functions require passing the RNG as the first parameter
     (dice and discreteDistribution), perhaps this should be made a
     general design pattern for everything? It would make it harder to
     adapt code using the existing std.random but would create a useful
     uniformity.

   * I would have liked to ensure that every random distribution had
     both a range- and function-based version. However, I came to the
     conclusion that solely function-based versions should be avoided
     if either (i) the function would need to maintain internal state
     between calls, or (ii) the function would need to allocate memory
     per call. The first is why for example NormalDistribution exists
     only as a class/range. The second might in principle raise some
     objections to dice, but as dice seems to be a reasonably standard
     function, I kept it in.

   * It might be good to implement helper functions for the individual
     RNGs (e.g. just as RandomSample has a randomSample helper function
     to deliver instances, so Mt19937 could have a corresponding
     mt19937 helper function returning Mt19937 instances seeded in line
     with helper function parameters).

   * Those with long memories may recall that when I originally wrote
     up my NormalDistribution code, it was written to allow various
     "normal engines" to be plugged in; mine was Box-Muller, but jerro
     also contributed a Ziggurat-based engine. This could still be
     provided here, although my own inclination is that it's probably
     best for Phobos to provide one single good-for-general-purpose-use
     implementation.
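
The point about internal state is easiest to see with Box-Muller, which produces normal variates in pairs: one is returned at once and the other has to be kept for the next call. The following is a rough sketch built on the existing std.random; the class BoxMullerNormal and the helper sampleMean are hypothetical and are not the std.random2 NormalDistribution:

```d
import std.math : cos, log, PI, sin, sqrt;
import std.random : Mt19937, uniform;

// Hypothetical range-based normal distribution. The cached _spare value
// is exactly the state a plain function would have to hide somewhere.
final class BoxMullerNormal(Rng)
{
    enum bool empty = false;

    private Rng _rng;
    private double _mean, _stdDev;
    private double _value, _spare;
    private bool _hasSpare = false;

    this(double mean, double stdDev, Rng rng)
    {
        _mean = mean;
        _stdDev = stdDev;
        _rng = rng;
        popFront();
    }

    @property double front() const { return _value; }

    void popFront()
    {
        if (_hasSpare)
        {
            _value = _spare;      // use the second variate of the pair
            _hasSpare = false;
            return;
        }
        immutable u = uniform!"(]"(0.0, 1.0, _rng); // excludes 0: log is safe
        immutable v = uniform!"[)"(0.0, 1.0, _rng);
        immutable radius = sqrt(-2.0 * log(u));
        immutable angle = 2.0 * PI * v;
        _value = _mean + _stdDev * radius * cos(angle);
        _spare = _mean + _stdDev * radius * sin(angle);
        _hasSpare = true;
    }
}

// Averages n samples from a standard normal seeded with the given value.
double sampleMean(uint seed, size_t n)
{
    auto normal = new BoxMullerNormal!Mt19937(0.0, 1.0, Mt19937(seed));
    double total = 0.0;
    foreach (i; 0 .. n)
    {
        total += normal.front;
        normal.popFront();
    }
    return total / n;
}

void main()
{
    import std.stdio : writeln;
    writeln(sampleMean(42, 10_000)); // should be close to 0
}
```

A function-only version would need a static or thread-local variable for the spare value, which is the kind of hidden state the bullet above argues against.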

Known issues:

   * While every bugfix I've made in the course of implementing this
     package has been propagated back to std.random where possible,
     this package is missing some of the more recent improvements to
     std.random by other people (e.g. I think it's missing Chris Cain's
     update to integer-based uniform()).

   * The unittest coverage is overall pretty damn good, but there are
     weak spots in std.random2.distribution and std.random2.device.
     Some of the "unittests" in these cases are no more than basic
     developer sanity checks that print results to console, and need
     to be replaced by well-defined, silent-unless-failed alternatives.

   * Some of the .save functions are implemented with the help of rather
     odd private constructors; it would probably be much better to redo
     these in terms of public this(typeof(this)) constructors.

   * The random devices _really_ need to be better. Consider the current
     versions as placeholders ... :-)
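
For the .save item, the suggested shape might look roughly like the following. TinyLcg is a made-up toy generator (a simple linear congruential step), used only to show save expressed through a public this(typeof(this)) constructor:

```d
import std.range : isForwardRange;

// Hypothetical toy generator; not one of the std.random2 engines.
final class TinyLcg
{
    enum bool empty = false;
    private uint _state;

    this(uint seed) { _state = seed; }

    // Public copying constructor: duplicates the state of another instance.
    this(typeof(this) other) { _state = other._state; }

    @property uint front() const { return _state; }

    // uint arithmetic wraps around, i.e. the step is taken modulo 2^32.
    void popFront() { _state = _state * 1664525u + 1013904223u; }

    // save is now a one-liner on top of the copying constructor.
    @property typeof(this) save() { return new typeof(this)(this); }
}

static assert(isForwardRange!TinyLcg);

void main()
{
    auto a = new TinyLcg(1);
    auto b = a.save;      // independent copy of the state
    a.popFront();
    assert(b.front == 1); // the saved copy did not move
}
```

The same constructor also gives users an explicit way to ask for a by-value copy when they really want one.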

Finally, a note on authorship: since this is still based very substantially on std.random, I've made an effort to check git logs and ensure that authors and copyright records (and dates of contribution) are correct. My general principle here has been that listed authors should only include those who've made a substantial contribution (i.e. whole functions, large numbers of unittests, ...), not just various 1-line tweaks. But if anyone has any objection to any of the names, dates or other credits given, or if anybody would like their name removed (!), just let me know.

I owe a great debt of gratitude to many people here on the forums, and monarch_dodra in particular, for a huge amount of useful discussion, advice and feedback that has made its way into the current code. Thank you all for your time, thoughts, ideas and patience.

Anyway, please feel free to review, destroy and otherwise do fun stuff with this module. I hope that some of you will find it immediately useful, but please note that feedback and advice may result in breaking changes -- this is intended to wind up in Phobos, so it really needs to be perfect when it does so. Let's review it really well and make it happen!


Thanks and best wishes,

    -- Joe
