On Wednesday, 23 November 2016 at 13:41:25 UTC, Andrei Alexandrescu wrote:
On 11/23/2016 12:58 AM, Ilya Yaroshenko wrote:
On Tuesday, 22 November 2016 at 23:55:01 UTC, Andrei Alexandrescu wrote:
On 11/22/16 1:31 AM, Ilya Yaroshenko wrote:
- `opCall` API is used instead of a range interface (similar to C++)

This seems like a gratuitous departure from common D practice. Random number generators are most naturally modeled in D as infinite ranges.
-- Andrei

It is a safe, low-level architecture without performance or API issues.

I don't understand this. Can you please be more specific? I don't see a major issue wrt offering opCall() vs. front/popFront. (empty is always true.)

A range, to be used with std.algorithm and std.range, must be copyable (it is passed by value).

It prevents users from doing stupid things implicitly (like copying RNGs).

An input range can be made noncopyable.

Ditto. A noncopyable input range is useless.

A high-level range interface can be added in the future (it will hold a _pointer_ to an RNG).

Is there a reason to not have that now?

Done. See `RandomRangeAdaptor`:
https://github.com/libmir/mir-random/blob/master/source/random/algorithm.d


In addition, when you need to write algorithms or distributions, opCall is much more convenient than the range API.

Could you please be more specific? On the face of it I'd agree one call is less than two, but I don't see a major drawback here.

The main reason is implementation simplicity. Engines should be simple to create, simple to maintain, and simple to use. opCall is simpler than the range interface because:
1. One declaration instead of four (three range functions plus the latest generated value (optional)); see the sketch below.
2. An input range is useless if it is not copyable.
3. `randomRangeAdaptor` is implemented for Engines and will be done for Distributions too, so the range API is supported better than with std.range (where Engines get copied).
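
To make point 1 concrete, here is a minimal sketch of the opCall style. The engine below is hypothetical and purely illustrative (it uses the well-known SplitMix64 step, not one of mir-random's actual engines):

```d
// Hypothetical engine, illustrative only: the whole generator interface is
// a single opCall that returns the next 64 random bits.
struct SplitMix64Example
{
    ulong state;
    @disable this(this); // engines are not copyable

    ulong opCall()
    {
        ulong z = (state += 0x9E3779B97F4A7C15UL);
        z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9UL;
        z = (z ^ (z >> 27)) * 0x94D049BB133111EBUL;
        return z ^ (z >> 31);
    }
}

void main()
{
    auto gen = SplitMix64Example(123);
    ulong x = gen(); // one declaration, one call; no front/popFront/empty
}
```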

In addition, users would not use the Engine API in 99% of cases: they will just want to call `rand` or `uniform`, or another distribution.

I am sure that almost any library should first have a low-level API that fits its implementation. Additional API levels may be added on top of it.

Is there a large difference between opCall and front/popFront?

Actually I can think of one - the matter of getting things started. Ranges have this awkwardness of starting the iteration: either you fill the current front eagerly in the constructor, or you have some sort of means to detect initialization has not yet been done and do it lazily upon the first use of front. The best strategy would depend on the actual generator, and admittedly would be a bit more of a headache compared to opCall. Was this the motivation?

Simplicity is the main motivation.
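
For comparison, here is a sketch of the range-style start-up issue Andrei describes, with an illustrative LCG step (not code from std.random or mir-random):

```d
// Illustrative range-style engine: front must already hold a value before
// its first use, so the constructor has to advance the state eagerly
// (or the engine must track "not yet initialised" and fill front lazily).
struct RangeStyleExample
{
    ulong state, current_;

    this(ulong seed)
    {
        state = seed;
        popFront(); // eager pre-fill of front
    }

    enum bool empty = false;
    ulong front() const @property { return current_; }

    void popFront()
    {
        // illustrative 64-bit LCG step
        state = state * 6364136223846793005UL + 1442695040888963407UL;
        current_ = state;
    }
}
```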

### Example of an API + implementation bug:

#### Bug: RNGs have min and max parameters (hello, C++), but they are not used when a uniform integer is generated: `uniform!ulong` /
`uniform!ulong(0, 100)`.

#### Solution: In Mir Random, every RNG must generate all 8/16/32/64 bits uniformly. How to do that is the RNG's problem.

Min and max are not parameters, they are bounds provided by each generator. I agree their purpose is unclear. We could require all generators to provide min = 0 and max = UIntType.max without breaking APIs. In that case we only need to renounce LinearCongruentialEngine with c = 0 (see https://github.com/dlang/phobos/blob/master/std/random.d#L258) - in fact that's the main reason for introducing min and max in the first place. All other code stays unchanged, and we can easily deprecate min and max for RNGs.

(I do see min and max used by uniform at https://github.com/dlang/phobos/blob/master/std/random.d#L1281 so I'm not sure I get what you mean, but anyhow the idea that we require RNGs to fill an uint/ulong with all random bits simplifies a lot of matters.)

The current Mir solution is a pair of traits, isURBG and isSURBG. The `S` prefix means `T.max == ReturnType!T.max`, where T is an Engine. So functions use isSURBG now. The min property is not required: we can just subtract the actual min from the returned value.

An adaptor can be added to convert a URBG to a saturated URBG.
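
A naive sketch of how such a pair of traits could look, assuming the argument really is an engine with an opCall and an enum `max` (the actual mir-random definitions may differ):

```d
import std.traits : ReturnType, isUnsigned;

// Uniform Random Bit Generator: opCall returns an unsigned integer.
// (A real trait would also guard against types without opCall or max.)
enum isURBG(T) = isUnsigned!(ReturnType!T);

// Saturated URBG: the engine covers the full range of its return type,
// so every bit of the returned integer is uniformly random and no
// min/max bookkeeping is needed.
enum isSURBG(T) = isURBG!T && T.max == ReturnType!T.max;
```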

I will not file this bug, nor another dozen std.random bugs, because the module should be rewritten anyway and I am working on it. std.random is a collection of bugs from C/C++ libraries extended with D generic idioms. For example, there is no reason for a 64-bit Xorshift: it is 32-bit by design. Furthermore, a 64-bit expansion of a 32-bit algorithm must be proved theoretically before we allow it for end users. 64-bit analogs do exist, but they have different implementations.

One matter that I see is there's precious little difference between mir.random and std.random. Much of the code seems copied, which is an inefficient way to go about things. We shouldn't fork everything if we don't like a bit of it, though admittedly the path toward making changes in std is more difficult. Is your intent to work on mir.random on the side and then submit it as a wholesale replacement of std.random under a different name? In that case you'd have my support, but you'd need to convince me the replacement is necessary. You'd probably have a good case for eliminating xorshift/64, but then we may simply deprecate that outright. You'd possibly have a more difficult time with opCall.

I started with Engines as a basis. The library will be very different from Phobos and _any_ other RNG library in terms of floating-point generation quality. All FP generation I have seen is not saturated (the number of possible unique FP values is very small compared with the ideal, because of IEEE arithmetic). I have not found the idea described by others, so it may become an article in the future.
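
As I read it, the problem is the usual fixed-grid construction of uniform doubles. A small illustration (not mir-random code; the function name is mine):

```d
// The common "divide by 2^53" construction can only produce 2^53 evenly
// spaced values in [0, 1), although IEEE doubles have many more
// representable values, especially near zero.
double gridUniform01(ulong randomBits)
{
    return (randomBits >> 11) * (1.0 / (1UL << 53));
}

void main()
{
    // The smallest nonzero result of this scheme is 2^-53 (~1.1e-16),
    // while nonzero doubles go all the way down to ~5e-324.
    double smallest = gridUniform01(1UL << 11);
    assert(smallest == 1.0 / (1UL << 53));
}
```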

A set of new modern Engines will be added (by Nicholas Wilson, and maybe Joseph). Also, Seb and I will add a set of distributions.

Phobos degrades because
we add a lot of generic specializations and small utilities without
understanding use cases.

This is really difficult to parse. Are you using "degrades" the way it's meant? What is a "generic specialization"? What are examples of "small utilities without understanding use cases"?

Sorry, my English is ... .
It is not clear to me what subset of generic code is nothrow (reduce, for example). The same is true for the BetterC concept: it is hard to predict when an algorithm requires DRuntime to be linked / initialised. It is not clear which modules are imported by a given module.

"small utilities without understanding use cases" -
Numeric code in std.algorithm:
minElement, sum. They should not be in std.algorithm. A user can use `reduce`. Or, if speed is required we need to move to numeric solution suitable for vectorization. And std.algorithm seems to be wrong module for vectorised numeric code.
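
For example, the point seems to be that `reduce` already covers these cases (plain std.algorithm code, nothing Mir-specific):

```d
import std.algorithm.iteration : reduce;

void main()
{
    auto a = [3, 1, 4, 1, 5];
    auto smallest = reduce!((x, y) => x < y ? x : y)(a); // what minElement does
    auto total    = reduce!((x, y) => x + y)(a);         // what sum does (for ints)
    assert(smallest == 1 && total == 14);
}
```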

Phobos really follows a stupid idealistic idea: more generic is better, more API is better, more universal algorithms are better. The problem is that Phobos/DRuntime is a soup where everything (because of its "universality") interacts with everything else.

I do think more generic is better, of course within reason. It would be a tenuous statement that generic designs in Phobos such as ranges, algorithms, and allocators are stupid and idealistic. So I'd be quite interested in hearing more about this. What's that bouillabaisse about?

For example, std.allocator. It is awesome! But I cannot use it in GLAS, because I don't understand whether it will work without linking DRuntime.

So, I copy-pasted and modified your code for AlignedMallocator:
https://github.com/libmir/mir-glas/blob/master/source/glas/internal/memory.d

Ranges and algorithms seem good to me, except that it is not clear when code is nothrow / BetterC. std.math is a problem: we are adding new API without solving existing API problems and C compatibility. std.complex prevents math optimisations (this cannot be solved without compiler hacks), so GLAS migrated to native (old) complex numbers.

I like generics when they make D usage simpler. If someone adds random number generation to a Phobos sorting algorithm, it will make that algorithm useless for BetterC (because it will require linking an RNG). Such issues are not reviewed during the Phobos review process. Linking Phobos / DRuntime is not an option because it has no backward binary compatibility, so packages cannot be distributed as precompiled libraries.

std.traits, std.meta, std.range.primitives, std.ndslice, and part of std.math are the only modules I am using in the Mir libraries.

It is very important to me to have BetterC guarantees across different Phobos versions. Super-generic code where different modules import each other is hard to review.

Best regards,
Ilya
