Hi

On 2/14/22 15:53, Go Kudo wrote:
>> 1) XorShift128+ has a 128 Bit internal state, but takes an integer seed
>> within the constructor. Thus only 64 Bits of seed can be provided.
>
> This is for convenience. Other software that uses XorShift128+, such as
> Chromium (V8), also uses a 64-bit value for the initial seed value.
> I think that 128-bit value seeding with strings is unintuitive and not very
> good for performance.
>
> https://chromium.googlesource.com/v8/v8/+/refs/heads/main/src/base/utils/random-number-generator.h

I don't think performance for seeding the RNG matters. That's an operation you only perform once (or a small number of times). The time for the actual generation of random numbers most certainly dwarfs the time used for seeding.

Regarding "unintuitive": I disagree. I find it unintuitive that there are some RNG sequences that I can't access when providing a seed.

I wouldn't object to a convenience constructor that also accepts an integer, but I believe the default should be a seed that can reach the full state space.
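To make the state-space point concrete, here is a minimal sketch in Python (not the RFC's implementation). It uses one published XorShift128+ parameter set (23, 18, 5). The class and method names are hypothetical, and expanding the integer seed with SplitMix64 is my assumption, not something the RFC or V8 prescribes. The 16-byte constructor can select any of the 2^128 - 1 valid states, while the integer constructor can reach at most 2^64 of them, whatever expansion is used.

```python
import struct

MASK64 = (1 << 64) - 1


def splitmix64(counter):
    """Advance a SplitMix64 counter; return (new_counter, 64-bit output)."""
    counter = (counter + 0x9E3779B97F4A7C15) & MASK64
    z = counter
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return counter, z ^ (z >> 31)


class XorShift128Plus:
    """Hypothetical illustration class, not the API proposed in the RFC."""

    def __init__(self, s0, s1):
        if s0 == 0 and s1 == 0:
            raise ValueError("the all-zero state is not allowed")
        self.s0 = s0
        self.s1 = s1

    @classmethod
    def from_bytes(cls, seed):
        """Full seeding: 16 raw bytes map directly onto the 128-bit state."""
        if len(seed) != 16:
            raise ValueError("seed must be exactly 16 bytes")
        s0, s1 = struct.unpack("<QQ", seed)
        return cls(s0, s1)

    @classmethod
    def from_int(cls, seed):
        """Convenience seeding: 64 bits are expanded to 128 bits (assumed
        SplitMix64 expansion), so at most 2**64 states are reachable."""
        counter, s0 = splitmix64(seed & MASK64)
        _, s1 = splitmix64(counter)
        return cls(s0, s1)

    def next(self):
        """Return the next 64-bit output and advance the state."""
        s1 = self.s0
        s0 = self.s1
        result = (s0 + s1) & MASK64
        self.s0 = s0
        s1 ^= (s1 << 23) & MASK64
        self.s1 = s1 ^ s0 ^ (s1 >> 18) ^ (s0 >> 5)
        return result
```

Both constructors yield a reproducible sequence. The difference is only which starting states a caller can select.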

>> 2) I would adjust the 'Randomizer' to use the 'Secure' generator as a
>> safe default. If absolute performance or a reproducible sequence is
>> required then one can use a custom generator, but the default will be the
>> secure CSPRNG, making it harder to misuse.
>
> Certainly, this may be appropriate. But, in this case, the Randomizer
> generated with the default parameters will not be serializable. Is that
> acceptable?

I consider that acceptable. If I want to serialize the randomizer then I want a reproducible sequence and in that case I should also be forced to explicitly decide what type of reproducible sequence I want. If I don't explicitly decide, then newly generated randomizers might use an entirely different generator if the default changes.

>> 4) The RFC should document the 'NumberGenerator' interface. Specifically
>> I'm interested in the return type of the 'generate' method. Does it return
>> bytes or integers? Is it legal to implement the interface in userland code?
>
> It returns an int. Also, as pointed out in GH, the generated value is
> implicitly treated as the size equivalent of PHP_INT_SIZE (zend_long) on
> the environment. This means that it is not possible to implement the
> userland Mersenne twister (32-bit) in a 64-bit environment.

I think the returned value of the generator should be a string containing raw bytes.

It's very easy to interpret a bytestring as an appropriate integer (e.g. using unpack()), but if some consumer needs a bytestring then turning the integer back into a bytestring without accidentally introducing biases is much harder, because of platform differences and the lack of unsigned integers in userland.
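As a rough illustration of that asymmetry, here is a small Python sketch, where struct.unpack() plays the role of PHP's unpack(). The function names and the sample value are hypothetical. Going from bytes to an integer is a single call with an explicit width and byte order. Going the other way from a platform-sized signed integer silently pads a 32-bit output with four constant bytes, which is exactly the kind of bias a consumer would have to know about and strip.

```python
import struct


def bytes_to_uint64(raw):
    """Interpret 8 raw bytes as one unsigned 64-bit integer (little-endian)."""
    (value,) = struct.unpack("<Q", raw)
    return value


def bytes_to_uint32(raw):
    """Interpret 4 raw bytes as one unsigned 32-bit integer (little-endian)."""
    (value,) = struct.unpack("<I", raw)
    return value


def native_int_to_bytes(value):
    """Naive reverse direction: pack a signed 64-bit integer as 8 bytes,
    the way a consumer might treat a zend_long on a 64-bit platform."""
    return struct.pack("<q", value)


# Hypothetical output of a 32-bit generator, stored in a 64-bit integer.
mt_output = 0xDEADBEEF
naive = native_int_to_bytes(mt_output)

# Only the first four bytes carry generator output; the upper four bytes
# are always zero, so using all eight as "random bytes" would be biased.
random_part, padding = naive[:4], naive[4:]
```

If the generator returned a byte string instead, its length would state how many random bytes it produced, and each consumer could pick the integer width it needs.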

Unfortunately your PR doesn't compile for me, so I can't test:

make: *** No rule to make target 'php-src/ext/standard/lcg.c', needed by 'ext/standard/lcg.lo'. Stop.

Best regards
Tim Düsterhus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php
