Re: [PHP-DEV] Re: [RFC] [Draft] Add Randomizer class (before: Add RNG extension)

2021-06-01 Thread Nikita Popov
On Mon, May 31, 2021 at 4:04 PM Go Kudo  wrote:

> Hi Internals.
>

Hi :) Thanks for your continued work on this RFC. This is a tough one, with
a lot of feedback, some of it contradictory. I'm still going to add my 2c...

I think that this proposal can basically work out in two ways:

1. Pragmatic: Have a Random class that represents both the raw randomness
source and ways to use it.
2. Pure: Separate the randomness source from transformations based on it.

Your current design sits in a weird middle ground between these, where you
can extend the Random class to insert a user-provided randomness source.

I think you should pick between one or the other option. If you pick the
first option, then user-extensibility should simply not be supported at
all. The class should be final, and only support a limit set of
extension-provided RNGs. This is a pragmatic design that covers most
use-cases.

If you pick the second option, then you should consistently separate the
source for both extension-provided RNGs and user-provided ones. I don't
think it makes sense to have extension provided RNGs use a `new
Random(RANDOM_XORSHIFT128PLUS)` interface, while user-provided ones use
`new Random(new SomeRandomSource)`. Going down this route, it should be
`new Random(new XorShift128Plus)`. This is a pure design, which also means
that it's more complicated, and may make simple usages harder (though there
is certainly room for a Random::createDefault() here.)

As a minor note, I don't think we should add something like
random.ignore_generated_size_exceeded. Our general policy is to avoid the
introduction of ini settings that affect runtime behavior where possible.
We should pick one of the behaviors here and stick with it.

Regards,
Nikita

I apologize for the discussion outside the ML. Here's a brief history.
>
> https://github.com/php/php-src/pull/7079
>
> - To test the implementation, I sent a Pull-Request to php-src on GitHub.
> - An excellent point was made, and we ended up having a discussion about it
> on GitHub.
> - I was going to change the current implementation RFC to Under Discussion,
> but decided against it.
>
> Also, sorry for the delay in answering your questions. I am not very good
> at English and it is taking me a long time.
>
> The current implementation and RFC:
>
> - https://github.com/php/php-src/pull/7079
> - https://wiki.php.net/rfc/rng_extension
>
> However, the implementation of this user-defined class is very ugly and we
> intend to improve it as follows.
>
> Before:
> ```php
>
> class Random
> {
> public function __construct(string $algo = RANDOM_XORSHIFT128PLUS, ?int
> $seed = null) {}
> //
> }
>
> class UserDefinedRandom extends Random
> {
> protected function next(): int
> {
> return 1;
>  }
> }
> ```
>
> After:
> ```php
> interface RandomNumberGenerator
> {
> public function generate(): int;
> }
>
>
> final class Random
> {
> public function __construct(string|RandomNumberGenerator $algo =
> RANDOM_XORSHIFT128PLUS, ?int $seed = null) {}
> }
> ```
>
> 2021年5月26日(水) 0:35 Go Kudo :
>
> > Hi, Thanks for the response.
> >
> > The RFC has been revised based on the points you pointed out.
> >
> > https://wiki.php.net/rfc/rng_extension
> >
> > The main changes are as follows:
> >
> > - Class name has been changed from `Randomizer` to `Random` .
> > - Added a static method `getNonBiasedMax()` to get the safe range.
> > - `int()` and `bytes()` have been renamed to `getInt()` and `getBytes()`
> > for avoid future reserved words.
> > - `getInt()` arguments no longer accept null.
> > - `shuffle(array|string $target): array|string` has been separated into
> > `arrayShuffle(array $array)` and `stringShuffle(string $string): string`
> > for more comfortable static-analysis.
> > - fix: php_random_algo struct (included uint64_t -> int64_t)
> >
> > Answer a few questions:
> >
> > > When $seed is null, what is used for the seed value?
> >
> > Depends on the algo's implementation, but basically it is using internal
> > `php_random_int()`.
> > It is similar to `mt_srand()` on PHP 8.1.
> >
> >
> >
> https://github.com/php/php-src/commit/53ee3f7f897f7ee33a4c45210014648043386e13
> >
> > > Why cancelled RNG Extension?
> >
> > As a result of discussions during the draft, the functions became a
> single
> > class and no longer need to be separated. The functionality for random
> > numbers is now included in ext/standard and will conform to this. (e.g.
> > rand.c random.c)
> >
> > > Deprecation
> >
> > Dropped in the last RFC update.
> > These were premature and should not have been included with the RFCs that
> > add new features.
> >
> > If the direction seems generally okay, I'd like to start implementing it
> > to show more details.
> >
> > Regards,
> > Go Kudo
> >
> > 2021年5月23日(日) 5:56 Go Kudo :
> >
> >> Hi, Internals and all participated in the previous discussion.
> >>
> >> RFCs have been cleaned up and the proposal has been substantially
> changed.
> >>
> >> 

[PHP-DEV] Re: [RFC] [Draft] Add Randomizer class (before: Add RNG extension)

2021-05-31 Thread Go Kudo
Hi Internals.

I apologize for the discussion outside the ML. Here's a brief history.

https://github.com/php/php-src/pull/7079

- To test the implementation, I sent a Pull-Request to php-src on GitHub.
- An excellent point was made, and we ended up having a discussion about it
on GitHub.
- I was going to change the current implementation RFC to Under Discussion,
but decided against it.

Also, sorry for the delay in answering your questions. I am not very good
at English and it is taking me a long time.

The current implementation and RFC:

- https://github.com/php/php-src/pull/7079
- https://wiki.php.net/rfc/rng_extension

However, the implementation of this user-defined class is very ugly and we
intend to improve it as follows.

Before:
```php

class Random
{
public function __construct(string $algo = RANDOM_XORSHIFT128PLUS, ?int
$seed = null) {}
//
}

class UserDefinedRandom extends Random
{
protected function next(): int
{
return 1;
 }
}
```

After:
```php
interface RandomNumberGenerator
{
public function generate(): int;
}


final class Random
{
public function __construct(string|RandomNumberGenerator $algo =
RANDOM_XORSHIFT128PLUS, ?int $seed = null) {}
}
```

2021年5月26日(水) 0:35 Go Kudo :

> Hi, Thanks for the response.
>
> The RFC has been revised based on the points you pointed out.
>
> https://wiki.php.net/rfc/rng_extension
>
> The main changes are as follows:
>
> - Class name has been changed from `Randomizer` to `Random` .
> - Added a static method `getNonBiasedMax()` to get the safe range.
> - `int()` and `bytes()` have been renamed to `getInt()` and `getBytes()`
> for avoid future reserved words.
> - `getInt()` arguments no longer accept null.
> - `shuffle(array|string $target): array|string` has been separated into
> `arrayShuffle(array $array)` and `stringShuffle(string $string): string`
> for more comfortable static-analysis.
> - fix: php_random_algo struct (included uint64_t -> int64_t)
>
> Answer a few questions:
>
> > When $seed is null, what is used for the seed value?
>
> Depends on the algo's implementation, but basically it is using internal
> `php_random_int()`.
> It is similar to `mt_srand()` on PHP 8.1.
>
>
> https://github.com/php/php-src/commit/53ee3f7f897f7ee33a4c45210014648043386e13
>
> > Why cancelled RNG Extension?
>
> As a result of discussions during the draft, the functions became a single
> class and no longer need to be separated. The functionality for random
> numbers is now included in ext/standard and will conform to this. (e.g.
> rand.c random.c)
>
> > Deprecation
>
> Dropped in the last RFC update.
> These were premature and should not have been included with the RFCs that
> add new features.
>
> If the direction seems generally okay, I'd like to start implementing it
> to show more details.
>
> Regards,
> Go Kudo
>
> 2021年5月23日(日) 5:56 Go Kudo :
>
>> Hi, Internals and all participated in the previous discussion.
>>
>> RFCs have been cleaned up and the proposal has been substantially changed.
>>
>> https://wiki.php.net/rfc/rng_extension
>>
>> First of all, I apologize for not checking out the implementation of
>> password_hash(), which is a precedent to learn from.
>>
>> I think I've answered all the questions I've been getting on Open Issues.
>> If I have left anything out, please let me know.
>>
>> Regards,
>> Go Kudo
>>
>


[PHP-DEV] Re: [RFC] [Draft] Add Randomizer class (before: Add RNG extension)

2021-05-26 Thread Guilliam Xavier
On Tue, May 25, 2021 at 5:36 PM Go Kudo  wrote:

> Hi, Thanks for the response.
>
> The RFC has been revised based on the points you pointed out.
>
> https://wiki.php.net/rfc/rng_extension
>

Thanks, it looks like you have addressed all previous points (for me at
least). But also introduced a new one ;) with the new `static function
getNonBiasedMax(string $algo): int`...

(Note: I think some questions below could be answered by the list in
general, not only Go Kudo.)

Let's compare these two equivalent functions:

function f1(int $seed): void {
mt_srand($seed);
$a = mt_rand();
$b = mt_rand();
var_dump($a, $b);
}

function f2(int $seed): void {
$random = new Random(RANDOM_MT19937, $seed);
$max = Random::getNonBiasedMax(RANDOM_MT19937);
$a = $random->getInt(0, $max);
$b = $random->getInt(0, $max);
var_dump($a, $b);
}

In particular, note that we did *not* need to write the explicit/long
version of f1:

function f1_explicit(int $seed): void {
mt_srand($seed);
$max = mt_getrandmax();
$a = mt_rand(0, $max);
$b = mt_rand(0, $max);
var_dump($a, $b);
}

But what would happen with the implicit/short version of f2?

function f2_implicit(int $seed): void {
$random = new Random(RANDOM_MT19937, $seed);
$a = $random->getInt();
$b = $random->getInt();
var_dump($a, $b);
}

Would we get "biased" results (by the way, what does that mean exactly)?
like `mt_rand(PHP_INT_MIN, PHP_INT_MAX)`?
Couldn't the default min/max be made "safe"? or maybe "dynamic" depending
on the algo/implementation?

Also, let's consider this:

function g(Random $random): void {
$max = /* ??? */;
$a = $random->getInt(0, $max);
$b = $random->getInt(0, $max);
var_dump($a, $b);
}

Here, how to get the "non-biased max" for this Random instance (unknown
algo)?
Moreover, what would `Random::getNonBiasedMax(RANDOM_USER)` return? I think
we would rather want/need to call e.g.
`FixedNumberForTest::getNonBiasedMax()` (potentially overridden)?
Maybe you could add a (non-static) `function getAlgo(): string`, so we
could at least call `$random::getNonBiasedMax($random->getAlgo())`? (maybe
it could also be more generally useful, possibly along with a `getSeed()`,
akin to `password_get_info(string $hash)`?) or a non-static `function
getNonBiasedMax(): int`, and rename the static one? (or even drop it after
all? how often will we need it without having an instance? and if needed,
is `new Random($algo, 0)` a costly operation?) or some other solution
someone can think of?

Ah that made me think: should some methods better be `final`?

Finally, the current "Open Issues" section should probably renamed to
"Discussion" or even "FAQ" here?

Regards,

-- 
Guilliam Xavier


[PHP-DEV] Re: [RFC] [Draft] Add Randomizer class (before: Add RNG extension)

2021-05-25 Thread Go Kudo
Hi, Thanks for the response.

The RFC has been revised based on the points you pointed out.

https://wiki.php.net/rfc/rng_extension

The main changes are as follows:

- Class name has been changed from `Randomizer` to `Random` .
- Added a static method `getNonBiasedMax()` to get the safe range.
- `int()` and `bytes()` have been renamed to `getInt()` and `getBytes()`
for avoid future reserved words.
- `getInt()` arguments no longer accept null.
- `shuffle(array|string $target): array|string` has been separated into
`arrayShuffle(array $array)` and `stringShuffle(string $string): string`
for more comfortable static-analysis.
- fix: php_random_algo struct (included uint64_t -> int64_t)

Answer a few questions:

> When $seed is null, what is used for the seed value?

Depends on the algo's implementation, but basically it is using internal
`php_random_int()`.
It is similar to `mt_srand()` on PHP 8.1.

https://github.com/php/php-src/commit/53ee3f7f897f7ee33a4c45210014648043386e13

> Why cancelled RNG Extension?

As a result of discussions during the draft, the functions became a single
class and no longer need to be separated. The functionality for random
numbers is now included in ext/standard and will conform to this. (e.g.
rand.c random.c)

> Deprecation

Dropped in the last RFC update.
These were premature and should not have been included with the RFCs that
add new features.

If the direction seems generally okay, I'd like to start implementing it to
show more details.

Regards,
Go Kudo

2021年5月23日(日) 5:56 Go Kudo :

> Hi, Internals and all participated in the previous discussion.
>
> RFCs have been cleaned up and the proposal has been substantially changed.
>
> https://wiki.php.net/rfc/rng_extension
>
> First of all, I apologize for not checking out the implementation of
> password_hash(), which is a precedent to learn from.
>
> I think I've answered all the questions I've been getting on Open Issues.
> If I have left anything out, please let me know.
>
> Regards,
> Go Kudo
>


Re: [PHP-DEV] Re: [RFC] [Draft] Add Randomizer class (before: Add RNG extension)

2021-05-24 Thread Kamil Tekiela
Hi Go Kudo,

This proposal looks much better. Well done.
What would help is if you showed the examples from the introduction section
rewritten using the new API.
I don't think it makes sense to make parameters (?int $min = PHP_INT_MIN,
?int $max = PHP_INT_MAX) nullable.
Regarding the method names, I would be afraid of naming them using PHP
keywords. Maybe they should be called getInt and getBytes?
Why isn't the new class in a new extension? Is there a reason why it should
go into core?
Regarding the name, I think Random or Randomizer is fine, but I would avoid
acronyms.

Regards,
Kamil


[PHP-DEV] Re: [RFC] [Draft] Add Randomizer class (before: Add RNG extension)

2021-05-24 Thread Guilliam Xavier
On Sat, May 22, 2021 at 10:57 PM Go Kudo  wrote:

> Hi, Internals and all participated in the previous discussion.
>
> RFCs have been cleaned up and the proposal has been substantially changed.
>
> https://wiki.php.net/rfc/rng_extension
>
> First of all, I apologize for not checking out the implementation of
> password_hash(), which is a precedent to learn from.
>
> I think I've answered all the questions I've been getting on Open Issues.
> If I have left anything out, please let me know.
>
> Regards,
> Go Kudo
>

Hi,

This really looks like a different proposal indeed (simpler and clearer),
thanks! A few (new) remarks/questions:

- Naming: "Randomizer", but some have suggested just "Random", or "RNG" (or
"PRNG").

- [I was about to say "If typical usage is expected to provide a seed (for
reproducibility) more often than choosing a non-default algorithm, maybe
the [optional] constructor parameters should be in the according order
($seed, $algo)?", but actually I'm not sure of "expected typical usage",
and the current order ($algo, $seed) seems more "logical" (e.g. I think
RANDOMIZER_SECURE will ignore $seed?), and we can use named arguments like
`new Randomizer(seed: 1234)` if needed, so...]

- Does `?int $seed = null` default to `time()` internally? or to something
else? [not sure if important...]

- Does `?int $min = PHP_INT_MIN` default to `PHP_INT_MIN` even if we pass
`null`? But, given that we can use named arguments, do $min/$max really
need to be nullable?

- About `shuffle(array|string $target): array|string`: I think just "value"
would be better than "target", and I'm not sure about the union type
(notably for static analysis)... Ideally it should be distinct `(array
$value): array` and `(string $value): string`, but that probably requires
two distinct names?

- For internal implementation, isn't there a signed/unsigned "mismatch"
between PHP `function next(): int` and C `uint64_t (*next)(void)` return
types?

Regards,

-- 
Guilliam Xavier