Re: [PHP-DEV] [RFC] Enumerations, Round 2

2021-01-03 Thread Larry Garfield
On Sun, Jan 3, 2021, at 2:25 PM, Marc wrote:

> >> You already provide a lookup mechanism with `MyEnum::from()` - I don't
> >> see a real use-case for providing a pre-built map. The main use case I see
> >> is to list all possible enum values but this doesn't require a map and a
> >> zero-indexed-array would also be more performant with packed arrays
> >> (correct me if I'm wrong).
> > I do somewhat agree with you there. We're essentially returning
> > `Array|Map` which feels
> > inconsistent. When you're calling cases() you're most likely going to
> > loop over it at which point $case->value is available at your
> > disposal.
> 
> Would you consider making `cases()` return a simple list in all cases 
> instead of differentiating between UnitEnum and ScalarEnum, given that 
> people mostly just want to loop over cases and a lookup is already 
> available with ScalarEnum::from(), to provide a cleaner interface?
> 
> Marc


Ilija and I talked this one over a bit more, and decided that you're right.  
Between ->value and from() we couldn't come up with a use case that would need 
the assoc array that wouldn't work just as well with ->value, and it makes the 
method type definition simpler.

I've updated the RFC to have cases() always return a packed array; Ilija will 
update the PR soon.
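
For anyone following along, here is a minimal sketch of what the change means for callers. It assumes the enum syntax as it later shipped in PHP 8.1; the `Suit` enum and its cases are made up for illustration and do not come from the RFC text.

```php
<?php
// Hypothetical enum for illustration; syntax as shipped in PHP 8.1.
enum Suit: string
{
    case Hearts = 'H';
    case Spades = 'S';
}

// cases() is a plain list: keys are 0, 1, ... in declaration order.
foreach (Suit::cases() as $index => $case) {
    echo $index, ' => ', $case->name, ' (', $case->value, ")\n";
}

// A value-keyed map, if one is ever needed, is a short loop away.
$byValue = [];
foreach (Suit::cases() as $case) {
    $byValue[$case->value] = $case;
}

// Lookup by value does not need the map at all.
$spades = Suit::from('S');
```

The map built in the loop is the structure cases() used to return for scalar enums; with ->value and from() available it is rarely needed.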

Thanks for your feedback!

--Larry Garfield

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] Re: [RFC] Add is_list(mixed $value): bool to check for list-like arrays

2021-01-03 Thread Larry Garfield
On Sun, Jan 3, 2021, at 12:17 PM, tyson andre wrote:
> Hi internals,
> 
> > I've created the RFC https://wiki.php.net/rfc/is_list
> > 
> > This adds a new function `is_list(mixed $value): bool` that will return true
> > if the type of $value is array and the array keys are `0 .. 
> > count($value)-1` in that order.
> >
> > It's well-known that PHP's `array` data type is rare among programming 
> > languages
> > in that it supports both integer and string keys
> > and that iteration order is important and guaranteed.
> > (it is used for overlapping use cases - in many other languages, both 
> > vectors/lists/arrays and hash maps are available)
> > 
> > While it is possible to efficiently check that something is an array,
> > that array may still have string keys, not start from 0, have missing array 
> > offsets,
> > or contain out of order keys.
> > 
> > It can be useful to verify that the assumption that array keys are 
> > consecutive integers is correct,
> > both for data that is being passed into a module or for validating data 
> > before returning it from a module.
> > However, because it's currently inconvenient to do that, this has rarely 
> > been done in my experience.
> > 
> > In performance-sensitive serializers or data encoders, it may also be 
> > useful to have an efficient check to distinguish lists from associative 
> > arrays.
> > For example, json_encode does this when deciding to serialize a value as 
> > [0, 1, 2] instead of {"0":0,"2":1,"1":1}
> > for arrays depending on the key orders.
> > 
> > Prior email threads/PRs have had others indicate interest in the ability to 
> > efficiently check
> > if a PHP `array` has sequential ordered keys starting from 0
> > 
> > https://externals.io/message/109760 “Any interest in a list type?”
> > https://externals.io/message/111744 “Request for couple memory optimized 
> > array improvements”
> > Implementation: https://github.com/php/php-src/pull/6070 (some discussion 
> > is in the linked PR it was based on)
> 
> Due to concerns about naming causing confusion with theoretical 
> potential future changes to the language,
> I've updated https://wiki.php.net/rfc/is_list to use the name 
> `is_array_and_list(mixed $value): bool` instead.
> (e.g. what if php used the reserved word `list` to add an actual list 
> type in the future, and is_list() returned false for that.)
> 
> I plan to start voting on the RFC in a few days.

Possible alternative that's less clumsy: is_packed_array?

--Larry Garfield




Re: [PHP-DEV] [RFC] Bundling ext/simdjson into core

2021-01-03 Thread Jakub Zelenka
Hi,

On Thu, Dec 31, 2020 at 1:30 AM Máté Kocsis wrote:

> Hi Remi and Jakub,
>
>> I agree it's too early as the library is young and won't be available in
>> many distros. The PECL path is better in this case IMO as it will allow
>> some time.
>
> In my opinion, this is a case where making an
> exception is worth considering.
>
>
The PECL path doesn't mean that the extension won't be used. From the user's
point of view, it doesn't really matter that much, as distro packages are
available even for PECL extensions and it can easily be added to a Docker
image as well. The advantage is that it will give some time for the library
to become available in distros and possibly more stable. I think it would
actually help with stabilizing the API and introducing new features quickly.
It means all the mentioned features could be provided to users in the
extension release cycle instead of waiting a year for a new PHP version.


> Should the simdjson library be written in C,
> I'd propose to add the new API + parser to
> ext/json directly, since ext/simdjson is just a
> very small wrapper around the parser, and
> not a complex piece of code in itself (compared to other parts of php-src).
>
>
I think this wouldn't really be an option even if it was in C, because
ext/json is enabled by default, so you couldn't have an external dependency
on such a new library. Otherwise it would mean that you couldn't install PHP
if that library is not available. Unless, of course, the library is bundled,
but that is not a good idea for maintenance.


> Also, I think the performance benefit of using
> the simdjson parser is so major that it would
> be a pity if people had to wait for years until
> the functionality becomes generally available
> as a core extension. As json_encode() and
> json_decode() are very easy to use, my guess
> is that a 3rd party JSON-related extension
> would never get an adoption large enough,
> because only those people would install it
> who have really reached the limitations of ext/json.
>
>
But this proposal is not about changing json_decode. It introduces a new
API that can be introduced in the same way in the PECL extension. As I said
above, having that in PECL doesn't mean that it's not available and people
have to wait for it.

It would be great to actually see what the performance benefit is in real
applications. The benchmarks rely on repeated calls, which is not always how
parsing happens in an application (e.g. due to the processor cache). Also,
it might not be such a huge perf increase for most apps, as the actual
parsing is not usually the app bottleneck. That said, I think it will bring
some considerable improvements anyway, especially in apps doing lots of
parsing, but it would be great to see how much it is in reality.


> By the way, it has just come to my mind that
> our company is also affected by these
> limitations. Sometimes we have to parse
> very large JSON documents, and in some
> cases these can end up being truncated.
> Fortunately we only need a specific part of
> the data, so someone wrote a partial "parser"
> (this is euphemism) tailored for the schema
> in question. Rather than having to use
> custom hackery, it would be so much better
> if PHP would offer partial parsing out of the
> box, like what the proposed
> JsonParser::getKeyValue() does.
>

As you mentioned, this could be possible in the ondemand API, which looks
really useful indeed. There are more things that are pretty useful, like
JSON Pointer, better error reporting and UTF-8 validation, that could
potentially also be re-used in the encoder. I think it would be great to
have at least some of the features in the extension before it gets to the
core. Especially thinking about the error reporting, which should no longer
depend on global state.
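
To make the point about global state concrete, here is a small sketch that uses only the existing ext/json (not simdjson). The helper name `decode_or_error` is made up for illustration; `JSON_THROW_ON_ERROR` and `JsonException` are the existing ext/json facilities.

```php
<?php
// Hypothetical helper: decode without consulting json_last_error().
function decode_or_error(string $json): array
{
    try {
        $value = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
        return ['value' => $value, 'error' => null];
    } catch (JsonException $e) {
        return ['value' => null, 'error' => $e->getMessage()];
    }
}

$truncated = '{"a": 1';

// Classic style: the failure is only visible through global state.
$legacy = json_decode($truncated);
$legacyFailed = $legacy === null && json_last_error() !== JSON_ERROR_NONE;

// Exception style: the error travels with the call that produced it.
$result = decode_or_error($truncated);
```

An instance-based parser could go one step further and keep the last error on the parser object itself, which is roughly what the non-static API suggestion is about.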

One note about the proposed API. As it's not part of ext/json, it
shouldn't be called JsonParser but rather SimdJsonParser to reflect that
it's part of the simdjson extension. That's the convention used for other
exts and it's also less confusing for users, because that class won't be
available for many users initially - at least until the library is
available or the extension is installed / enabled. The methods also
shouldn't all be static; rather, an instance should be provided that would
allow getting errors or using the ondemand mode.


> That said, the cost-benefit ratio of having
> simdjson in core seems advantageous for me.
>
>> Was thinking that it would be good to consider some kind of pluggable
>> decoder where another extension could register a parsing callback.
>> Something similar to what we have for parser but instead for the whole
>> decoding. That would allow to still use current parser in json_decode but
>> if simdjson is available / configured in ini, then it would be used instead
>> and would be just faster. Not sure if all options are supported though - for
>> example don't see any note about UTF8 substitution
>> (JSON_INVALID_UTF8_SUBSTITUTE).
>
> This is a very interesting 

Re: [PHP-DEV] [RFC] Configurable callback to dump results of expressions in `php -a`

2021-01-03 Thread tyson andre
Hi Rowan Tommins,

> > - The benefit is that dumping the result of expressions improves the 
> > default experience.
> >    psysh wouldn't be installed by default when a new developer is learning 
> >php through the php manual, or when sshed into a remote server.
> 
> It doesn't feel to me that you've really answered Nikita's question: if 
> all the code using these hooks is going to be distributed as userland 
> code anyway, then they're not going to improve the default experience.

I was saying that I'd planned to propose defaults if this passed.
__debugInfo() is a slight improvement, but the human-readable representation of 
an object isn't always the same thing as the debug representation of the object.
A human-readable representation might be `Point(x: 1, y: 2)`, where var_dump or 
var_export(`__set_state`)
is much longer, and I don't believe var_dump is a one size fits all solution 
for both simple and recursive data structures,
especially since `__debugInfo` predates the repl.

```
php > class Point { public function __construct(public int $x, public int $y) {} }
php > var_export(new Point(1, 2));
Point::__set_state(array(
   'x' => 1,
   'y' => 2,
))
```
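
For comparison, a rough sketch of how a shorter rendering could be produced in userland. The function name `short_repr` is hypothetical and not part of the RFC; it only looks at properties visible from the calling scope and does not handle nesting or recursion.

```php
<?php
class Point
{
    public function __construct(public int $x, public int $y) {}
}

// Hypothetical helper: render an object as ClassName(prop: value, ...).
function short_repr(object $obj): string
{
    $parts = [];
    // get_object_vars() only returns properties accessible from this scope.
    foreach (get_object_vars($obj) as $name => $value) {
        $parts[] = $name . ': ' . var_export($value, true);
    }
    return get_class($obj) . '(' . implode(', ', $parts) . ')';
}

echo short_repr(new Point(1, 2)), "\n"; // Point(x: 1, y: 2)
```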

> It feels like we need to go in one of two directions:
> a) Build a full-featured official REPL with all of these improvements 
> completely implemented out of the box. Limited extension hooks might 
> still be desirable to build custom versions for frameworks etc, but they 
> could be more targeted - for custom input, it could be "register 
> meta-command"; for custom output, we already have __debugInfo() at the 
> class level.

I'd be happy as long as we made progress on improving the interactive shell.

Psysh is 2.4MB as a compiled phar release and larger if distributed with 
library/application releases (e.g. on remote servers).
Default extension hooks would likely be much smaller.

> b) Expose the magic behaviour needed for something like PsySh to do 
> everything `php -a` already can, and leave the rest to userland. So far, 
> the only mentioned requirement is a special form of eval() that swallows 
> fatal errors.

That may or may not be possible to do through creating a new `unsafe_eval` PECL 
(or only exposing it for interactive sessions) - I'd have to check.
Something like `unsafe_eval(string $code, array &$variables): mixed result` (or 
throw UnsafeFatalError)
The usual caveats about not using it in production would apply - the php 
compiler treats eval differently in that it has access to the caller's scope.
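
As a rough illustration of the gap, here is what plain eval() can already recover from in userland. The wrapper name `try_eval` is made up; the `unsafe_eval` mentioned above does not exist and is not modelled here.

```php
<?php
// Hypothetical wrapper around eval(): catches anything that is a Throwable.
function try_eval(string $code): array
{
    try {
        return ['ok' => true, 'result' => eval($code)];
    } catch (Throwable $e) {
        return ['ok' => false, 'error' => get_class($e) . ': ' . $e->getMessage()];
    }
}

$good = try_eval('return 1 + 2;');
$bad  = try_eval('return 1 +;'); // syntax error surfaces as ParseError

// Errors that are not Throwable (e.g. redeclaring an existing function)
// still terminate the process; that is the part a REPL needs help with.
```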

https://github.com/bobthecow/psysh/blob/master/src/ExecutionLoopClosure.php 
uses eval(),
but there's actually a lot of heuristics to avoid calling eval() on code with 
common known fatal errors.
(it uses the readline/libedit PHP module if it's available, but doesn't load 
readline_cli.c)

ext/readline/readline_cli.c overrides `EG(bailout)` with the zend_try macros to 
recover from fatal errors:

```c
zend_try {
    zend_eval_stringl(code, prompt_end - prompt_spec - 1, NULL, "php prompt code");
} zend_end_try();
```

> My feeling is that the current mood of the community favours (b) rather 
> than (a); the most obvious example is that PHP used to bundle a PEAR 
> executable, but Composer remains an entirely external project. Is there 
> a reason not to aim for the same "de facto standard" for a REPL?

Even if we do start distributing a better alternative shell with php or 
endorsing an alternative in docs,
it's still possible to incrementally improve `php -a`.

It seems like a significant missing feature to omit printing expression results 
from `php -a`;
people who do use `php -a` may wish to continue using
that feature set (e.g. `#setting_name=value` to set/dump ini variables) but 
still benefit from minor improvements such as this RFC.
https://github.com/bobthecow/psysh/issues/462 mentions psysh isn't a drop-in 
replacement.

E.g. on shared hosting or when debugging on a remote host, it may be 
inconvenient to download psysh and faster to use `php -a`

https://www.php.net/manual/en/features.commandline.interactive.php only 
mentions `php -a`.
Perhaps it should mention external userland shells with a note that they're 
developed independently
if the REPL rarely receives updates in favor of other functionality.

- Someone learning from the php.net manual or a tutorial with minimal 
dependencies wouldn't install psysh right now.

Aside: the php manual doesn't mention composer except in a few PECL extension 
docs,
but people seem to figure out how to use composer out of necessity and library 
installation docs.
Interactive shells wouldn't be as advertised

Cheers,
-Tyson




Re: [PHP-DEV] [RFC] Enumerations, Round 2

2021-01-03 Thread Marc

Hi Ilija,

On 03.01.21 12:54, Ilija Tovilo wrote:

> Hi Marc
>
>> I don't have a really good use-case for float values. It just seems
>> weird to me that a ScalarEnum doesn't support all scalars.
>>
>> Using the enum value as array key for `cases()` works with your current
>> proposal but if we later want to allow floats, bool whatever then we got
>> a foot gun.
>
> The main reason is that we're using a hashmap internally in from() to
> find the given case you're looking for. This is the same hashmap PHP
> arrays are based on which only supports ints/strings as keys. If we
> were to allow any scalar as a value, looking up a case by value would
> become a O(n) operation.
>
> We could do something terrible like serialize the key before storing
> it in the hashmap to allow arbitrary key types. But that will require
> serializing the value on each invocation of from() which will
> unnecessarily slow down the 95% most common use cases (int/string) to
> support the exception. Note though that it's always easier to extend
> than to remove. By not offering this feature we're erring on the side
> of caution.
>
> That being said, I can see how ScalarEnum is a misleading name. We've
> been thinking about a better name and only had some ideas we weren't
> fully satisfied with. RawEnum, ValueEnum and ConvertibleEnum were some
> of these ideas. Let us know if you have a better suggestion.


That's reasonable and makes sense at this point. Only supporting string 
and int is also fine (I don't have a personal use case for other types).


So using the same HT implementation as arrays internally totally makes 
sense, but this is an implementation detail that is not visible from the 
outside, and we shouldn't block ourselves for the future now, as nobody 
knows what use cases may come up. At least if we can avoid it.
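
To illustrate the lookup being discussed, here is a userland model of a value-to-case table. `CaseTable` is a made-up name, cases are represented by their names as plain strings, and the real lookup lives in the engine, so treat this purely as a sketch of the int/string key constraint.

```php
<?php
// Hypothetical userland model; not how the engine implements from().
final class CaseTable
{
    /** @var array<int|string, string> value => case name */
    private array $byValue = [];

    /** @param array<string, mixed> $cases case name => value */
    public function __construct(array $cases)
    {
        foreach ($cases as $name => $value) {
            if (!is_int($value) && !is_string($value)) {
                // PHP array keys are int|string; other scalars would be coerced.
                throw new InvalidArgumentException("Unsupported value type for case $name");
            }
            $this->byValue[$value] = $name;
        }
    }

    // One hash lookup, independent of the number of cases.
    public function from(int|string $value): string
    {
        if (!array_key_exists($value, $this->byValue)) {
            throw new ValueError('No case for value ' . var_export($value, true));
        }
        return $this->byValue[$value];
    }
}

$table = new CaseTable(['Hearts' => 'H', 'Spades' => 'S']);
echo $table->from('S'), "\n"; // Spades
```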




>> You already provide a lookup mechanism with `MyEnum::from()` - I don't
>> see a real use-case for providing a pre-built map. The main use case I see
>> is to list all possible enum values but this doesn't require a map and a
>> zero-indexed-array would also be more performant with packed arrays
>> (correct me if I'm wrong).
>
> I do somewhat agree with you there. We're essentially returning
> `Array|Map` which feels
> inconsistent. When you're calling cases() you're most likely going to
> loop over it at which point $case->value is available at your
> disposal.


Would you consider making `cases()` return a simple list in all cases 
instead of differentiating between UnitEnum and ScalarEnum, given that 
people mostly just want to loop over cases and a lookup is already 
available with ScalarEnum::from(), to provide a cleaner interface?


Marc


> Ilija





Re: [PHP-DEV] [RFC] Configurable callback to dump results of expressions in `php -a`

2021-01-03 Thread Rowan Tommins

On 03/01/2021 16:11, tyson andre wrote:

> - The benefit is that dumping the result of expressions improves the default 
> experience.
>    psysh wouldn't be installed by default when a new developer is learning php 
> through the php manual, or when sshed into a remote server.



It doesn't feel to me that you've really answered Nikita's question: if 
all the code using these hooks is going to be distributed as userland 
code anyway, then they're not going to improve the default experience.


It feels like we need to go in one of two directions:

a) Build a full-featured official REPL with all of these improvements 
completely implemented out of the box. Limited extension hooks might 
still be desirable to build custom versions for frameworks etc, but they 
could be more targeted - for custom input, it could be "register 
meta-command"; for custom output, we already have __debugInfo() at the 
class level.


b) Expose the magic behaviour needed for something like PsySh to do 
everything `php -a` already can, and leave the rest to userland. So far, 
the only mentioned requirement is a special form of eval() that swallows 
fatal errors.


My feeling is that the current mood of the community favours (b) rather 
than (a); the most obvious example is that PHP used to bundle a PEAR 
executable, but Composer remains an entirely external project. Is there 
a reason not to aim for the same "de facto standard" for a REPL?


Regards,

--
Rowan Tommins
[IMSoP]




[PHP-DEV] Re: [RFC] Add is_list(mixed $value): bool to check for list-like arrays

2021-01-03 Thread tyson andre
Hi internals,

> I've created the RFC https://wiki.php.net/rfc/is_list
> 
> This adds a new function `is_list(mixed $value): bool` that will return true
> if the type of $value is array and the array keys are `0 .. count($value)-1` 
> in that order.
>
> It's well-known that PHP's `array` data type is rare among programming 
> languages
> in that it supports both integer and string keys
> and that iteration order is important and guaranteed.
> (it is used for overlapping use cases - in many other languages, both 
> vectors/lists/arrays and hash maps are available)
> 
> While it is possible to efficiently check that something is an array,
> that array may still have string keys, not start from 0, have missing array 
> offsets,
> or contain out of order keys.
> 
> It can be useful to verify that the assumption that array keys are 
> consecutive integers is correct,
> both for data that is being passed into a module or for validating data 
> before returning it from a module.
> However, because it's currently inconvenient to do that, this has rarely been 
> done in my experience.
> 
> In performance-sensitive serializers or data encoders, it may also be useful 
> to have an efficient check to distinguish lists from associative arrays.
> For example, json_encode does this when deciding to serialize a value as 
> [0, 1, 2] instead of {"0":0,"2":1,"1":1}
> for arrays depending on the key orders.
> 
> Prior email threads/PRs have had others indicate interest in the ability to 
> efficiently check
> if a PHP `array` has sequential ordered keys starting from 0
> 
> https://externals.io/message/109760 “Any interest in a list type?”
> https://externals.io/message/111744 “Request for couple memory optimized 
> array improvements”
> Implementation: https://github.com/php/php-src/pull/6070 (some discussion is 
> in the linked PR it was based on)

Due to concerns about naming causing confusion with theoretical potential 
future changes to the language,
I've updated https://wiki.php.net/rfc/is_list to use the name 
`is_array_and_list(mixed $value): bool` instead.
(e.g. what if php used the reserved word `list` to add an actual list type in 
the future, and is_list() returned false for that.)
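
For reference, a userland sketch of the check described above. The name `is_list_like` is made up so that it does not clash with whatever name the RFC ends up with, and this loop is not the proposed native implementation.

```php
<?php
// Hypothetical userland equivalent of the proposed check.
function is_list_like(mixed $value): bool
{
    if (!is_array($value)) {
        return false;
    }
    $expected = 0;
    foreach ($value as $key => $_) {
        // Keys must be exactly 0, 1, 2, ... in iteration order.
        if ($key !== $expected) {
            return false;
        }
        $expected++;
    }
    return true;
}

// json_encode() makes the same distinction when choosing the output form.
echo json_encode([0, 1, 2]), "\n";                // [0,1,2]
echo json_encode([0 => 0, 2 => 1, 1 => 1]), "\n"; // {"0":0,"2":1,"1":1}
```

The userland loop is O(n) in every case, which is part of why a native check is attractive.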

I plan to start voting on the RFC in a few days.

Thanks,
- Tyson



Re: [PHP-DEV] Does PHP ever stack allocate?

2021-01-03 Thread Sara Golemon
On Sun, Jan 3, 2021 at 10:37 AM Olle Härstedt wrote:
> Thanks Sara! I realize I should have been more precise: Can PHP
> allocate non-reference counted memory that automatically is freed when
> leaving scope, similar to what Go does with escape analysis?
>

It seems like you're conflating userspace variables and their underlying
data storage mechanisms and while the two are related and have naturally
similar semantics, they're not really the same thing and trying to
generically talk about both at the same time is only going to add to
confusion.

If you're talking about userspace variables... Yeah.  They're scope bound
and destructed/freed on scope exit.  Nearly every language does some form
of this because to not do so would be leaky and awful.
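
A small sketch of the userspace side of this, using a destructor to make the moment of cleanup visible. The `Tracer` class and `work()` function are made up for illustration.

```php
<?php
// Hypothetical class that records when it is created and destroyed.
final class Tracer
{
    public static array $log = [];

    public function __construct(private string $name)
    {
        self::$log[] = "create {$this->name}";
    }

    public function __destruct()
    {
        self::$log[] = "destroy {$this->name}";
    }
}

function work(): void
{
    $t = new Tracer('local');
    Tracer::$log[] = 'inside work';
    // $t holds the only reference, so it is destructed when work() returns.
}

work();
Tracer::$log[] = 'after work';

print_r(Tracer::$log);
```

The object payload itself still lives on the heap; only the variable slot sits in the call frame.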

If you're talking about internal memory allocators, then the idea of
"automatically freed" is a red herring.  It's an illusion created by some
languages to cover up the quite explicit mechanisms in place to handle data
lifetimes.  PHP does its form of this as well using a combination of stack
and heap allocators as appropriate.

-Sara


Re: [PHP-DEV] Does PHP ever stack allocate?

2021-01-03 Thread tyson andre
Hi Olle,

> Thanks Sara! I realize I should have been more precise: Can PHP
> allocate non-reference counted memory that automatically is freed when
> leaving scope, similar to what Go does with escape analysis?
>
> Article describing the Go mechanism:
> https://segment.com/blog/allocation-efficiency-in-high-performance-go-services/

Could you give some concrete examples of what type of code you're talking about?
As Sara Golemon said, scalars (null, bool, int, float) are allocated on a php 
call frame,
and the call frames go on a stack. That stack is separate from the C stack, but 
still a stack.

The call frame is "freed" when leaving scope - i.e. that part of the stack will 
be reused on subsequent calls.

> A single PHP call frame holds a block of storage space for (among other
> things) all* local variables.  This can be thought of analogously to "the
> stack" as it's used by native applications.  Basic scalars (null, bool,
> int, float) sit in this space with no additional pointers to anywhere.
> Non-scalars use pointers to elsewhere in the heap to store the actual
> payload.  This isn't unique to PHP, as these structures have runtime
> determined size and thus can't** be stack allocated.

https://nikic.github.io/2017/04/14/PHP-7-Virtual-machine.html may help if you 
want to learn more about what the PHP VM currently does

> So what’s the difference between TMP and VAR? Not much.
> The distinction was inherited from PHP 5, where TMPs were VM stack allocated,
> while VARs were heap allocated. In PHP 7 all variables are stack allocated.
> As such, nowadays the main difference between TMPs and VARs is that only the 
> latter are allowed to contain REFERENCEs
> (this allows us to elide DEREFs on TMPs). Furthermore VARs may hold two types 
> of special values,
> namely class entries and INDIRECT values. The latter are used to handle 
> non-trivial assignments.

-Tyson



Re: [PHP-DEV] Analysis of property visibility, immutability, and cloning proposals

2021-01-03 Thread Olle Härstedt
2021-01-03 16:55 GMT, Larry Garfield:
> On Sun, Jan 3, 2021, at 8:28 AM, Olle Härstedt wrote:
>
>> >> I like that you connect higher level design patterns with language
>> >> design. This is the way to go, IMO. Personally, I'd prefer support for
>> >> the Psalm notation `@psalm-readonly`, which is the same as your
>> >> initonly. Clone-with makes sense too, as this construct is already
>> >> supported in multiple languages. The exact notation doesn't matter
>> >> that much - my personal choice is OCaml {record with x = 10} over JS
>> >> spread operator, but OCaml is pretty "wordy" in notation in contrast
>> >> to the C tradition that PHP is part of.
>> >>
>> >> Reintroducing "objects that pass by value" is a hard pass from me. The
>> >> way forward is immutability and constrained mutability (ownership,
>> >> escape analysis, etc). Psalm also supports array shapes - maybe this
>> >> can be investigated as an alternative? Since PHP has no tuples.
>> >>
>> >> I'm not convinced the added complexity of asymmetric visibility is
>> >> powerful enough to motivate its existence. Feel free to prove me
>> >> wrong. :) My choice here would be namespace "internal" (also supported
>> >> by Psalm already), but this requires implementation of namespace
>> >> visibility, a PR that was abandoned.
>> >>
>> >> And also, happy new year!
>> >
>> > Happy New Year!
>> >
>> > I agree that "objects, but passing by value" would not be the right
>> > solution. I used to think that would be a good part of the solution,
>> > but
>> > eventually concluded that it would introduce more complexity, not less.
>> > Eventually, everything people wanted to do with objects they'd want to
>> > do
>> > with "Records" (for lack of a better term), and if they pass by value
>> > but
>> > are still mutable then you have a weird situation where sometimes
>> > changes
>> > propagate and some don't (depending on if you have a record or object).
>> > Making it easier to use objects in a value-esque way will get us closer
>> > to
>> > the desired end state.
>> >
>> > I think the tldr of my post is this: A single "immutable" flag
>> > (whatever
>> > it's called) on a class or property would require having lots of holes
>> > poked
>> > in it in order to make it useful in practice (mostly what "initonly"
>> > would
>> > do), but those holes would introduce other holes we don't want (cloning
>> > an
>> > object from the outside when you shouldn't).
>>
>> A new language feature needs to be both simple and powerful - it's not
>> enough to be only powerful. A second problem I see is how asymmetric
>> visibility would affect the readability of a class, putting extra
>> strain in understanding it. Thirdly, how does PHP differ from FP
>> languages like OCaml and Haskell in this regard, neither of which uses
>> visibility in this way? What's acceptable in those languages that
>> would be unacceptable in PHP?
>>
>> Olle
>
> I'll disagree slightly.  A language feature should introduce more power than
> it does complexity.  Not everything *can* be made absolutely simple, but the
> power it offers is worth it.  I'd say it should minimize introduced
> complexity, relative to the power offered.  Complexity ideally is super low,
> but it's never zero simply by virtue of being "one more thing" that
> developers need to know how to read.
>
> So in this case, we need to compare the power/complexity of asymmetric
> visibility vs the power/complexity of "immutable... except in these
> situations."  I would argue that asymmetric visibility is more
> self-documenting, because it states explicitly what those situations are.
>
> The other point is that, as noted, "initonly" creates a gap if you have
> properties that are inter-dependent.  Those then cannot be made public-read,
> because that would also mean public-clone-with, and thus allow callers to
> violate property relationships.  Asymmetric visibility does not have that
> problem.

Can you perhaps be a bit more clear on why initonly/readonly would be
a deal breaker? Seems to me like readonly would cover 80% of
use-cases? Which is to make data-value objects humane (and fast, since
you don't need getters anymore) to work with. Seems like you're
focusing too much on an edge case here. Maybe we should list the
possible use-cases? Or at least the main target use-case.

If an object has invariants that need to hold, just throw an exception
in __clone to force use with withX() instead? Or, as you suggested,
improve __clone by giving it arguments?
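
A minimal sketch of that first option, in today's PHP 8.0 syntax. The `Range` class and its min/max invariant are made up for illustration; no proposed keyword (initonly, clone-with) is used.

```php
<?php
// Hypothetical value object with inter-dependent properties.
final class Range
{
    public function __construct(
        private int $min,
        private int $max,
    ) {
        if ($min > $max) {
            throw new InvalidArgumentException('min must not exceed max');
        }
    }

    public function min(): int { return $this->min; }
    public function max(): int { return $this->max; }

    // Block plain cloning so callers must go through the with-methods.
    public function __clone()
    {
        throw new LogicException('Use withMax() instead of clone');
    }

    // Builds a new instance via the constructor, so the invariant is re-checked.
    public function withMax(int $max): self
    {
        return new self($this->min, $max);
    }
}

$range = new Range(1, 5);
$wider = $range->withMax(10);
echo $range->max(), ' ', $wider->max(), "\n"; // 5 10
```

The cost is one with-method per property (or group of properties), which is the boilerplate the other proposals try to remove.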

Olle




Re: [PHP-DEV] Analysis of property visibility, immutability, and cloning proposals

2021-01-03 Thread Larry Garfield
On Sun, Jan 3, 2021, at 8:28 AM, Olle Härstedt wrote:

> >> I like that you connect higher level design patterns with language
> >> design. This is the way to go, IMO. Personally, I'd prefer support for
> >> the Psalm notation `@psalm-readonly`, which is the same as your
> >> initonly. Clone-with makes sense too, as this construct is already
> >> supported in multiple languages. The exact notation doesn't matter
> >> that much - my personal choice is OCaml {record with x = 10} over JS
> >> spread operator, but OCaml is pretty "wordy" in notation in contrast
> >> to the C tradition that PHP is part of.
> >>
> >> Reintroducing "objects that pass by value" is a hard pass from me. The
> >> way forward is immutability and constrained mutability (ownership,
> >> escape analysis, etc). Psalm also supports array shapes - maybe this
> >> can be investigated as an alternative? Since PHP has no tuples.
> >>
> >> I'm not convinced the added complexity of asymmetric visibility is
> >> powerful enough to motivate its existence. Feel free to prove me
> >> wrong. :) My choice here would be namespace "internal" (also supported
> >> by Psalm already), but this requires implementation of namespace
> >> visibility, a PR that was abandoned.
> >>
> >> And also, happy new year!
> >
> > Happy New Year!
> >
> > I agree that "objects, but passing by value" would not be the right
> > solution. I used to think that would be a good part of the solution, but
> > eventually concluded that it would introduce more complexity, not less.
> > Eventually, everything people wanted to do with objects they'd want to do
> > with "Records" (for lack of a better term), and if they pass by value but
> > are still mutable then you have a weird situation where sometimes changes
> > propagate and some don't (depending on if you have a record or object).
> > Making it easier to use objects in a value-esque way will get us closer to
> > the desired end state.
> >
> > I think the tldr of my post is this: A single "immutable" flag (whatever
> > it's called) on a class or property would require having lots of holes poked
> > in it in order to make it useful in practice (mostly what "initonly" would
> > do), but those holes would introduce other holes we don't want (cloning an
> > object from the outside when you shouldn't).
> 
> A new language feature needs to be both simple and powerful - it's not
> enough to be only powerful. A second problem I see is how asymmetric
> visibility would affect the readability of a class, putting extra
> strain in understanding it. Thirdly, how does PHP differ from FP
> languages like OCaml and Haskell in this regard, neither of which uses
> visibility in this way? What's acceptable in those languages that
> would be unacceptable in PHP?
> 
> Olle

I'll disagree slightly.  A language feature should introduce more power than it 
does complexity.  Not everything *can* be made absolutely simple, and that's 
fine as long as the power it offers is worth it.  I'd say it should minimize 
introduced complexity relative to the power offered.  Complexity ideally is 
super low, but it's never zero, simply by virtue of being "one more thing" that 
developers need to know how to read.

So in this case, we need to compare the power/complexity of asymmetric 
visibility vs the power/complexity of "immutable... except in these 
situations."  I would argue that asymmetric visibility is more 
self-documenting, because it states explicitly what those situations are.

The other point is that, as noted, "initonly" creates a gap if you have 
properties that are inter-dependent.  Those then cannot be made public-read, 
because that would also mean public-clone-with, and thus allow callers to 
violate property relationships.  Asymmetric visibility does not have that 
problem.
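
To make the inter-dependent property problem concrete, here is a minimal sketch in PHP syntax that already exists. The `DateRange` class and its `withEnd()` method are hypothetical names invented for illustration. Private properties with public getters only emulate "public read, private write"; this is not the proposed syntax, just the shape of the guarantee being discussed.

```php
// Hypothetical value object whose two properties depend on each other:
// $start must never be later than $end.
final class DateRange
{
    private int $start;
    private int $end;

    public function __construct(int $start, int $end)
    {
        if ($start > $end) {
            throw new InvalidArgumentException('start must not be after end');
        }
        $this->start = $start;
        $this->end = $end;
    }

    // Reads are public...
    public function start(): int { return $this->start; }
    public function end(): int { return $this->end; }

    // ...but writes only happen inside the class, so the invariant is
    // re-checked (via the constructor) on every "modification".
    public function withEnd(int $end): self
    {
        return new self($this->start, $end);
    }
}

$range = new DateRange(1, 10);
$longer = $range->withEnd(20);   // a new, valid object; $range is unchanged
```

Because outside callers can only go through `withEnd()`, they cannot produce an object where the start is after the end, which is the property relationship a public clone-with could bypass.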

As far as other language comparisons, I've never written in OCaml and can only 
barely read Haskell. :-)  However, the relevant points as I understand them are:

* In strictly functional languages (Haskell, etc.), immutability is assumed by 
default.  So the rest of the syntax, runtime behavior, and community standards 
are built on that assumption.  That's not true in PHP.

* Haskell at least (and I presume other strictly functional languages, although 
I've not dug into them in any detail at all) knows you're going to be calling a 
bazillion functions, often recursively, and so the engine can reorder things, 
execute lazily, skip having a stack entirely, or do other things to make a 
deeply recursive function design highly performant.  That's not the case in 
PHP, so usually an iterative algorithm is going to be more performant but 
requires mutating variables.  So the engine is optimized for that by default.

Compare the idealized functional/immutable Fibonacci with its mutable-iterative 
version:

function fp_fib(int $n) {
  return match($n) {
    0, 1 => 1,
    default => fp_fib($n - 1) + fp_fib($n - 2),
  };
}

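function fibonacci_iterative(int $n)
{
    $previous = 1;
    $current = 1;
    $next = 1;
    for ($i = 2; $i <= $n; $i++) {
        $next = $previous + $current;
        $previous = $current;
        $current = $next;
    }
    return $next;
}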

Re: [PHP-DEV] Does PHP ever stack allocate?

2021-01-03 Thread Olle Härstedt
2021-01-03 16:15 GMT, Sara Golemon :
> On Fri, Jan 1, 2021 at 3:18 PM Olle Härstedt 
> wrote:
>
>> Or is everything reference counted with heap allocation? Since PHP has
>> escape analysis, this could be used to use the stack instead, and kill
>> the
>> memory when the scope ends. If PHP uses the stack, can this be seen in
>> the
>> opcode?
>>
>>
> Well, you're not going to like this answer, but yes and no.
>
> A single PHP call frame holds a block of storage space for (among other
> things) all* local variables.  This can be thought of analogously to "the
> stack" as it's used by native applications.  Basic scalars (null, bool,
> int, float) sit in this space with no additional pointers to anywhere.
> Non-scalars use pointers to elsewhere in the heap to store the actual
> payload.  This isn't unique to PHP, as these structures have runtime
> determined size and thus can't** be stack allocated.
>
> There's further asterii below all of those statements, but that's the
> high-level generalized answer to your question as posed.
>
> The implied question you asked is actually handled using another
> mechanism.  PHP's internals have two separate memory allocation pools.
> "Persistent" memory allocation, which creates blocks of memory for the
> lifetime of the process, and "Engine" memory allocation, which are bulk
> de-allocated*** at the end of every request (after relevant
> destructors have fired).
>
> -Sara
>
> * All variables which are explicitly referenced as $foo (excluding
> auto-globals).  ${'bar'} and $$baz style references to locals are special
> and require their own separate conversation.
> ** C's alloca() and similar techniques in other languages can reserve
> dynamic amounts of stack space, but let's ignore that power-move for the
> sake of this argument.
> *** Deallocated from the request handler's point of view, though the
> runtime's memory manager (usually) doesn't give it back to the OS
> immediately, since it's likely to be used by the next request anyway.
>

Thanks Sara! I realize I should have been more precise: Can PHP
allocate non-reference counted memory that automatically is freed when
leaving scope, similar to what Go does with escape analysis?

Article describing the Go mechanism:
https://segment.com/blog/allocation-efficiency-in-high-performance-go-services/

Olle




Re: [PHP-DEV] Does PHP ever stack allocate?

2021-01-03 Thread Sara Golemon
On Fri, Jan 1, 2021 at 3:18 PM Olle Härstedt  wrote:

> Or is everything reference counted with heap allocation? Since PHP has
> escape analysis, this could be used to use the stack instead, and kill the
> memory when the scope ends. If PHP uses the stack, can this be seen in the
> opcode?
>
>
Well, you're not going to like this answer, but yes and no.

A single PHP call frame holds a block of storage space for (among other
things) all* local variables.  This can be thought of analogously to "the
stack" as it's used by native applications.  Basic scalars (null, bool,
int, float) sit in this space with no additional pointers to anywhere.
Non-scalars use pointers to elsewhere in the heap to store the actual
payload.  This isn't unique to PHP, as these structures have runtime
determined size and thus can't** be stack allocated.

There's further asterii below all of those statements, but that's the
high-level generalized answer to your question as posed.

The implied question you asked is actually handled using another
mechanism.  PHP's internals have two separate memory allocation pools.
"Persistent" memory allocation, which creates blocks of memory for the
lifetime of the process, and "Engine" memory allocation, which are bulk
de-allocated*** at the end of every request (after relevant
destructors have fired).

-Sara

* All variables which are explicitly referenced as $foo (excluding
auto-globals).  ${'bar'} and $$baz style references to locals are special
and require their own separate conversation.
** C's alloca() and similar techniques in other languages can reserve
dynamic amounts of stack space, but let's ignore that power-move for the
sake of this argument.
*** Deallocated from the request handler's point of view, though the
runtime's memory manager (usually) doesn't give it back to the OS
immediately, since it's likely to be used by the next request anyway.
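
The "freed when the scope ends" part of the question can be observed from userland. The sketch below is only an illustration, assuming a stock PHP build where `memory_get_usage()` reports the request allocator's usage; the helper name `buildAndDrop()` is made up here. It does not demonstrate stack allocation: the array payload still lives on the heap and is released through reference counting once the last variable holding it goes away.

```php
// Hypothetical helper: allocates a large array in a local variable and
// reports the allocator's usage while that variable is still alive.
function buildAndDrop(): array
{
    $big = range(1, 100000);      // payload lives on the heap
    $usage = memory_get_usage();  // measured while $big still exists
    return [$usage, count($big)]; // $big is released when the frame ends
}

$before = memory_get_usage();
[$during, $count] = buildAndDrop();
$after = memory_get_usage();

printf("before=%d during=%d after=%d\n", $before, $during, $after);
```

The exact numbers depend on the PHP version and platform; the point is only that usage measured inside the function is higher than usage after it has returned.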


Re: [PHP-DEV] [RFC] Configurable callback to dump results of expressions in `php -a`

2021-01-03 Thread tyson andre
> Reading through the linked earlier mail, you have quite a few additional 
> hooks in mind, which might need a significant amount of additional userland 
> code (such as a parser implementation) to usefully implement. At that point 
> I'm wondering what the benefit of this hybrid approach is, relatively to a 
> userland implementation like psysh.
>
> That is: 
> a) Assuming all the hooks have been implemented, what additional 
> functionality is the interactive shell implementation itself actually 
> providing?
> b) Is it possible to go the other way around, and expose that additional 
> functionality to userland instead? You do mention fatal error tolerance as 
> one distinguishing feature -- is there anything beyond that?

For https://wiki.php.net/rfc/readline_interactive_shell_result_function in 
particular,
compared to `php -a` in php 8.0, a subsequent RFC would provide additional 
functionality using these hooks: it would add default hooks to print a short 
representation of results of expressions to the interactive shell 
implementation (that could be turned off or replaced to dump objects)

- The benefit is that dumping the result of expressions improves the default 
experience.
  psysh wouldn't be installed by default when a new developer is learning php 
through the php manual, or when sshed into a remote server.
- The reason to allow hooking it is because some objects wouldn't have a 
user-friendly representation for var_dump/var_export (e.g. recursive data 
structures)
- Even if the code size is large, it may be doable by making the userland 
implementation part of ext/phpi/ (--enable-phpi), which is disabled by default 
and can be installed in separate packages created by operating system package 
maintainers (and shrunk by using minifiers)
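
As a rough illustration of what a user-supplied result-dumping callback might do, the sketch below produces a one-line summary of a value. The function name `short_repr()` and its output format are invented for this example; how such a callback would be registered is whatever the RFC defines and is deliberately not shown.

```php
// Hypothetical callback body: returns a short, single-line description
// of an expression result instead of a full var_dump().
function short_repr(mixed $value, int $maxLen = 40): string
{
    if (is_object($value)) {
        // Only name the class, to avoid walking recursive object graphs.
        return 'object(' . get_class($value) . ')';
    }
    if (is_array($value)) {
        return 'array(' . count($value) . ')';
    }
    $text = var_export($value, true);
    if (strlen($text) > $maxLen) {
        $text = substr($text, 0, $maxLen) . '...';
    }
    return $text;
}

echo short_repr(42), "\n";
echo short_repr([1, 2, 3]), "\n";
echo short_repr(new ArrayObject()), "\n";
```

A shell could apply something like this by default and let users swap in a richer dumper for their own types.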

Compared to psysh, the main distinguishing feature is definitely the ability to 
detect/tolerate fatal errors when compiling snippets or inheriting classes, and 
fewer dependencies to include to integrate an interactive shell with utilities 
for a project.
I don't think it should be exposed to regular processes or web servers, though, 
due to possible memory corruption or leaks after zend_error_noreturn (e.g. 
class inheritance errors after autoloading, etc.).

- It would possibly be an improvement to throw an error instead of causing a 
fatal error for common mistakes in interactive shell sessions such as duplicate 
functions/parameters but I'm not sure how likely that is, especially since 
classes and functions currently get added as the file is being compiled.

Userland shells like `psysh` that want to integrate deeply into `php -a` may 
wish to avoid readline entirely, having `php -a` call a callback instead of 
printing `php >` so that they can process input directly, as those projects 
already do.
Two hooks may help with enabling that approach, which can be added in 
`auto_prepend_file`:

1. A hook to call a callback instead of printing "php >" and C readline reading 
stdin.
e.g. `readline_replace_interactive_shell_initializer(function () { ... read 
and process stdin in a loop })`
2. Adding a hook to call a function every time an uncatchable fatal error was 
encountered, e.g. to resume the userland shell.
e.g. `readline_replace_interactive_fatal_error_handler(function ($errcode, 
$errmsg, $file, $line, $errcount): bool { /* process or exit */ })`

Thanks,
- Tyson



Re: [PHP-DEV] Analysis of property visibility, immutability, and cloning proposals

2021-01-03 Thread Olle Härstedt
2021-01-02 16:06 GMT, Larry Garfield :
> On Fri, Jan 1, 2021, at 5:51 PM, Olle Härstedt wrote:
>
>> >> The web dev discourse is
>> >> one-sided with regard to immutability,
>> >
>> > Yes, if you've heard any of the regular whining about PSR-7 being an
>> > immutable object you'd think it's one-sided in favor of mutability. ;-)
>> >
>> > As you say, the point here is to add tools.  Right now, doing
>> > immutability in PHP is syntactically clumsy and ugly.  We want to fix
>> > that, and that has to include some means of "give me a new value based
>> > on this existing value but with some difference."  (aka, exactly what
>> > with-er methods do, although I agree entirely that if you have the
>> > option of less generic names, use them).
>> >
>> > So, can we get back to the original post, which is proposing specifics
>> > of the tools to make that happen? :-)  (Asymmetric visibility and
>> > clone-with, specifically.)
>> >
>>
>> OK!
>>
>> I like that you connect higher level design patterns with language
>> design. This is the way to go, IMO. Personally, I'd prefer support for
>> the Psalm notation `@psalm-readonly`, which is the same as your
>> initonly. Clone-with makes sense too, as this construct is already
>> supported in multiple languages. The exact notation doesn't matter
>> that much - my personal choice is OCaml {record with x = 10} over JS
>> spread operator, but OCaml is pretty "wordy" in notation in contrast
>> to the C tradition that PHP is part of.
>>
>> Reintroducing "objects that pass by value" is a hard pass from me. The
>> way forward is immutability and constrained mutability (ownership,
>> escape analysis, etc). Psalm also supports array shapes - maybe this
>> can be investigated as an alternative? Since PHP has no tuples.
>>
>> I'm not convinced the added complexity of asymmetric visibility is
>> powerful enough to motivate its existence. Feel free to prove me
>> wrong. :) My choice here would be namespace "internal" (also supported
>> by Psalm already), but this requires implementation of namespace
>> visibility, a PR that was abandoned.
>>
>> And also, happy new year!
>
> Happy New Year!
>
> I agree that "objects, but passing by value" would not be the right
> solution. I used to think that would be a good part of the solution, but
> eventually concluded that it would introduce more complexity, not less.
> Eventually, everything people wanted to do with objects they'd want to do
> with "Records" (for lack of a better term), and if they pass by value but
> are still mutable then you have a weird situation where sometimes changes
> propagate and some don't (depending on if you have a record or object).
> Making it easier to use objects in a value-esque way will get us closer to
> the desired end state.
>
> I think the tldr of my post is this: A single "immutable" flag (whatever
> it's called) on a class or property would require having lots of holes poked
> in it in order to make it useful in practice (mostly what "initonly" would
> do), but those holes would introduce other holes we don't want (cloning an
> object from the outside when you shouldn't).

A new language feature needs to be both simple and powerful - it's not
enough to be only powerful. A second problem I see is how asymmetric
visibility would affect the readability of a class, putting extra
strain on understanding it. Thirdly, how does PHP differ from FP
languages like OCaml and Haskell in this regard, neither of which uses
visibility in this way? What's acceptable in those languages that
would be unacceptable in PHP?

Olle




Re: [PHP-DEV] [RFC] Enumerations, Round 2

2021-01-03 Thread Ilija Tovilo
Hi Marc

> I don't have a really good use-case for float values. It just seems
> weird to me that a ScalarEnum doesn't support all scalars.
>
> Using the enum value as array key for `cases()` works with your current
> proposal but if we later want to allow floats, bools, or whatever, then we've
> got a foot gun.

The main reason is that we're using a hashmap internally in from() to
find the given case you're looking for. This is the same hashmap PHP
arrays are based on which only supports ints/strings as keys. If we
were to allow any scalar as a value, looking up a case by value would
become an O(n) operation.
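
The key restriction is easy to see from userland: PHP arrays coerce keys that are not ints or strings, so distinct scalar values can collide. A small demonstration follows, using plain PHP arrays rather than the enum implementation itself; the variable name `$map` is just for this example.

```php
// bool, numeric-string and integral-float keys are all coerced to int.
$map = [];
$map[true] = 'from bool';    // stored under int key 1
$map["1"]  = 'from string';  // also int key 1: overwrites the entry above
$map[1.0]  = 'from float';   // int key 1 again: overwrites once more

var_dump(array_keys($map));  // a single key remains
```

Three different scalar values end up in one slot, which is why an int/string-keyed hashmap cannot distinguish arbitrary scalar case values without extra work.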

We could do something terrible like serialize the key before storing
it in the hashmap to allow arbitrary key types. But that will require
serializing the value on each invocation of from() which will
unnecessarily slow down the 95% most common use cases (int/string) to
support the exception. Note though that it's always easier to extend
than to remove. By not offering this feature we're erring on the side
of caution.

That being said, I can see how ScalarEnum is a misleading name. We've
been thinking about a better name and only had some ideas we weren't
fully satisfied with. RawEnum, ValueEnum and ConvertibleEnum were some
of these ideas. Let us know if you have a better suggestion.

> You already provide a lookup mechanism with `MyEnum::from()` - I don't
> see a real use-case for providing a prebuilt map. The main use case I see
> is to list all possible enum values but this doesn't require a map and a
> zero-indexed-array would also be more performant with packed arrays
> (correct me if I'm wrong).

I do somewhat agree with you there. We're essentially returning
`Array|Map` which feels
inconsistent. When you're calling cases() you're most likely going to
loop over it at which point $case->value is available at your
disposal.
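
For readers on PHP 8.1 or later, where enumerations shipped (scalar enums ended up being called backed enums), the point can be sketched as follows. The `Suit` enum is an invented example; `cases()`, `from()` and `->value` are the method and property names discussed in this thread.

```php
// Invented example enum (requires PHP 8.1+).
enum Suit: string
{
    case Hearts = 'H';
    case Spades = 'S';
}

// cases() gives a plain list; a value => case map is one loop away.
$byValue = [];
foreach (Suit::cases() as $case) {
    $byValue[$case->value] = $case;
}

// Direct lookup by scalar value needs no map at all.
$spades = Suit::from('S');
```

Anyone who really wants the associative form can build it in three lines, while the common "loop over all cases" use gets a simple zero-indexed list.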

Ilija




Re: [PHP-DEV] Analysis of property visibility, immutability, and cloning proposals

2021-01-03 Thread Rowan Tommins
On 2 January 2021 21:25:08 GMT+00:00, Larry Garfield  
wrote:
>If a stream is not seekable, then it would have to consume and destroy
>$fp in the process (unset it).  So:
>
>[$line1, $fp2] = read_line($fp);
>[$line2, $fp2] = read_line($fp);
>
>The second line would throw an error that $fp "has been consumed" or
>something like that.  But even that still creates potential for
>spooky-action-at-a-distance if $fp was passed into a function, gets
>read in that function, and then the parent call scope has a broken $fp
>lying around.


Yes, that is where "uniqueness attributes" come in: in Clean, that's basically 
how I/O looks, but either of those scenarios would produce an error *at compile 
time*. The type system includes the constraint that the file handle must not be 
reachable from anywhere else when passed to the read_line function, whether 
that's use of the same variable after the call, assignment to an extra 
variable, capture by some other function, or storage in an array or record.

The same constraint can be added to custom functions, allowing the compiler to 
reuse the memory for, say, a large array that you're adding an item to. So you 
still write the code as though it was immutable, and can reason about it that 
way, but can also prove that it's safe to actually mutate it in place.

Similar things can be done, in a slightly different way, with Rust's 
ownership/lifetime system: the "borrow checker" proves that the manipulations 
you're doing are free of "action at a distance" by prohibiting anything that 
would create ambiguous "ownership".
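
PHP has no uniqueness types or borrow checker, so the closest userland approximation is a check at run time rather than at compile time. The sketch below is hypothetical (the `LineReader` class is invented for this example and reads from an in-memory array instead of a real stream); it only shows the "consumed handle" behaviour described earlier in the thread.

```php
// Hypothetical single-use reader: each read returns the line plus a NEW
// reader, and marks the old reader as consumed.
final class LineReader
{
    private bool $consumed = false;

    /** @param string[] $lines remaining lines */
    public function __construct(private array $lines) {}

    /** @return array{0: ?string, 1: LineReader} */
    public function readLine(): array
    {
        if ($this->consumed) {
            throw new LogicException('reader has been consumed');
        }
        $this->consumed = true;
        $rest = $this->lines;
        $line = array_shift($rest);   // null when no lines remain
        return [$line, new self($rest)];
    }
}

$fp = new LineReader(['first', 'second']);
[$line1, $fp2] = $fp->readLine();
[$line2, $fp3] = $fp2->readLine();    // fine: uses the fresh reader
// Calling $fp->readLine() again here would throw LogicException.
```

The difference from Clean or Rust is exactly the one noted above: here the mistake is only caught when the offending line executes, not before the program runs.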


Regards,

-- 
Rowan Tommins
[IMSoP]




Re: [PHP-DEV] [RFC] Enumerations, Round 2

2021-01-03 Thread Marc

On 03.01.21 11:56, Marc wrote:
> On 29.12.20 16:42, Larry Garfield wrote:
>> On Tue, Dec 29, 2020, at 2:48 AM, Marc wrote:
>>> On 28.12.20 21:21, Larry Garfield wrote:
>>>> Hello, Internalians!
>>>>
>>>> After considerable discussion and effort, Ilija and I are ready to offer 
>>>> you round 2 on enumerations.  This is in the spirit of the previous 
>>>> discussion, but based on that discussion a great deal has been reworked.  
>>>> The main change is that Enumeration Cases are now object instances of the 
>>>> Enumeration class rather than their own class.  Most of the other changes 
>>>> are knock-on effects of that.
>>>>
>>>> Of particular note:
>>>>
>>>> * Cases may not have methods or constants on them.  They're just dumb 
>>>> values.
>>>> * Enums themselves may have methods, static methods, or constants.
>>>> * Traits are supported, as long as they don't have properties.
>>>> * The value() method on scalar enums is now a property.
>>>>
>>>> The full RFC is here, and I recommend reading it again in full given how 
>>>> much was updated.
>>> I did and the RFC looks really awesome :+1:
>>>
>>> I don't have time to test the implementation but I noticed one thing:
>>>
>>>> If the enumeration is not a Scalar Enum, the array will be packed
>>> (indexed sequentially starting from 0). If the enumeration is a Scalar
>>> Enum, the keys will be the corresponding scalar for each enumeration.
>>>
>>> I don't think using the scalar values as keys is a good idea. What
>>> happens if we want to support scalar float values? (Why are they
>>> actually not supported in the first place?)
>> That's why floats are not supported, in fact, because what happens to them 
>> when they are made into an array key is non-obvious.  (PHP would say to 
>> convert to a string, but that's always fussy with possible data loss in some 
>> cases, etc.)  We decided to just avoid that problem until/unless someone 
>> found a good use case for float enums.  Not all languages support them as is, 
>> so there is precedent.
>>
>>> Also I think it's more natural if both enum types return a
>>> zero-indexed-array of cases.
>> The goal is to make it easy to work with them, and having a clean lookup map 
>> readily available is very convenient.  If you don't care about the scalar 
>> equivalent then you can safely ignore them.  If you do want them, then you 
>> have a lookup table ready-made for you.  That's the logic we were working 
>> from.
> I don't have a really good use-case for float values. It just seems
> weird to me that a ScalarEnum doesn't support all scalars.
>
> Using the enum value as array key for `cases()` works with your current
> proposal but if we later want to allow floats, bools, or whatever, then we've
> got a foot gun.

Forgot to mention: on (virtually) adding generics to the game, the method
`cases(): array;` would be described as
`cases(): array;` on UnitEnum but `cases():
array;` on ScalarEnum, which is not compatible for
reasons, and I think (even if not yet possible with PHP) such things
need to be considered when producing clean interfaces.


>
> You already provide a lookup mechanism with `MyEnum::from()` - I don't
> see a real use-case for providing a prebuilt map. The main use case I see
> is to list all possible enum values but this doesn't require a map and a
> zero-indexed-array would also be more performant with packed arrays
> (correct me if I'm wrong).
>

Thanks,

Marc


>> --Larry Garfield
>>


Re: [PHP-DEV] [RFC] Enumerations, Round 2

2021-01-03 Thread Mike Schinkel



> On Dec 31, 2020, at 12:15 PM, Larry Garfield  wrote:
> 
> On Thu, Dec 31, 2020, at 6:53 AM, Rowan Tommins wrote:
>> On 30/12/2020 21:24, Aleksander Machniak wrote:
>>> My argument is that, from an end-user perspective, I don't really see
>>> why Unit and Scalar enums have to have different "API" at this point.
>>> I'm talking about ":string"/":int" in the enum definiton as well as
>>> ->value and ->from().
>> 
>> 
>> My personal opinion is that for many enums, explicitly not having a 
>> scalar representation is a good thing.
>> 
>> This is basically similar to my opinion of __toString() etc: if you have 
>> *multiple* ways to convert something to/from a scalar, blessing one of 
>> them as "default" is arbitrary and confusing.
>> 
>> For example:
>> 
>> enum BookingStatus {
>>  case PENDING;
>>  case CONFIRMED;
>>  case CANCELLED;
>> 
>>  public function getId() {
>>return match($this) {
>> self::PENDING => 1,
>> self::CONFIRMED => 2,
>> self::CANCELLED => 3,
>>};
>>  }
>>  public function getCode() {
>>return match($this) {
>> self::PENDING => 'PEN',
>> self::CONFIRMED => 'CON',
>> self::CANCELLED => 'CAN',
>>};
>>  }
>>  public function getEnglishDescription() {
>>return match($this) {
>> self::PENDING => 'Pending Payment',
>> self::CONFIRMED => 'Confirmed',
>> self::CANCELLED => 'Cancelled',
>>};
>>  }
>> }
> 
> That is similar to our reasoning.  It creates a foot-gun situation where 
> someone could get in the habit of assuming that an enum always has a 
> *reasonable* and *logical* and thus *reliable* string equivalent, when not 
> all enums will have string equivalents that it's reasonable and logical to 
> use.  So, one less foot gun.

However, avoiding one foot-gun does not always mean that you avoid all 
foot-guns.

For example, when you need to create a large number of string enums where the 
name and the symbol are the same, it would be very easy to have a typo in one 
but not the other, especially as a result of copy and paste editing.

So in my perfect world this:

enum BookingStatus {
 case PENDING;
 case CONFIRMED;
 case CANCELLED;
}

Would be equivalent to:

enum BookingStatus {
 case PENDING = "PENDING";
 case CONFIRMED = "CONFIRMED";
 case CANCELLED = "CANCELLED";
}

#fwiw
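
In PHP 8.1, where enumerations shipped, every case exposes its identifier through a read-only `name` property, which covers part of this wish without repeating each string. The sketch below assumes PHP 8.1+; `fromName()` is a hypothetical helper written for this example, not a built-in method.

```php
// Requires PHP 8.1+. A unit enum with no explicit scalar values.
enum BookingStatus
{
    case PENDING;
    case CONFIRMED;
    case CANCELLED;

    // Hypothetical helper: look a case up by its name.
    public static function fromName(string $name): self
    {
        foreach (self::cases() as $case) {
            if ($case->name === $name) {
                return $case;
            }
        }
        throw new ValueError("\"$name\" is not a valid name for " . self::class);
    }
}

$status = BookingStatus::fromName('CONFIRMED');
echo $status->name, "\n";
```

Since the name is taken from the case declaration itself, there is no second copy of the string that a copy-and-paste typo could get out of sync with.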

> 
> Also, one of the extensions planned, as noted, is ADTs/tagged unions.  Those 
> could not have a primitive equivalent, since they're not singletons.  Keeping 
> UnitEnum and ScalarEnum separate allows us to later add TaggedEnum (or 
> similar) that also extends UnitEnum, but not ScalarEnum.
> 
> --Larry Garfield
> 
> 




Re: [PHP-DEV] [RFC] Enumerations, Round 2

2021-01-03 Thread Marc


On 29.12.20 16:42, Larry Garfield wrote:
> On Tue, Dec 29, 2020, at 2:48 AM, Marc wrote:
>> On 28.12.20 21:21, Larry Garfield wrote:
>>> Hello, Internalians!
>>>
>>> After considerable discussion and effort, Ilija and I are ready to offer 
>>> you round 2 on enumerations.  This is in the spirit of the previous 
>>> discussion, but based on that discussion a great deal has been reworked.  
>>> The main change is that Enumeration Cases are now object instances of the 
>>> Enumeration class rather than their own class.  Most of the other changes 
>>> are knock-on effects of that.
>>>
>>> Of particular note:
>>>
>>> * Cases may not have methods or constants on them.  They're just dumb 
>>> values.
>>> * Enums themselves may have methods, static methods, or constants.
>>> * Traits are supported, as long as they don't have properties.
>>> * The value() method on scalar enums is now a property.
>>>
>>> The full RFC is here, and I recommend reading it again in full given how 
>>> much was updated.
>> I did and the RFC looks really awesome :+1:
>>
>> I don't have time to test the implementation but I noticed one thing:
>>
>>> If the enumeration is not a Scalar Enum, the array will be packed
>> (indexed sequentially starting from 0). If the enumeration is a Scalar
>> Enum, the keys will be the corresponding scalar for each enumeration.
>>
>> I don't think using the scalar values as keys is a good idea. What
>> happens if we want to support scalar float values? (Why are they
>> actually not supported in the first place?)
> That's why floats are not supported, in fact, because what happens to them 
> when they are made into an array key is non-obvious.  (PHP would say to 
> convert to a string, but that's always fussy with possible data loss in some 
> cases, etc.)  We decided to just avoid that problem until/unless someone 
> found a good use case for float enums.  Not all languages support them as is, 
> so there is precedent.
>
>> Also I think it's more natural if both enum types return a
>> zero-indexed-array of cases.
> The goal is to make it easy to work with them, and having a clean lookup map 
> readily available is very convenient.  If you don't care about the scalar 
> equivalent then you can safely ignore them.  If you do want them, then you 
> have a lookup table ready-made for you.  That's the logic we were working 
> from.

I don't have a really good use-case for float values. It just seems
weird to me that a ScalarEnum doesn't support all scalars.

Using the enum value as array key for `cases()` works with your current
proposal but if we later want to allow floats, bools, or whatever, then we've
got a foot gun.

You already provide a lookup mechanism with `MyEnum::from()` - I don't
see a real use-case for providing a prebuilt map. The main use case I see
is to list all possible enum values but this doesn't require a map and a
zero-indexed-array would also be more performant with packed arrays
(correct me if I'm wrong).


>
> --Larry Garfield
>




Re: [PHP-DEV] Re: Improving PRNG implementation.

2021-01-03 Thread Marc


On 01.01.21 20:22, Go Kudo wrote:
> Hi Marc, and sorry for the late reply. Email has been marked as spam...
>
>> I would expect the Random number generator to implement from Iterator
> using an integer for current() value
>> and providing utility functions (simple functions or methods of another
> class) for shuffle, random array entry like this:
>
> Thank you very much. This suggestion makes sense and looks very structured.
>
> However, I feel that in the PHP world, this structuring is a bit too much.
> Ease of use is very important in PHP.

One important reason for a PRNGInterface is (at least how I understand
it) to allow implementing PRNG algorithms in userland, but there would be
no performant way to use this for shuffle and friends. That's why I think
such functions should consume any PRNGInterface instead of being built
directly into the PRNG class.

-> If you really want these as PRNG methods - one possibility could be a
Trait providing these methods but this is very exotic for core
functionalities.
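
A minimal sketch of that separation, under stated assumptions: the `Prng` interface, the `LcgPrng` class and the `shuffle_with()` function are all names invented for this example and are not the RFC's API. The generator is a toy linear congruential generator that assumes 64-bit PHP integers and is not suitable for anything security-sensitive.

```php
// Hypothetical minimal generator interface (not the RFC's final API).
interface Prng
{
    /** Returns an integer in the inclusive range [$min, $max]. */
    public function nextInt(int $min, int $max): int;
}

// Toy userland generator: a linear congruential generator.
final class LcgPrng implements Prng
{
    private int $state;

    public function __construct(int $seed)
    {
        $this->state = $seed & 0x7FFFFFFF;
    }

    public function nextInt(int $min, int $max): int
    {
        $this->state = ($this->state * 1103515245 + 12345) & 0x7FFFFFFF;
        // Use the higher bits (about 15 of them); the modulo adds a
        // slight bias, which is acceptable for a demonstration only.
        return $min + intdiv($this->state, 65536) % ($max - $min + 1);
    }
}

// Utility that works with ANY Prng implementation (Fisher-Yates shuffle).
function shuffle_with(Prng $rng, array $items): array
{
    $items = array_values($items);
    for ($i = count($items) - 1; $i > 0; $i--) {
        $j = $rng->nextInt(0, $i);
        [$items[$i], $items[$j]] = [$items[$j], $items[$i]];
    }
    return $items;
}

$a = shuffle_with(new LcgPrng(42), range(1, 10));
$b = shuffle_with(new LcgPrng(42), range(1, 10));
```

With the same seed both calls produce the same permutation, and the shuffle never needs to know which generator it was handed.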


Another thing with the proposed `next*` methods is

* How to get the current value? retrieving the next value only seems to
be out of sync with the current PHP way (Iterator, Generator)

* `nextByte(int $length)`: How many values does this consume? How many
random bytes can be generated from one value (4 bytes, 8 bytes, platform
dependent)? Does this consume multiple values to generate 1 byte 4/8 times?

* same for `nextDouble()` on 32 bit platforms

* Btw. it should be called `nextFloat()` as there is no double type in
PHP and the float type is equivalent to C double precision

>
> Nevertheless, I am satisfied with this proposal. I'd like to hear from
> someone who is more familiar with the PHP context about the form of the
> implementation.
>
> But, it may be faster to actually hold a vote to ask this question. I'm not
> sure how much support this proposal has at this point, but do you think
> it's worth a try?

I don't have voting rights and I'm not very active in discussions, so it
would be better to get some reasonable thoughts from someone more involved.


Cheers

Marc

>
>
> On Tue, Dec 29, 2020 at 18:26, Marc wrote:
>
>> Hi zeriyoshi,
>> On 23.12.20 14:41, zeriyoshi wrote:
>>
>> Thanks tyson.
>>
>>
>> This would also make it easier to use those generators in brand new
>>
>> algorithms that weren't in the initial RFC.
>>
>> (or in algorithms written by users in PHP)
>>
>> This suggestion seems to make sense. Maybe the RNG should only focus on
>> generating random numbers and not anything else.
>> However, the fact that array and string manipulation functions are no
>> longer native to C may have a speed disadvantage.
>>
>> So I came up with the idea of minimizing the interface definition as RNG.
>>
>> ```
>> interface PRNGInterface
>> {
>> public function nextInt(?int $min = null, ?int $max = null): int;
>> public function nextDouble(): double; // maybe, non-needed.
>> public function nextByte(int $length): string;
>> }
>> ```
>>
>> The methods for array and string operations are defined separately as
>> interfaces that inherit from the interface.
>>
>> ```
>> interface RandomInterface extends PRNGInterface
>> {
>> public function shuffle(array &$array): bool;
>> public function arrayRand(array $array, int $num = 1): int|string|array;
>> public function strShuffle(string $string): string;
>> }
>> ```
>>
>> This can be overly structured, but it will serve all purposes.
>>
>> Personally I feel the interfaces still looking a bit off to me.
>>
>> I would expect the Random number generator to implement from Iterator
>> using an integer for `current()` value
>>
>> and providing utility functions (simple functions or methods of another
>> class) for shuffle, random array entry like this:
>>
>> ```php
>>
>> interface RNG extends Iterator {
>> public function rewind();
>> public function next();
>> public function current(): int;
>> public function key(): int;
>> public function valid();
>> }
>>
>>
>> interface PRNG extends RNG {
>> public function __construct(int $seed);
>> public function getSeed(): int;
>> }
>>
>> class RNGUtil {
>> public static function shuffleArray(int $randomNumber, array $arr): array;
>> public static function randomArrayElement(int $randomNumber, array $arr): mixed;
>> public static function between(int $randomNumber, int $min = PHP_INT_MIN, int $max = PHP_INT_MAX): int;
>> public static function bytes(RNG $rng, int $length): string;
>> }
>>
>> ```
>>
>>
>> Regards,
>> Go Kudo
>>
>>
>> On Wed, Dec 23, 2020 at 0:40, tyson andre wrote:
>>
>>
>> Hi Go Kudo,
>>
>> **A possible alternative that is widely used in other programming
>> languages is to limit the interface API to only generating bytes/integers,**
>> and to provide global functions that would use generic random number
>> generator objects (from internal or user-provided code) in their algorithms.
>>
>> This would also make it easier to use those generators in brand new