On 19.8.2024 19:08:32, Derick Rethans wrote:
Hi!
Arnaud, Larry, and I have been working on an article describing the
state of generics and collections, and related "experiments".
You can find this article on the PHP Foundation's Blog:
https://thephp.foundation/blog/2024/08/19/state-of-generics-and-collections/
cheers,
Derick
Hey Derick,
The fluid Arrays section says "A PoC has been implemented, but the
performance impact is still uncertain". Where can I find that PoC, out of
curiosity? I imagine the array types being implemented as a counted
collection of the types of the entries. But without the PoC I can
only guess.
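To make that guess concrete, here is a minimal userland sketch of what I
mean by a "counted collection of types". It is only my assumption of how
such a thing could work, not a description of the actual PoC; the class
TypeCountedArray and all its methods are hypothetical names, and it only
compares exact type names (no subtype handling).

```php
<?php
declare(strict_types=1);

// Hypothetical model of a "fluid" array: the element type is not declared
// up front but derived from per-type counters updated on every write.
final class TypeCountedArray
{
    /** @var array<int|string, mixed> */
    private array $values = [];

    /** @var array<string, int> type name => number of entries of that type */
    private array $typeCounts = [];

    public function set(int|string $key, mixed $value): void
    {
        // Overwriting an entry first gives back the count of the old value.
        if (array_key_exists($key, $this->values)) {
            $this->release($this->values[$key]);
        }
        $this->values[$key] = $value;
        $type = get_debug_type($value);
        $this->typeCounts[$type] = ($this->typeCounts[$type] ?? 0) + 1;
    }

    public function remove(int|string $key): void
    {
        if (array_key_exists($key, $this->values)) {
            $this->release($this->values[$key]);
            unset($this->values[$key]);
        }
    }

    // True when every entry has exactly the given type. The cost depends on
    // the number of distinct types, not on the number of entries.
    // An empty array satisfies any type.
    public function isArrayOf(string $type): bool
    {
        foreach (array_keys($this->typeCounts) as $seen) {
            if ($seen !== $type) {
                return false;
            }
        }
        return true;
    }

    private function release(mixed $value): void
    {
        $type = get_debug_type($value);
        if (--$this->typeCounts[$type] === 0) {
            unset($this->typeCounts[$type]);
        }
    }
}
```

With this model, an array holding only integers passes isArrayOf('int'),
stops passing once a string is written into it, and passes again as soon
as that string entry is removed.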
It also says "Another issue is that [...] typed properties may not be
possible.". Why would that be the case? Essentially a typed property
would just be a static array, which you describe in the section right below.
Also you are mentioning references. References to static arrays (typed
property case) are trivial. References to fluid arrays would probably
require runtime lookup of the contained references to determine the
actual full type. That may be a valid tradeoff, given that the vast
majority of arrays contain few or no references. ("Either you don't use
references or you pay an O(contained references) overhead when passing
them around.")
So, reading the conclusion, I'm a bit disappointed by:
* Halt efforts on typed arrays, as our current thoughts are that it
is probably not worth doing, due to the complexities of how arrays
work, and the minimal functionality that it would bring.
I'd truly appreciate more investigation into the topic, as I feel the
functionality would definitely not be minor to PHP users.
Regarding the Collections PR, I personally really don't like it:
* It implements something which would be trivial if we had reified
generics. If this ever gets merged, and generics happen later, it
would probably be outdated: a quirk the language has to carry
around.
* It's not powerful, but rather a quite limited implementation. No
overrides of the built-in methods are possible. No custom operations ("I
want a dict where a specific property on the key is the actual
unique key", "I want a custom callback to be executed for each
modification"). It's okay as a PoC, but far from a complete enough
implementation.
* It's a very specialized structure/syntax, not extensible for
userland at all. Some functionality like generic traits, where you'd
actually monomorphize the contained methods, would be much more
flexible. E.g. class Articles { use Sequence<Article>; }. Much less
specialized syntax, much more extensible. And generic traits would
be doable, regardless of the rest of the generics investigation.
In fact, generic traits (essentially statically replacing the
generic arguments at link-time) would be a useful feature which
would remain useful even if we had fully reified generics.
I recognize that some functionality will need support of internal
zend_object_handlers. But that's not a blocker, we might provide
some default internal traits with PHP, enabling the internal class
handlers.
So to summarize, I would not continue on that path, but really invest
into monomorphizable generic traits instead.
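As an illustration of what I mean by monomorphization, here is a hand-written
sketch of what "use Sequence<Article>" could expand to once the generic
argument has been replaced by Article. The generic trait syntax does not
exist in PHP today; the trait SequenceOfArticle, its methods, and the
Article class are hypothetical names I made up for this example.

```php
<?php
declare(strict_types=1);

final class Article
{
    public function __construct(public readonly string $title) {}
}

// Hand-expanded stand-in for a hypothetical Sequence<T> with T = Article.
// Every occurrence of T is an ordinary, runtime-checked Article type.
trait SequenceOfArticle
{
    /** @var list<Article> */
    private array $items = [];

    public function add(Article $item): void
    {
        $this->items[] = $item;
    }

    public function get(int $index): Article
    {
        return $this->items[$index];
    }

    public function count(): int
    {
        return count($this->items);
    }
}

final class Articles
{
    use SequenceOfArticle;
}
```

Because the expanded methods carry plain Article types, the existing type
checks apply unchanged: calling add() with anything that is not an Article
throws a TypeError, and the class remains free to add or override methods.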
That leaves the last point, about erased generics being acceptable:
* If we ever end up adding actual reified generics (maybe due to a
renewed investigation in 5 years), we'll most likely want to retain
the syntax. There may be some syntax which cannot be supported
though, or semantics which would have to break existing code.
* Docblocks are sort of an extensible and modifiable standard. Some type
checkers allow e.g. List<positive-int>. But PHP certainly won't
support it. So you will end up in a hybrid state where some
functions use generics and some use only docblocks, because the
generics are not powerful enough. Further, if you use both (e.g.
List<int> in the definition, List<positive-int> in the docblock), you
also have to make sure to keep them in sync, because the generic
type doesn't get verified during execution.
* We're used to "all types specified are checked". And that's a good
thing. It sets expectations.
Now imagine we're introducing type aliases. "type IntList =
List<int>;". Function signature "function processIntegers(IntList
$list)". This reads as if I could expect something that actually is an
IntList. There's no generic immediately in sight telling me that
this is only going to provide me a List of arbitrary values. I will
expect an IntList. Just like I will expect any bare "int" type to
also give me an integer.
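The mismatch in expectations can be shown with what PHP does today. In this
small sketch a docblock-only element type stands in for an erased List<int>
(that substitution is my assumption, since erased generics do not exist in
PHP), while the native int type is checked as usual. The function
processInteger is a hypothetical name added for the comparison.

```php
<?php
declare(strict_types=1);

/**
 * The element type lives only in the docblock, so the engine checks
 * "array" and nothing more. This stands in for an erased List<int>.
 *
 * @param list<int> $list
 */
function processIntegers(array $list): int
{
    return count($list);
}

function processInteger(int $value): int
{
    return $value;
}

// Accepted at runtime even though the elements are not integers.
$accepted = processIntegers(['not', 'integers']);

// The native scalar type is enforced: this call throws a TypeError.
try {
    processInteger('not an integer');
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}
```

The first call goes through silently and the second one fails, although
both signatures read as equally strict to someone skimming the code.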
So, overall, I think erased generics set the wrong expectations and carry
quite a risk of being a bad decision in light of possible future improvements.
I'd also like to leave a small side note on this question:
What generic features are acceptable to leave out to make the
implementation more feasible?
I think this asks the wrong question. First, figure out which generic
features really cannot make it, then figure out whether omitting these
features is acceptable.
Thanks all for investing time into this topic, I'm sure it will bring
the language forward!
Bob