On 19.8.2024 19:08:32, Derick Rethans wrote:
Hi!

Arnaud, Larry, and I have been working on an article describing the
state of generics and collections, and related "experiments".

You can find this article on the PHP Foundation's Blog:
https://thephp.foundation/blog/2024/08/19/state-of-generics-and-collections/

cheers,
Derick

Hey Derick,

The fluid Arrays section says "A PoC has been implemented, but the performance impact is still uncertain". Where can I find that PoC? I'm curious. I'm imagining the implementation of the array types as a counted collection of the types of the entries. But without the PoC I can only guess.
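
As a rough userland sketch of that guess (this is not the PoC, which I have not seen; the class TypeCountedArray and its methods are made up for illustration, and a real implementation would live in the engine's hashtable code):

```php
<?php
declare(strict_types=1);

// Hypothetical userland model of a "fluid" array that tracks how many
// entries of each type it currently holds.
final class TypeCountedArray
{
    /** @var array<int|string, mixed> */
    private array $values = [];

    /** @var array<string, int> type name => number of entries of that type */
    private array $typeCounts = [];

    public function set(int|string $key, mixed $value): void
    {
        // Overwriting an entry first releases the count of the old type.
        if (array_key_exists($key, $this->values)) {
            $this->decrement(get_debug_type($this->values[$key]));
        }
        $this->values[$key] = $value;
        $type = get_debug_type($value);
        $this->typeCounts[$type] = ($this->typeCounts[$type] ?? 0) + 1;
    }

    public function remove(int|string $key): void
    {
        if (!array_key_exists($key, $this->values)) {
            return;
        }
        $this->decrement(get_debug_type($this->values[$key]));
        unset($this->values[$key]);
    }

    // True if every entry has exactly the given type.
    // An empty array matches any type.
    public function isArrayOf(string $type): bool
    {
        return $this->typeCounts === [] || array_keys($this->typeCounts) === [$type];
    }

    private function decrement(string $type): void
    {
        if (--$this->typeCounts[$type] === 0) {
            unset($this->typeCounts[$type]);
        }
    }
}

$a = new TypeCountedArray();
$a->set(0, 1);
$a->set(1, 2);
var_dump($a->isArrayOf('int')); // bool(true)
$a->set(1, 'two');
var_dump($a->isArrayOf('int')); // bool(false)
$a->remove(1);
var_dump($a->isArrayOf('int')); // bool(true)
```

The type check is then a lookup in the counts rather than a walk over all entries, at the price of bookkeeping on every write.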

It also says "Another issue is that [...] typed properties may not be possible.". Why would that be the case? Essentially a typed property would just be a static array, which you describe in the section right below.

You also mention references. References to static arrays (the typed property case) are trivial. References to fluid arrays would probably require a runtime lookup of the contained references to determine the actual full type. That may be a valid tradeoff, given that the vast majority of arrays contain few or no references. ("Either you don't use references or you pay an O(contained references) overhead when passing around.")


So, reading the conclusion, I'm a bit disappointed by:

  * Halt efforts on typed arrays, as our current thoughts are that it
    is probably not worth doing, due to the complexities of how arrays
    work, and the minimal functionality that it would bring.

I'd truly appreciate more investigation into the topic, as I feel the functionality would definitely not be minor to PHP users.


Regarding the Collections PR, I personally really don't like it:

 * It implements something which would be trivial if we had reified
   generics. If this ever gets merged, and generics happen later, it
   would probably be an outdated quirk that the language has to carry
   around.
 * It's not powerful, but rather a quite limited implementation. No
   overrides of the built-in methods are possible. No custom
   operations either ("I want a dict where a specific property on the
   key is the actual unique key", "I want a custom callback to be
   executed for each modification"). It's okay as a PoC, but far from
   a complete enough implementation.
 * It's a very specialized structure/syntax, not extensible for
   userland at all. Some functionality like generic traits, where you'd
   actually monomorphize the contained methods would be much more
   flexible. E.g. class Articles { use Sequence<Article>; }. Much less
   specialized syntax, much more extensible. And generic traits would
   be doable, regardless of the rest of the generics investigation.
   In fact, generic traits (essentially statically replacing the
   generic arguments at link-time) would be a useful feature which
   would remain useful even if we had fully reified generics.
   I recognize that some functionality will need support of internal
   zend_object_handlers. But that's not a blocker, we might provide
   some default internal traits with PHP, enabling the internal class
   handlers.
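
To make the generic trait idea a bit more concrete: the "use Sequence<Article>;" syntax does not exist, so the following only shows, in plain PHP 8.1, what I would expect the link-time substitution to amount to. The names Article, Sequence_Article, add(), get() and titles() are made up for illustration.

```php
<?php
declare(strict_types=1);

final class Article
{
    public function __construct(public readonly string $title) {}
}

// Hand-written stand-in for what "use Sequence<Article>;" could expand to:
// the trait body with the generic argument replaced by Article.
trait Sequence_Article
{
    /** @var list<Article> */
    private array $items = [];

    public function add(Article $item): void
    {
        $this->items[] = $item;
    }

    public function get(int $index): Article
    {
        return $this->items[$index];
    }

    public function count(): int
    {
        return count($this->items);
    }
}

final class Articles implements Countable
{
    use Sequence_Article;

    // Userland can add its own operations next to the generated ones.
    public function titles(): array
    {
        return array_map(fn (Article $a): string => $a->title, $this->items);
    }
}

$articles = new Articles();
$articles->add(new Article('Generics'));
echo implode(', ', $articles->titles()), "\n";

try {
    $articles->add('not an article'); // rejected: the substituted type is checked
} catch (TypeError $e) {
    echo "TypeError\n";
}
```

Since the substituted methods carry ordinary type declarations, nothing is erased and the existing type checks apply unchanged.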

So to summarize, I would not continue on that path, but really invest in monomorphizable generic traits instead.


That leaves the last point, about erased generics being acceptable:

 * If we ever end up adding actual reified generics (maybe due to a
   renewed investigation in 5 years), we'll most likely want to retain
   the syntax. There may be some syntax which cannot be supported
   though, or semantics which would have to break existing code.
 * Docblocks are sort of an extensible and modifiable standard. Some
   type checkers allow e.g. List<positive-int>. But PHP certainly
   won't support it. So you will end up in a hybrid state where some
   functions use generics and some use only docblocks, because the
   native generics are not powerful enough. Further, if you use both
   (e.g. List<int> in definition, List<positive-int> in docblock), you
   also have to make sure to keep them in sync, because the generic
   type doesn't get verified through execution.
 * We're used to "all types specified are checked". And that's a good
   thing. It sets expectations.
   Now imagine we're introducing type aliases. "type IntList =
   List<int>;". Function signature "function processIntegers(IntList
   $list)". This looks like I could expect something actually being an
   IntList. There's no generic immediately in sight telling me that
   this is only going to provide me a List of arbitrary values. I will
   expect an IntList. Just like I will expect any bare "int" type to
   also give me an integer.
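
The mismatch in expectations can be shown with today's PHP, using a docblock as a stand-in for an erased generic (an assumption on my side: that an erased List<int> would be as unchecked at runtime as this annotation; processInteger() is a made-up helper for contrast):

```php
<?php
declare(strict_types=1);

/**
 * Stand-in for "function processIntegers(IntList $list)" with erased
 * generics: the element type is documented, but never enforced.
 *
 * @param list<int> $list
 */
function processIntegers(array $list): int
{
    $sum = 0;
    foreach ($list as $value) {
        // Nothing guarantees $value is an int here, so we have to check.
        $sum += is_int($value) ? $value : 0;
    }
    return $sum;
}

function processInteger(int $value): int
{
    return $value;
}

// Accepted without complaint, although it is not a list of ints.
echo processIntegers([1, 'two', 3]), "\n"; // 4

try {
    processInteger('two'); // a bare "int" is enforced
} catch (TypeError $e) {
    echo "TypeError\n";
}
```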

So, overall, I think erased generics set the wrong expectations and carry quite a risk of being a bad decision in light of possible future improvements.


I'd also like to leave a small side note on this question:

What generic features are acceptable to leave out to make the implementation more feasible?
I think this asks the wrong question. First figure out which generic features really cannot make it, then figure out whether omitting these features is acceptable.


Thanks all for investing time into this topic, I'm sure it will bring the language forward!

Bob
