Hi Rob
On 08.06.26 17:13, Rob Landers wrote:
On Sun, May 10, 2026, at 21:02, Seifeddine Gmati wrote:
- RFC: https://wiki.php.net/rfc/bound_erased_generic_types
- Implementation: https://github.com/php/php-src/pull/21969
For those not in discord, I spent nearly a week attempting to
implement reified generics on top of this branch to see how
challenging it would be.
I have a working implementation:
https://github.com/php/php-src/compare/master...bottledcode:php-src:reify
I think it's worth looking back at Nikita's comment when he researched
generics, and I think it holds today. It touches pretty much on
everything discussed.
https://www.reddit.com/r/PHP/comments/j65968/comment/g83skiz/
Classes are monomorphized and similar to Gina's substitution approach
-- a monomorphized class shares as much memory as the templated class
as possible, mainly holding its substitutions. This happens during
runtime, only when it cannot be done at compile time (which is mostly
already handled by this branch).
Nikita addresses this here:
The main problem with monomorphization is not so much performance (it
is theoretically good for performance, and even an otherwise reified
generics implementation may wish to monomorphize hot classes for
performance reasons), and more about memory usage. It requires a
separate class to be generated for each combination of type arguments.
If that also involves duplicating all methods (which may depend on
type arguments), this will need a lot of memory.
Monomorphization as a primary implementation strategy doesn't make a
lot of sense in PHP: It is important for languages like C++ or Rust,
where the ability to specialize code for specific types is highly
performance critical (and even so code size remains a big problem). In
PHP, we will not get enough performance benefit out of it to justify
the memory cost (again, when talking about blanket monomorphization).
Especially as it's not clear how it would be possible to cache
monomorphized methods in opcache (due to immutability requirements).
The only reason why monomorphization was suggested as an
implementation strategy at all is that it would make the
implementation of a naive generics model simpler: The premise is that
we just need to generate new class entries for everything, and the
rest of the engine doesn't need to know anything about generics.
However, this doesn't hold up once you consider variance for generic
parameters (Traversable<int> is a Traversable<int|string>), as such
relations cannot really be modelled without direct knowledge of the
generic parameters.
With full monomorphized generics, we can discover new variants of a
generic class effectively forever. This is especially bad for generic
data structures, which is also generics' biggest use-case. In practice,
this means higher memory consumption (incl. higher risk of frequent
opcache restarts due to filling shared memory) and more time looking up
classes, for no benefit. That said, it's hard to argue one way or
another without concrete numbers, so that should be the first step to
progress the conversation beyond what we had 6 years ago.
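
To make the growth argument concrete, here is a rough sketch in plain PHP (no generic syntax, so it runs on PHP 8 today). ClassTable and monomorphize() are hypothetical stand-ins for the class table and the specialisation step; they are not taken from php-src or from either branch. It only counts entries, it does not measure memory.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the engine's class table; not php-src code.
final class ClassTable
{
    /** @var array<string, array{template: string, args: array}> */
    private array $entries = [];

    /** Return the specialised class name, creating an entry on first use. */
    public function monomorphize(string $template, array $typeArgs): string
    {
        $name = $template . '<' . implode(',', $typeArgs) . '>';
        // A hit reuses the entry; a miss adds one that stays in the table.
        $this->entries[$name] ??= ['template' => $template, 'args' => $typeArgs];
        return $name;
    }

    public function count(): int
    {
        return count($this->entries);
    }
}

$table = new ClassTable();
$scalars = ['int', 'string', 'float', 'bool'];

// A map with two type parameters: every (K, V) pair is a separate class.
foreach ($scalars as $k) {
    foreach ($scalars as $v) {
        $table->monomorphize('Map', [$k, $v]);
    }
}

// Repeating a combination does not add an entry.
$table->monomorphize('Map', ['int', 'string']);

echo $table->count(), "\n"; // prints 16
```

Four scalar types already give 16 variants of a two-parameter class. Once user classes, unions, and nested generics are allowed as arguments, the set of possible keys is unbounded, which is the part that needs real numbers.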
Secondly, I was able to get limited inference "for free" with this
approach.
From my testing on your branch, the type inference is very limited.
First of all, Nikita explains why type inference is important to begin with:
Generics are already hard on a purely conceptual level -- while we
tend to talk about the implementation issues, as these are the
immediate blocker, there's plenty of design aspects that remain
unclear. One part that bothers me in particular is the question of
type inference:
function test(): List<int> {
// We don't want to write this:
return new List<int>(1, 2, 3);
// We want to write this:
return new List(1, 2, 3);
}
We certainly wouldn't want people to write out more types in PHP than
they would do in a modern statically typed language like Rust.
However, I don't really see how type inference for generic parameters
could currently be integrated into PHP, primarily due to the very
limited view of the codebase the PHP compiler has (it only sees one
file at a time). The above example is obvious, but nearly anything
beyond that seems to quickly shift into "impossible".
The above example seems to apply to your implementation (while working
in Seifeddine's branch):
class Box<T> {
public function __construct(public T $value) {}
}
// Cannot instantiate generic class Box without type arguments;
// type parameter T has no default
$box = new Box(42);
// Ok
$box = new Box::<int>(42);
PHP static analyzers are necessarily very capable at static type
inference, so this is at least a significant discrepancy between your
implementation and a type erasure approach. This also calls into question
the gradual typing narrative for type erasure (i.e. "we can add type
checks later"), because it's doubtful that a reified approach will
support the exact set of (correct) programs the erased approach does.
That's especially true because PHP does not have an official static
analyzer, and the third-party analyzers diverge in the details in practice.
It seems that for functions, T does not need to be specified or inferred
unless it is used, likely to circumvent the above limitation. This can
lead to confusing cases:
// Works
function a<T>(T $value) { b($value); }
function b<T>(T $value) {}
a(42);
// Breaks (also crashes with ASan due to access to uninitialized
// memory)
function a<T>(T $value) { b($value); }
function b<T>(T $value) { new Box::<T>($value); }
a(42);
// Works
function a<T>(T $value) { b::<T>($value); }
function b<T>(T $value) { new Box::<T>($value); }
a::<int>(42);
In other words, whether a(42) is safe to call for the signature
`function a<T>(T $value)` depends on its implementation, i.e. whether it
makes use of T. This introduces a variation of the function coloring
problem, where all callers must now specify generic types recursively.
The only way to fix this inconsistency (apart from actual type inference)
is to require specifying the type explicitly at all times.
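
The coloring effect can be reproduced in plain PHP by passing the type argument as an ordinary parameter. This is only a simulation under my own assumptions: null stands for "the caller did not specify T", and makeBox(), bIgnoresT() and bUsesT() are hypothetical names that mirror the examples above, not code from the branch.

```php
<?php
declare(strict_types=1);

// Plain-PHP simulation: T travels as a nullable string argument.
final class Box
{
    public function __construct(public string $T, public mixed $value) {}
}

function makeBox(?string $T, mixed $value): Box
{
    // Mirrors `new Box::<T>($value)`: T has to be known at this point.
    if ($T === null) {
        throw new LogicException('T is unbound');
    }
    return new Box($T, $value);
}

// This b never touches T, so an unbound T goes unnoticed.
function bIgnoresT(?string $T, mixed $value): void {}

// This b uses T, so every caller up the chain has to supply it.
function bUsesT(?string $T, mixed $value): Box
{
    return makeBox($T, $value);
}

function a(?string $T, mixed $value, callable $b): mixed
{
    return $b($T, $value);
}

a(null, 42, 'bIgnoresT');          // fine: T is never needed

try {
    a(null, 42, 'bUsesT');         // same call shape, now fails
} catch (LogicException $e) {
    echo $e->getMessage(), "\n";   // prints "T is unbound"
}

$box = a('int', 42, 'bUsesT');     // explicit T, threaded through a and b
echo $box->T, "\n";                // prints "int"
```

The signature of a() is identical in all three calls; only the body of the callee decides whether omitting T is safe.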
I ran into quite a few other ASan crashes when testing examples, as well
as with PHP's own test suite. Given that I'm not sure how complete the
implementation is intended to be, I won't go into those any further.
Ilija