Re: [PHP-DEV] [RFC] Pattern Matching

Larry Garfield Tue, 09 Dec 2025 09:24:06 -0800

Hi Bob

> I'd like to note that the class-access is very ugly.
> ...
> But what's not fine is using an inconsistent syntax for variable bindings 
> across different contexts. In arrays binding is just a bare variable. In 
> objects it suddenly needs a colon? What.
> ...
> This also works for future ADTs. Move::Forward&{ $amount }. Then, if there's 
> a desire to actually *positionally* match an object. Then it's logical to use 
> a parenthesized expression, for a tuple. I.e.:


We'd argue that using `()` and `{}` to differentiate between positional and 
named parameters is no more intuitive than a `:` prefix. To look at preexisting 
concepts in the language: Named vs. positional arguments aren't determined by 
the type of braces, but by whether they are preceded by the argument name. 
`(:$param)` can be viewed as an extension of `(param: $param)` where the 
redundant argument name is dropped, while `{ $param }` looks like something 
completely different.  (Technnically, adding this shorthand to named argument 
calls would be possible, though we are not proposing it here.)

We also recently discovered that this is exactly the approach Dart takes:
https://dart.dev/language/patterns#destructuring-class-instances

So there is prior art.  Also, this is only a question for the shorthand syntax. 
 If using the full version, there is no room for confusion.

Additionally, as noted in the RFC, we ran into issues for `{}` both when using 
and omitting `&`.  Without a `&`, `{}` is confused for a hook when in the 
default property value position.  With a `&`, runtime disambiguation is 
required for ADTs and a class constants, which is inconsistent with the rest of 
the language.  (Unless you can suggest a way to resolve that ambiguity, in 
which case we can consider it.)  More on this topic below.

> Positional binding is quite intuitively using parenthesis - you construct the 
> enum with Foo::Bar($var) and you read it back on the right hand side with 
> Foo::Bar($var).

Yes, this is precisely the syntax we had in mind for ADTs. So we're in 
agreement here.

> It naturally allows destructuring without class name.

This is an upside of the split syntax, yes. However, as noted, that led to 
parsing issues, which is why we moved away from it.

While a bit less visually pleasing, it would be possible to allow `_` or `*`, 
or even `object` as a pseudo-class name to indicate "any class."  That would be 
a much simpler solution to the "I don't care about the object type" question.  
Eg: `$p is _(x: 5)`.

> I've also heard a consideration about "Foo::Bar & { $var }" being ambiguous 
> with respect to "is Foo::Bar now a const or an ADT class". This may be 
> resolved in the VM. I don't consider this a major issue, and is simply 
> something which can be disambiguated at optimizer-time or run-time, depending 
> on what type of symbol it is.

PHP is very explicit when it comes to distinguishing member types 
syntactically. For example, in many languages, `foo.bar` could access a field, 
static field, constant, function reference, static function reference, subtype, 
etc. PHP doesn't do that, it uses `$foo->bar`, `foo::$bar`, `foo::bar`, 
`$foo->bar(...)`, `foo::bar(...)`, `foo\bar`, etc. It goes out of its way to 
make it obvious what kind of member is being accessed. This makes a lot of 
sense for PHP, because each PHP file is compiled in a fully isolated context 
where we frequently don't know what the class `Foo` looks like. This avoids a 
lot of guesswork for the engine.

There are very few exceptions to this rule, one being `Foo::bar()`, which is 
normally a static call but _can_ also be an instance call when `Foo` is an 
ancestor of the class of the current instance. This has already caused some 
issues in the past. We tried to automatically make closures static that don't 
use `$this`, which is unsound because of these hidden instance call. We feel 
it's wise to avoid adding more such cases.

An alternative would be to disallow matching against an ADT's case name 
entirely, allowing `$p is Point & {$x}` but not `$x is Option::Some & {$val}`.  
The latter would have to be positional only, `$x is Option::Some($y)`.  That 
seems like a rather arbitrary and unexpected restriction, however.

> I'm deeply unsatisfied by the handling of object properties:
>
> "Note that matching against a property's value implies reading that 
> property's value", "If the property is uninitialized, an error will be 
> thrown." and "If the property is undefined and none of the above apply, it 
> will evaluate to null and a Warning will be issued."
>
> This is wildly inconsistent with arrays:

The rationale for treating objects and arrays differently is that objects are 
almost always structured (meaning we know and should be able to rely on which 
properties it defines), while arrays are frequently not. The obvious exception 
you already mentioned is `stdClass`, although how useful or common that is in 
practice at this point is debatable.  (I cannot recall the last time I saw 
`json_decode()` called without the flag to use an associative array instead.)  
`stdClass` should not be fundamental in shaping how this pattern works. You can 
also efficiently cast `stdClass` to `array` (because that's how `stdClass` is 
implemented anyway) to use the array pattern.

That said, this isn't a hill either of us wants to die on, so if the consensus 
is to swallow such cases and just return false, we will go with that.  
(Meaning, other people, please weigh in here.)

> It also means that uninitialized properties forcibly throw.

With fairly few exceptions, an uninitialized property on an object that is 
passed back to a caller is a sign of a design flaw.  Even if the construction 
process is multi-step -- as it is in Crell/AttributeUtils or deserializers, 
factories, ORMs, etc. -- everything is initialized by the time the object is 
returned to the caller.  Leaving properties uninitialized, or unsetting them 
manually implicitly adds an undocumented "undefined" type to your type union. 
There's absolutely no indication that accessing a property may be unsafe, but 
it is. There are niche cases where unsetting properties is useful (e.g. 
breaking cycles, as you've mentioned privately), but such objects shouldn't 
escape to users with a half-initialized state.  So there's a 99% chance that if 
the developer pattern matches against an uninitialized property, it's a 
developer error and should be corrected, not silently suppressed.

The other issue is that `$p is Point(:$x)` looks like an infallible pattern 
(assuming `$p` is an instance of `Point`) where we just want to extract `$x`, 
but if `$point->x` is uninitialized this pattern would fail. Given this is 
almost certainly a mistake, it seems better to inform the user about it rather 
than silently returning `false`.

> E.g. (assuming something like "class ResponseOrError { string $type; 
> Exception $e; string $response; }"):

This is a classic example of when ADTs would be useful.

> Enforcing positional arrays however will be quite surprising if e.g. an entry 
> was removed:
> $a = [1, 2, 3];
> unset($a[1]);
> if ($a is [1, 3]) {
>     // huh? It's [1, 2 => 3], not [1, 3].
> }

I think you misunderstood the question here. `[1, 2]` and `[0 => 1, 1 => 2]` 
should always be equivalent in terms of key behavior. What we weren't in 
agreement about is whether `[1, 2]` should have additional guarantees about 
order, i.e. about whether `[1, 2]` need to appear in that order, or whether `[1 
=> 2, 0 => 1]` is also acceptable. Normally, values are accessed by key which 
makes order mostly irrelevant, but there are cases where the user might expect 
to see values in a specific order when iterating over the array. Implementing 
this order-consistency check in a performant way is also somewhat tricky, and 
it would always add some overhead no matter what.

--Larry Garfield

Re: [PHP-DEV] [RFC] Pattern Matching

Reply via email to