Hi
On 2025-12-01 22:36, Larry Garfield wrote:
> Hi folks. Ilija and I would like to present our latest RFC endeavor,
> pattern matching:
> https://wiki.php.net/rfc/pattern-matching
Thank you. I've already provided some feedback on the PR (and in Ilija's
DMs) after seeing it “show up” in php-src, but I promised to put it
on-list. I'm doing that now. This email is not a full review of the RFC,
just the parts that I already noted. I think I also noted them before in
the previous on-list “pre RFC” discussion.
1.
I'd like to see `$foo is $bar` with a single top-level variable binding
on the right side disallowed at compile time. This pattern is just an
elaborate way of writing an assignment, thus (almost) never useful, but
possibly confusing to folks coming from Python or just generally
unfamiliar with PHP's pattern matching. I'd be okay with allowing `$foo
is ($bar)` with the explicit parentheses, since this is an established
pattern to indicate “yes, I meant it like this” when using assignments
inside of conditionals, such as `while (($row = $statement->fetch()))`
as a short form of `while (($row = $statement->fetch()) !== false)`.
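For reference, the parenthesized-assignment idiom can be sketched in current PHP as follows. `FakeStatement` is a hypothetical stand-in for a PDO-like statement, not a real API; the only assumption is that `fetch()` returns `false` once no rows are left.

```php
<?php
// Hypothetical stand-in for a PDO-like statement: fetch() returns the next
// row, or false once the rows are exhausted.
final class FakeStatement
{
    private int $position = 0;

    /** @param list<array<string, mixed>> $rows */
    public function __construct(private array $rows) {}

    public function fetch(): array|false
    {
        return $this->rows[$this->position++] ?? false;
    }
}

$short = [];
$statement = new FakeStatement([['id' => 1], ['id' => 2]]);
// The extra parentheses signal that the assignment is intentional.
while (($row = $statement->fetch())) {
    $short[] = $row['id'];
}

$explicit = [];
$statement = new FakeStatement([['id' => 1], ['id' => 2]]);
// Explicit form: compares the fetched value against false.
while (($row = $statement->fetch()) !== false) {
    $explicit[] = $row['id'];
}
```

Both loops collect the same ids here. The two forms would only differ for a falsy row such as an empty array, which ends the short form early.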
2.
I'd like to see required parentheses around “combinator patterns”
(specifically `&` and `|`). The RFC currently claims:
> While patterns may resemble other language constructs, whatever follows
> `is` is a pattern, not some other instruction.
which is false. “Whatever follows `is` is a pattern, unless it no longer is a
pattern” would be more accurate, which does not help much in
intuitively determining where a pattern ends when visually scanning the
code, particularly with the spacing around the combinators that the RFC
suggests.
The main issue is that pattern matching embeds a DSL with its own syntax
inside of PHP, but without having a clear “pattern ends here” delimiter.
I've also looked at other programming languages with pattern matching
and most only support pattern matching at a small number of “special
locations” (e.g. as part of a `match()` construct or only within a
function signature) and also do not have a concept of “union” or
“intersection” patterns. PHP allows embedding patterns (and the
associated DSL) into arbitrary (complex) expressions, which is not
something that is commonly seen as far as I can tell. C# appears to
support it as well, but the overall syntax for pattern matching is quite
different there, which makes it hard to directly compare it.
The “atomic patterns” as a top-level are fine, since they are either a
single “word” without any spaces or already have clear delimiters (such
as the array pattern).
To give an example: Consider line wrapping within a single pattern (e.g.
because the class names get long or because the pattern matching
expression is deeply indented). The most natural way I can come up with,
without introducing parentheses, is the following:
    if (
        $foo is Foo(some: "stuff", :$here)
        & Bar(with: "more_stuff")
    ) { }
I find it incredibly non-obvious that `& Bar` still belongs to the
pattern. And with just a single non-whitespace character change it
becomes something entirely different, but also valid:
    if (
        $foo is Foo(some: "stuff", :$here)
        && Bar(with: "more_stuff")
    ) { }
I believe it is not unlikely that the former is interpreted as a typo
for the latter by someone inexperienced with pattern matching.
For `$foo is $bar&?User` there is some ambiguity as to whether that should have
been `($foo is $bar) ? User : …` (i.e. a ternary with a constant `User`
in the “then” part). It might be clear grammar-wise, but not necessarily
immediately to a reader.
The proposed pattern matching semantics are extremely powerful, but do
not come with any integrated guardrails, which makes them extremely
complex and easy to “hold wrong”. I find this a step backwards from the
recent developments making PHP safer and more predictable without
forcing users to learn all rules by heart.
We already had issues with “gobble up everything until you can't” with
short closures and pipes in 8.5. I fear the same happening with pattern
matching. As an example the (future scope) range pattern `$foo is
1..=10` can easily be typoed or misremembered as `$foo is 1...10` which
to my understanding would be valid PHP code equivalent to `($foo is 1.0)
. (0.10)` (which is not useful, but valid). Taking into account possible
guardrails right away can help ensure that future additions can be added
in a way that is consistent with existing “look and feel”.
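The tokenization part of this can be checked against current PHP without any pattern matching support. The sketch below only looks at the `1...10` portion, since `is` is not valid syntax today. It assumes the tokenizer extension is available, and the variable names are made up for illustration.

```php
<?php
// Lex `1...10` with PHP's own tokenizer to see where the lexer splits it.
$lexemes = [];
foreach (token_get_all('<?php 1...10;') as $token) {
    if (is_array($token)) {
        if ($token[0] === T_OPEN_TAG) {
            continue; // skip the opening tag
        }
        $lexemes[] = $token[1];
    } else {
        $lexemes[] = $token; // single-character tokens are plain strings
    }
}

// Evaluate the same expression: two float literals joined by concatenation.
$value = 1...10;
```

The lexer splits the input into `1.`, `.` and `.10` rather than producing an ellipsis token, so `$value` ends up as the string `"10.1"`.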
Best regards
Tim Düsterhus