Hi,

I read the PEP, and a few thoughts:

-----

I think one of the examples is some lib2to3 code? I think the matcher
syntax is really great for that case (parse trees). The matcher syntax is
definitely an improvement over the litany of helper functions and
conditionals otherwise needed.

That said, I have a hard time seeing a particular use for this kind of
complicated pattern matching outside of "heterogeneous trees" (for lack of a
better term) of objects. I've only really dealt with that problem with parse
trees, but perhaps that's just an artifact of the domains I've ended up
working in.

In any case, it might be useful to include some/more examples or use cases
that aren't as parser-centric.

-----

Question: How are True, False, None, ..., etc handled? What does this do?

match whatever:
  case True: ...
  case False: ...
  case None: ...
  case ...:

I would expect they would be treated as literals the same as e.g.
numbers/strings, yes? Sorry if I missed this in the PEP.

-----

I, too, had trouble understanding the __match__ protocol from the PEP text.
Brett's comments largely capture my thoughts about this.

-----

The need to use "." to indicate "look up name" to avoid "match anything"
seems like a big foot gun. Simple examples such as:

FOO = 1
match get_case():
  case FOO:
    print("you chose one")

clearly illustrate this, but the problem is present in any case expression:
a missing dot changes the meaning from "match this specific value" to
almost the opposite: "match any value". And all you really need to do is
miss a single leading dot anywhere in the case expression to trigger this.
I agree with Barry (I think he said this) that it seems like an easy cause
of mysterious bugs.

I think the foot-gun aspect derives directly from the change in how a
symbol is interpreted. Everywhere else in the language (or at least
everywhere I can think of atm), when you see "foo", you know it means some
sort of lookup of the name "foo" is occurring. The exception to this is
fairly simple: when there is some "assignment cue", e.g. "as", :=, =,
import, etc., and those assignment cues are always very close by (pretty
much always the leading/following token?). Anyways, my point is that
assignment has a cue close by.

The proposed syntax flips that and mixes it, so it's very confusing.
Sometimes a symbol is a lookup, sometimes it's an assignment.

The PEP talks a bit about this in the "alternatives for constant value
pattern" section. I don't find the rationale in that section particularly
convincing. It basically says using "$FOO" to act as "look up value named
FOO" is rejected because "it is new syntax for a narrow use case" and "name
patterns are common in typical code ... so special syntax for the common
case would be weird".

I don't find that convincing because it seems *more weird* to change the
(otherwise consistent) lookup/assignment behavior of the language for a
specific sub-syntax.

Anyways, when I rewrite the examples and use a token to indicate "matcher",
I personally find them easier to read. I feel this is because it makes the
matcher syntax feel more like templates or string interpolation (or things
of that nature) that have some "placeholder" that gets "bound" to a value
after being given some "input".

It also sort of honors the "assignment only happens with a localized cue"
behavior that already exists.

ORIGIN = 0
match get_point():
  case Point(ORIGIN, $end):
    ...
  case $default:
    print(default)

I will admit this gives me PHP flashbacks, but it's also very clear where
assignments are happening, and I can just use the usual name-lookup rules.
I just used $ since the PEP did.

As a bonus, I also think this largely mitigates the foot gun problem because
there's now a cue that a binding is happening, so it's easy to trigger the
"is that name already taken, is it safe to assign?" check we mentally perform.

In any case, this seems like a pretty fundamental either/or design decision
*someone* will have to make:

Either:
  names mean assignment, and the rules of what is a lookup vs assignment
are different with some special case support (i.e. leading dot).
Or:
  use some character to indicate assignment, and the lookup rules are the
same.

-----

Related to the above: I also raise this because, in my usage, I doubt I'll
be using it as much more than a switch statement. I rarely have to match
complicated patterns, but very often have a set of values that I need to
test against. The combination of Literal and exhaustive-case checking is
very appealing.

So I'm very often going to want to type, e.g.

ValidModes = Union[Literal[A], Literal[B], etc etc]
def foo(mode: ValidModes):
  match mode:
    case A: ...
    case B: ...
    case etc etc

And eventually I'm going to foot-gun myself with a missing dot.

-----

Related to the above, I *don't* find it particularly confusing that e.g.
"case Point(...)" does *not* initialize a Point. This feels like it might be
inconsistent with my whole thing above, but :shrug:. FWIW, I suspect it's
just that the leading "case" cue makes it easy to entirely turn off the
"parentheses means code gets called" logic in my mind-parser.

-----

Related to the above, perhaps an unadorned name shouldn't be allowed? e.g.
this should be invalid:

match get_shape():
  case shape:
    print(shape)

I raise this idea because of the foot-gun issue, but also because it
creates more ways of doing the same thing: binding the name to a value.
Using := doesn't seem like a particularly burdensome solution:

match shape := get_shape():
  case: # or *, or _, or whatever
    print(shape)

And then either only dotted names or patterns are allowed in cases, not
plain names.

-----

Making underscore a special match-anything-but-don't-bind struck me as a
bit odd. Aside from the language grammar rules, there aren't really any
"this is an OK name, this isn't" type of rules.

I think someone else mentioned using "*" instead of "_"? I had the exact
same thought. If it's not going to be bound to a name, why use an
otherwise valid name to not bind it to? I get the ergonomics of it, but it
seems like another special-case of how things get processed inside the case
expression.

-----

Why | instead of "or" ? "or" is used in other conditionals. This strikes me
as another special case of the syntax that differs from elsewhere in the
language.

-----

I agree with not having flat indentation. I think having "case" indented
from "match" makes it more readable overall.

-----

Anyways, thanks for reading. HTH.


On Tue, Jun 23, 2020 at 9:08 AM Guido van Rossum <gu...@python.org> wrote:

> I'm happy to present a new PEP for the python-dev community to review.
> This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and
> Talin.
>
> Many people have thought about extending Python with a form of pattern
> matching similar to that found in Scala, Rust, F#, Haskell and other
> languages with a functional flavor. The topic has come up regularly on
> python-ideas (most recently yesterday :-).
>
> I'll mostly let the PEP speak for itself:
> - Published: https://www.python.org/dev/peps/pep-0622/ (*)
> - Source: https://github.com/python/peps/blob/master/pep-0622.rst
>
> (*) The published version will hopefully be available soon.
>
> I want to clarify that the design space for such a match statement is
> enormous. For many key decisions the authors have clashed, in some cases we
> have gone back and forth several times, and a few uncomfortable compromises
> were struck. It is quite possible that some major design decisions will
> have to be revisited before this PEP can be accepted. Nevertheless, we're
> happy with the current proposal, and we have provided ample discussion in
> the PEP under the headings of Rejected Ideas and Deferred Ideas. Please
> read those before proposing changes!
>
> I'd like to end with the contents of the README of the repo where we've
> worked on the draft, which is shorter and gives a gentler introduction than
> the PEP itself:
>
>
> # Pattern Matching
>
> This repo contains a draft PEP proposing a `match` statement.
>
> Origins
> -------
>
> The work has several origins:
>
> - Many statically compiled languages (especially functional ones) have
>   a `match` expression, for example
>   [Scala](
> http://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html
> ),
>   [Rust](https://doc.rust-lang.org/reference/expressions/match-expr.html),
>   [F#](
> https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/pattern-matching
> );
> - Several extensive discussions on python-ideas, culminating in a
>   summarizing
>   [blog post](
> https://tobiaskohn.ch/index.php/2018/09/18/pattern-matching-syntax-in-python/
> )
>   by Tobias Kohn;
> - An independently developed [draft
>   PEP](
> https://github.com/ilevkivskyi/peps/blob/pattern-matching/pep-9999.rst)
>   by Ivan Levkivskyi.
>
> Implementation
> --------------
>
> A full reference implementation written by Brandt Bucher is available
> as a [fork](https://github.com/brandtbucher/cpython/tree/patma) of
> the CPython repo.  This is readily converted to a [pull
> request](https://github.com/brandtbucher/cpython/pull/2).
>
> Examples
> --------
>
> Some [example code](
> https://github.com/gvanrossum/patma/tree/master/examples/) is available
> from this repo.
>
> Tutorial
> --------
>
> A `match` statement takes an expression and compares it to successive
> patterns given as one or more `case` blocks.  This is superficially
> similar to a `switch` statement in C, Java or JavaScript (and many
> other languages), but much more powerful.
>
> The simplest form compares a target value against one or more literals:
>
> ```py
> def http_error(status):
>     match status:
>         case 400:
>             return "Bad request"
>         case 401:
>             return "Unauthorized"
>         case 403:
>             return "Forbidden"
>         case 404:
>             return "Not found"
>         case 418:
>             return "I'm a teapot"
>         case _:
>             return "Something else"
> ```
>
> Note the last block: the "variable name" `_` acts as a *wildcard* and
> never fails to match.
>
> You can combine several literals in a single pattern using `|` ("or"):
>
> ```py
>         case 401|403|404:
>             return "Not allowed"
> ```
>
> Patterns can look like unpacking assignments, and can be used to bind
> variables:
>
> ```py
> # The target is an (x, y) tuple
> match point:
>     case (0, 0):
>         print("Origin")
>     case (0, y):
>         print(f"Y={y}")
>     case (x, 0):
>         print(f"X={x}")
>     case (x, y):
>         print(f"X={x}, Y={y}")
>     case _:
>         raise ValueError("Not a point")
> ```
>
> Study that one carefully!  The first pattern has two literals, and can
> be thought of as an extension of the literal pattern shown above.  But
> the next two patterns combine a literal and a variable, and the
> variable is *extracted* from the target value (`point`).  The fourth
> pattern is a double extraction, which makes it conceptually similar to
> the unpacking assignment `(x, y) = point`.
>
> If you are using classes to structure your data (e.g. data classes)
> you can use the class name followed by an argument list resembling a
> constructor, but with the ability to extract variables:
>
> ```py
> from dataclasses import dataclass
>
> @dataclass
> class Point:
>     x: int
>     y: int
>
> def whereis(point):
>     match point:
>         case Point(0, 0):
>             print("Origin")
>         case Point(0, y):
>             print(f"Y={y}")
>         case Point(x, 0):
>             print(f"X={x}")
>         case Point():
>             print("Somewhere else")
>         case _:
>             print("Not a point")
> ```
>
> We can use keyword parameters too.  The following patterns are all
> equivalent (and all bind the `y` attribute to the `var` variable):
>
> ```py
> Point(1, var)
> Point(1, y=var)
> Point(x=1, y=var)
> Point(y=var, x=1)
> ```
>
> Patterns can be arbitrarily nested.  For example, if we have a short
> list of points, we could match it like this:
>
> ```py
> match points:
>     case []:
>         print("No points")
>     case [Point(0, 0)]:
>         print("The origin")
>     case [Point(x, y)]:
>         print(f"Single point {x}, {y}")
>     case [Point(0, y1), Point(0, y2)]:
>         print(f"Two on the Y axis at {y1}, {y2}")
>     case _:
>         print("Something else")
> ```
>
> We can add an `if` clause to a pattern, known as a "guard".  If the
> guard is false, `match` goes on to try the next `case` block.  Note
> that variable extraction happens before the guard is evaluated:
>
> ```py
> match point:
>     case Point(x, y) if x == y:
>         print(f"Y=X at {x}")
>     case Point(x, y):
>         print(f"Not on the diagonal")
> ```
>
> Several other key features:
>
> - Like unpacking assignments, tuple and list patterns have exactly the
>   same meaning and actually match arbitrary sequences.  An important
>   exception is that they don't match iterators or strings.
>   (Technically, the target must be an instance of
>   `collections.abc.Sequence`.)
>
> - Sequence patterns support wildcards: `[x, y, *rest]` and `(x, y,
>   *rest)` work similar to wildcards in unpacking assignments.  The
>   name after `*` may also be `_`, so `(x, y, *_)` matches a sequence
>   of at least two items without binding the remaining items.
>
> - Mapping patterns: `{"bandwidth": b, "latency": l}` extracts the
>   `"bandwidth"` and `"latency"` values from a dict.  Unlike sequence
>   patterns, extra keys are ignored.  A wildcard `**rest` is also
>   supported.  (But `**_` would be redundant, so it is not allowed.)
>
> - Subpatterns may be extracted using the walrus (`:=`) operator:
>
>   ```py
>   case (Point(x1, y1), p2 := Point(x2, y2)): ...
>   ```
>
> - Patterns may use named constants.  These must be dotted names; a
>   single name can be made into a constant value by prefixing it with a
>   dot to prevent it from being interpreted as a variable extraction:
>
>   ```py
>   RED, GREEN, BLUE = 0, 1, 2
>
>   match color:
>       case .RED:
>           print("I see red!")
>       case .GREEN:
>           print("Grass is green")
>       case .BLUE:
>           print("I'm feeling the blues :(")
>   ```
>
> - Classes can customize how they are matched by defining a
>   `__match__()` method.
>   Read the [PEP](
> https://github.com/python/peps/blob/master/pep-0622.rst#runtime-specification)
> for details.
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
> *Pronouns: he/him **(why is my pronoun here?)*
> <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/RFW56R7LTSC3QSNIZPNZ26FZ3ZEUCZ3C/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/L7KZRZ2YCGGG6XQIQJAWE53NIAD5ZX6G/
Code of Conduct: http://python.org/psf/codeofconduct/
