[Python-Dev] Re: The semantics of pattern matching for Python

Tobias Kohn Fri, 20 Nov 2020 11:02:51 -0800

 Hi Daniel and Mark,

Sorry for being slightly late to the party, but please let me add a few remarks of my own to the discussion here.


1. MUST HAVE PRECISELY DEFINED SEMANTICS

Yes, there are some aspects that we left open intentionally. Most prominently the question of how often the pattern matching engine will check whether the subject is an instance of a particular class. Take the following trivial example::


  match some_data:
      case Pair(12, 34):
          ...
      case Triple(12, 34, z):
          ...
      case Pair(12, y):
          ...
      case Pair(x, y):
          ...

In a perfect world, the compiler discovers that it must check whether ``some_data`` is an instance of ``Pair`` exactly once and not three times. This, of course, plays right into Mark's second point on efficiency and seems obvious enough. Yet, as soon as we are considering nested patterns, it becomes much less obvious whether the compiler is supposed to cache repeated isinstance-checks. Can we really expect that the compiler must discover that in both case clauses the first element is checked against the same class? Or would it make more sense to simply expect the compiler to potentially perform this ``Num`` instance check twice::


  match some_data:
      case [ Num(), 12 ]:
          ...
      case [ Num(), y, *z ]:
          ...

It is easy to think of cases where we accidentally end up calling an isinstance check more than once because the compiler could not prove that they are equal. Still, whenever possible we want to give the compiler the freedom to optimise the pattern matching statement by caching.

In a static language, all of this would not be an issue at all, of course. In Python, however, we end up being caught between its dynamic features and the desire to make pattern matching reasonably efficient. So, we ended up leaving open the question of how often the pattern matching engine is allowed or supposed to check instances. Naturally, if you go and write some isinstance-check on a class with side-effects, you can break it.

2. USERS SHOULD NOT HAVE TO PAY AN UNNECESSARY PERFORMANCE PENALTY TO USE PATTERN MATCHING


To quote Mark [1] here:

> Users should not have to pay an unnecessary performance penalty to use pattern matching.

Alright, what does this even mean? What is an unnecessary performance penalty? How should that be measured or compared?

Pattern matching is not just fancy syntax for an if-elif-statement, but a new way of writing and expressing structure. There is currently nothing in Python that fully compares to pattern matching (which is obviously why we propose to add it in the first place). So, do you want to compare a pattern matching structure to an if-elif-chain or rather an implementation using reflection and/or the visitor pattern? When implementing pattern matching, would we be allowed to trade off a little speed handling the first pattern for moving faster to patterns further down?

Do our PEPs really read to you like we went out of our way to make it slow or inefficient? Sure, we said let's start with an implementation that is correct and worry about optimising it later. But I thought this is 101 of software engineering, anyway, and am thus rather surprised to find this item on the list.


3. FAILED MATCHES SHOULD NOT POLLUTE THE ENCLOSING NAMESPACE

This is a slightly wider issue that has obviously sparked an entire discussion on this mailing list on scopes.

If there is a good solution that only assigns variables once the entire pattern matched, I would be very happy with that. However, I think that variables should be assigned in full before evaluating any guards, even at the risk of the guard failing and variables being assigned that are not used later on. Anything else would obviously introduce a mini-scope and lead to shadowing, which hardly improves anything with respect to legibility.


4. OBJECTS SHOULD BE ABLE TO DETERMINE WHICH PATTERNS THEY MATCH

Short version: no!

Class patterns are an extension of instance checks. Leaving out the meta-classes at this point, it is basically the class that is responsible for determining if an object is an instance of it. Pattern matching follows the same logic, whereas Mark suggests turning that upside-down. Since you certainly do not want to define the machinery in each instance, you end up delegating the entire thing to the class, anyway.

I find this suggestion also somewhat strange in light of the history of our PEPs. We started with a more complex protocol that would allow for customised patterns, which was then ditched because it was felt to be too complicated. There is still a possibility to add it later on, of course. But here we are with Mark proposing to introduce a complex protocol again. It would obviously also mean that we could not rely as much on Python's existing infrastructure, which makes efficient pattern matching harder, again. I completely fail to see what should be gained by this.


5. IT SHOULD DISTINGUISH BETWEEN A FAILED MATCH AND AN ERRONEOUS PATTERN

This seems like a reasonable idea. However, I do not think it is compatible with Python's existing culture. Let's pick up Mark's example of an object ``RemoteCount`` with two attributes ``success`` and ``total``. You can then execute the following line in Python without getting any issues from the interpreter::


  my_remote_count.count = 3

Python does not discover that the attribute should have been ``total`` rather than ``count`` here. From a software engineering perspective, this is unfortunate and I would be surprised if there was anyone on this list who was never bitten by this. But this is one of the prices we have to pay for Python's elegance in other aspects. Requiring that pattern matching suddenly solves this is just not realistic.


6. SYNTAX AND SEMANTICS

There are some rather strange elements in this, such as the idea that the OR-pattern should be avoided. In the matching process, you are also talking about matching an expression (under point 2), for instance; you might not really be aware of the issues of allowing expressions in patterns in the first place.

---

It is most certainly a good idea to start with guiding principles to then design and build a new feature like pattern matching. Incidentally, this is what we actually did before going into the details of our proposal. As evidenced by extensive (!) documentation on our part, there is also a vision behind our proposal for pattern matching.

In my view, Mark's proposal completely fails to provide a vision or any rationale for the guiding principles, other than reference to some mysterious "user" who "should" or "should not" do certain things. Furthermore, there are various obvious holes and imprecisions that would have to be addressed.


Kind regards,
Tobias

[1] https://github.com/markshannon/pattern-matching/blob/master/precise_semantics.rst


Quoting Daniel Moisset <dfmois...@gmail.com>:

[sorry for the duplicate, meant to reply-all]
     
Thank you for this approach, I find it really helpful to put the conversation in these terms (semantics and guiding principles). This is not an answer to the proposal (which I've read and helps me contextualize) but to your points below and how they apply to PEP-634. I'm also answering personally, with a reasonable guess about what the other authors of 634-636 would agree, but they may correct me if I'm wrong.

On Mon, 16 Nov 2020 at 14:44, Mark Shannon <m...@hotpy.org> wrote:
(...)
I believe that a pattern matching implementation must have the following
properties:

* The semantics must be precisely defined.
* It must be implemented efficiently.
* Failed matches must not pollute the enclosing namespace.
* Objects should be able determine which patterns they match.
* It should be able to handle erroneous patterns, beyond just syntax errors.

PEP 634 and PEP 642 don't have *any* of these properties.
      
     Let me answer this one by one:
      
     1. "The semantics must be precisely defined":
If this happens in PEP634 I don't think it was intentional, and I'm pretty sure the authors would be happy to complete any incompleteness that it has. I would happily have a more accurate description (I drafted a non-official one for a much earlier version of PEP-622, https://github.com/dmoisset/notebook/blob/master/python/pep622/semantic-specs.md ). Can you clarify where you see these imprecisions?
      
     2. "It must be implemented efficiently":
I don't think "efficient implementation" was a priority in PEP634, although I saw your proposal defines this as "same performance as the equivalent if statement", and I'm quite sure that level of performance can be achieved (if it isn't already by Brandt's implementation). Finding the best way to optimise wasn't a priority, but I think if there was anything in our implementation that would make optimisations harder we would consider changing it. Do you think anything like that has been presented?
      
     3. "Failed matches must not pollute the enclosing namespace": 
This for me is one of the less-desirable parts of the proposal, and was agreed more as a matter of practicality and an engineering tradeoff. If you have a reasonable way of solving this (like putting matched variables on the stack and popping them later) I'd be much happier putting that in.
      
     4. "Objects should be able determine which patterns they match."
This is something that you and I, and most of the authors of 622 agree on. What we found out when discussing this is that we weren't clear on what customization to open up and how. Some customization options added a lot of complexity at the cost of performance, some others were very simple but it wasn't clear that they would be actually useful, or extensible in the future. This has a lot to do with this being a somewhat new paradigm in Python, and our lack of knowledge on what the user community may do with it beyond what we imagined. So the decision was "pattern matching as it is presented without extensibility is useful, let's get this in, and once we see how it is used in the wild we'll understand better what kind of extensibility is valuable". For a crude analogy, imagine trying to get the descriptor protocol right when the basic python object model was being implemented. These things happened at different times as the use of the language evolved, and my opinion is that customization of the match protocol must follow a similar path.
      
     5. "It should be able to handle erroneous patterns, beyond just syntax errors."
I'll be answering this based on the example in your document, matching RemoteCount(success=True, count=count) where RemoteCount is a namedtuple. The code is likely an error, and I'm in general all for reporting errors early, but the kind of error detection proposed here is the kind of error that python normally ignores. I find that example really similar to the kind of error you could make writing "if remcount.count == 3: ..." or going beyond this example "maxelems = configfile.read(); if len(elems) == maxelems: ...". There are many type errors that python ignores (especially related to equality), and Python has already made the decision of allowing mixed type equality comparisons everywhere, so why do you think pattern matching should be different with respect to this? In my opinion trying to "fix" this (or even agreeing if this is a bug or not) is a much more general issue unrelated to pattern matching. Given the current status-quo I normally trust python type checkers to help me with these errors, and I'd expect them to do the same with the "erroneous" match statement. If there are other examples you had in mind when you wrote this I'd also be happy to discuss those.
      
I'll try to get some time to review your specific counterproposal later, thanks for it.
      
     Best,
         Daniel
 

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SIK62L2A3X37G52OR34HGHK6BIDLKGAT/
Code of Conduct: http://python.org/psf/codeofconduct/
