Hi Daniel,
On 06/02/2021 7:47 pm, Daniel Moisset wrote:
Hi Mark,
I think some of these issues have already been raised and replied to (even
if no agreement has been reached), but this is a good summary, so let me
reply with a summary of responses.
On Sat, 6 Feb 2021 at 15:51, Mark Shannon <m...@hotpy.org
<mailto:m...@hotpy.org>> wrote:
Hi,
Since a decision on PEP 634 is imminent, I'd like to reiterate some
concerns that I voiced last year.
I am worried about the semantics and implementation of PEP 634.
I don't want to comment on the merits of pattern matching in
general, or
the proposed syntax in PEP 634 (or PEP 640 or PEP 642).
Semantics
---------
1. PEP 634 eschews the object model, in favour of adhoc instance
checks,
length checks and attribute accesses.
This is in contrast to almost all of the rest of the language,
which
uses special (dunder) methods:
All operators,
subscripting,
attribute lookup,
iteration,
calls,
tests,
object creation,
conversions,
and the with statement
AFAICT, no justification is given for this.
Why can't pattern matching use the object model?
No one has said that "we can't". It's just that "we don't have to". The
underlying mechanisms used by pattern matching (instance check, length,
attribute access) already have their defined protocols and support
within the object model. It's analogous to the way in which iterable
unpacking didn't need to define its own object model special methods,
because the existing iteration mechanism used in for loops was sufficient.
You seem to be jumping on my use of the word "can't".
I should have said
"Why *doesn't* PEP 634 use the object model?"
As for unpacking, it builds on iteration in a way that is clear and precise.
For example, I can desugar:
a, b = t
into:
try:
    __tmp = iter(t)
except TypeError:
    raise TypeError("cannot unpack non-iterable ...")
try:
    __tmp_a = next(__tmp)
    __tmp_b = next(__tmp)
except StopIteration:
    raise ValueError("not enough values ...")
try:
    next(__tmp)
except StopIteration:
    pass
else:
    raise ValueError("too many values ...")
a = __tmp_a; b = __tmp_b
Note that variables starting with "__tmp" are invisible.
Why not do something similar for PEP 634?
This does not exclude possible future extensions to the object model to
include a richer protocol like the one described in
https://www.python.org/dev/peps/pep-0622/#custom-matching-protocol
(where it also describes why we're not proposing that *now*, why it can
be done later, and why we think it's best to do it later).
I don't see how this is relevant. It is the semantics of PEP 634 as
proposed that concerns me.
PEP 343 (the "with" statement) added the __enter__ and __exit__ methods
to the object model, and that works very well.
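For illustration (a sketch of my own, not text from PEP 343), the whole
protocol is just two special methods that the compiler routes everything
through:

```python
# A minimal context manager: the "with" statement desugars into
# calls to __enter__ on entry and __exit__ on exit, nothing more.
class Resource:
    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False  # do not suppress exceptions

r = Resource()
with r:
    r.events.append("body")
print(r.events)  # ['enter', 'body', 'exit']
```

Because the protocol is defined by the object model, any implementation
of Python knows exactly what calls a "with" block must make.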
2. PEP 634 deliberately introduces a large area of undefined behaviour
into Python.
https://www.python.org/dev/peps/pep-0634/#side-effects-and-undefined-behavior
Python has, in general, been strict about not having undefined
behaviour.
Having defined semantics means that you can reason about programs, even
non-idiomatic ones.
[This is not unique to Python, it is really only C and C++ that have
areas of undefined behaviour]
The C standard uses a very peculiar definition of "undefined behaviour"
(I'm not familiar enough with the C++ one to assert anything; I'll assume
it's the same), where for a certain set of programs any resulting
behaviour is valid, even at compile time (so a compiler that deletes all
your files when trying to compile "void f() {int a[10]; a[10]=0;}" is
standard compliant). Comparing that with the use of the term "undefined
behaviour" in the PEP is not very useful, because even if they are the
same words, they don't have the same meaning.
If you want to compare it with the C standards, the term we'd use would
be "implementation-defined behaviour". Python has a lot of those. For
example, the output of all these Python programs can change between
implementations (and some of them even change between versions of CPython):
* print({3,2,1})
* print(hash("hello"))
* if id(1) == id(1): print("hello")
* import sys; print(sys.getsizeof([]))
Some order of operations is also implementation dependent for example in
def foo():
    print(open("/etc/passwd").read())
foo()
The moment where file closing happens is implementation dependent.
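To illustrate (a sketch of my own, using io.StringIO as a stand-in for a
real file), a "with" block is the portable way to make the close happen
at a defined moment on every implementation, rather than whenever the
garbage collector gets around to it:

```python
import io

# Stand-in for open("/etc/passwd"): any IOBase object works the same way.
buf = io.StringIO("root:x:0:0\n")
with buf as f:
    first_line = f.read()
# On block exit the file is closed deterministically, on CPython,
# PyPy, or any other implementation.
print(buf.closed)  # True
```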
The existence of a few cases of undefined behaviour (e.g. race
conditions) does not excuse adding a whole lot more, especially when
there is no good reason.
The section you linked to introduces behaviour in similar lines to all
of the above.
I can see no good reason for adding undefined behaviour. It doesn't
help
anyone.
It helps for the point 3 that you mention (See below)
No it doesn't.
It is an impediment to performance for the reasons I give here:
The lack of precise semantics makes programs harder to understand, and
it makes the language harder to implement.
If the semantics aren't specified, then the implementation becomes the
specification.
This bakes bugs into the language and makes it harder to maintain,
as bug-for-bug compatibility must be maintained.
3. Performance
PEP 634 says that each pattern must be checked in turn.
That means that multiple redundant checks must be performed on (for
example) a sequence if there are several mapping patterns.
This is unnecessarily slow.
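To make the repetition concrete (a hand-written sketch of my own, not
the actual compiled code; the hypothetical Counted class just counts its
length checks), here is the work a run of sequence patterns implies:

```python
# Each sequence pattern performs its own length check on the subject,
# so several patterns in a row can repeat the same work.
class Counted(list):
    calls = 0

    def __len__(self):
        Counted.calls += 1
        return super().__len__()

subject = Counted([1, 2, 3])
# Three sequence patterns of different arities, checked in turn:
for required in (1, 2, 3):
    if len(subject) == required:
        pass  # a case body would run here
print(Counted.calls)  # 3 length checks for a single subject
```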
What PEP 634 says about this (unless we're reading different sections; a
quote could help here) is that the semantics are those of selecting the
*first case block whose pattern succeeds in matching it and whose
guard condition (if present) is "truthy"*.
As long as the semantics are respected, a smart compiler could reduce
some of the redundant checks (and that's *precisely* why the order of
the checks is left as implementation dependent).
What is this mythical "smart compiler"? Are you expecting some research
breakthrough to solve this, or an existing algorithm? If the latter,
what algorithm?
It's important here to separate "sequential semantics" vs "implemented
as sequential checks". For an analogy, look at the "serializable"
concept in databases, where the outcome of multiple statements must be
equal to the outcome of operations executed serially, but in practice
databases can find more efficient non-serial executions with the same
outcome. The PEP specifies semantics, not how the implementation must
behave.
I'm puzzled by that last sentence.
Isn't that what semantics are? The meaning of something.
How the implementation behaves determines the meanings of programs.
If the individual steps in a sequence have observable side-effects, then
it changes the semantics to remove, add to, or reorder those
side-effects. All of the operations you propose using to implement
pattern matching potentially have side effects.
Implementation
--------------
My main concern with the implementation is that it pushes too much work
into the interpreter.
Much of that work can and should be done in the compiler.
For example, deep in the implementation of the MATCH_CLASS instruction
is the following comment:
https://github.com/brandtbucher/cpython/blob/patma/Python/ceval.c#L981
Such complex control flow should be handled during compilation, rather
than in the interpreter.
Giant instructions like MATCH_CLASS are likely to have odd corner cases,
and may well have a negative impact on the performance of the rest of
the language.
It is a lot easier to reason about a sequence of simple bytecodes, than
one giant one with context-dependent behaviour.
We have spent quite a lot of effort over the last few years
streamlining
the interpreter.
Adding these extremely complex instructions would be a big backward
step.
What you say about this sounds very reasonable to me, although I don't
understand how it is a "concern with PEP 634". The implementation is not
part of the PEP, and I'm pretty sure all parties here (the PEP authors,
the PEP critics, the SC) would be more than happy to see this
implementation improving.
You say that the implementation is not part of the PEP, but what happens
if the PEP is accepted?
The implementation then becomes part of CPython, which is the de facto
specification of Python, and because the PEP is not precisely defined,
the implementation becomes the de facto specification of pattern
matching in Python.
You say that you would be happy to see the implementation improving.
Are you going to do it? If not, then who?
If you can't do it now, then at least explain what optimisations are
proposed.
Cheers,
Mark.
I'm pretty sure we both have presented these ideas before, so I'm not
expecting either of us to change our minds at this point, but I'm leaving
this reply here for future reference, in case other people want to see
the different positions in this discussion.
Best,
D.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/K34246PBAEJDOHKMJX35UC2COYLC4OXZ/
Code of Conduct: http://python.org/psf/codeofconduct/