Hi Brandt,
On 30/03/2021 5:25 pm, Brandt Bucher wrote:
Overall, I am still uncomfortable with PEP 653, and would probably not support
its acceptance.
Although it has thankfully become a much less radical proposal than it was a
few weeks ago (thanks, Mark, for your attention to our feedback), I feel that
the rules it binds implementations to are *very* premature, and that the new
mechanisms it introduces to do so only modestly improve potential performance
at great expense to the ease of learning, using, and maintaining code using
structural pattern matching.
A few notes follow:
For example, using `sympy`, we might want to write:
```
# a*a == a**2
case Mul(args=[a, b]) if a == b:
return Pow(a, 2)
```
Which requires the sympy class `Symbol` to "self" match. For `sympy` to support
this pattern with PEP 634 is possible, but a bit tricky. With this PEP it can be
implemented very easily.
Maybe I'm missing something, but I don't understand at all how the provided
code snippet relies on the self-matching behavior.
No I'm missing something.
That should read
case Mul(args=[Symbol(a), Symbol(b)) if a == b:
...
I'll fix that in the PEP thanks.
Have the maintainers of SymPy (or any large library supposedly benefitting here) come out
in support of the PEP? Are they at least aware of it? Have they indicated that the
proposed idiom for implementing self-matching behavior using a property is truly too
"tricky" for them?
Have you identified any stdlib classes that would benefit greatly from this?
For me, `__match_class__` feels like a feature without demonstrated need. Even
if there is a great demand for this, I certainly think that there are far
better options than the proposed flagging system:
The distinction between those classes that have the default behavior and
those that match "self" is from PEP 634. I didn't introduce it.
I'm just proposing a more principled way to make that distinction.
- A `@match_self` class decorator (someone's bound to put one on PyPI, at any
rate).
- Allowing `__match_args__ = None` to signal this case (an option we previously
considered, and my personal preference).
...both of which can be added later, if needed.
Further, PEP 634 makes it very easy for libraries to support Python versions
with *and* without pattern matching (something I consider to be an important
requirement). The following class works with both 3.9 and 3.10:
```
class C(collections.abc.Sequence):
...
```
While something like this is required for PEP 653:
```
class C:
if sys.version_info >= (3, 10):
from somewhere import MATCH_SEQUENCE
__match_container__ = MATCH_SEQUENCE
...
```
Or
class C:
__match_container__ = 1 # MATCH_SEQUENCE
Which is one reason the PEP states that the values of MATCH_SEQUENCE,
etc. will never change.
PEP 634 relies on the `collections.abc` module when determining which patterns
a value can match, implicitly importing it if necessary. This PEP will
eliminate surprising import errors and misleading audit events from those
imports.
I think that a broken `_collections_abc` module *should* be surprising. Is
there any reasonable scenario where it's expected to not exist, or be not be
fit for this purpose?
No reasonable scenario, but unreasonable scenarios happen all too often.
And I'm not sure how an audit event for an import that is happening could be considered
"misleading"... I certainly wouldn't want it suppressed.
It's misleading because a match statement doesn't include any explicit
imports.
Looking up a special attribute is much faster than performing a subclass test
on an abstract base class.
How much faster? A quick benchmark on my machine suggests less than half a
microsecond. PEP 634 (like PEP 653) already allows us to cache this information
for the subject of a match statement, so I doubt that this is actually a real
issue in practice. An indeed, with the current implementation, this test isn't
even performed on the most common types, such as lists, tuples, and
dictionaries.
Half a microsecond is thousands of instructions on a modern CPU.
That is a long time for a single VM operation.
At the very least, PEP 653's confusing new flag system seems to be a *very*
premature optimization, seriously hurting usability for a modest performance
increase. (Using them wrongly also seems to introduce a fair amount of
undefined behavior, which seems to go against the PEP's own motivation.)
Why do you say it is a premature optimization?
It's primary purpose is reliability and precise semantics.
It is more optimizable, I agree, but that is hardly premature.
You also say it is confusing, but I think it is simpler than the
workarounds to match "self" that you propose.
This is very subjective though. Evidently we think differently.
If the value of `__match_args__` is not as specified, then the implementation
may raise any exception, or match the wrong pattern.
I think there's a name for this sort of behavior... ;)
Indeed, but there is only undefined behavior if a class violates clearly
specified rules. The undefined behavior in PEP 634 is much broader.
We already tolerate some amount of undefined behavior. For example,
dictionary lookup is also undefined for classes which do not hash properly.
A couple of other, more technical notes:
- PEP 653 requires mappings to have a `keys()` method that returns an object
supporting set inequality operations. It is not really that common to find this
sort of support in user code (in my experience, it is more likely that
user-defined `keys()` methods will return iterables). It's not even clear to me
if this is an interface requirement for mappings in general. For example,
`weakref.WeakKeyDictionary` and `weakref.WeakValueDictionary` presently do not
work with PEP 653's requirements for mapping patterns, since their `keys()`
methods return iterators.
- Treating `__getitem__` as pure is problematic for some common classes (such
as `defaultdict`). That's why we use two-argument `get()` instead.
Thanks for pointing that out.
I'd noted that PEP 634 used `get()`, which is why I inserted the guard
on keys beforehand. Clearly that is insufficient.
I'll update the semantics to use the two-argument `get()`. It does seem
more robust.
As well-fleshed out as the pseudocode for the matching operations in this PEP
may be, examples like this suggest that perhaps we should wait until 3.11 or
later to figure out what actually works in practice and what doesn't. PEP 634
took a full year of work, and the ideas it proposed changed substantially
during that time (in no small part because we had many people experimenting
with how an actual implementation interacted with real code).
I fully understand that a lot of work went into PEP 634.
Which is why I am using the syntax and much of the semantics of PEP 634
as I can.
The test suite is proving very useful, thanks.
The problem with waiting for 3.11 is that code will start to rely on
some of the implementation details of pattern matching as it is now,
and that our ability to optimize it will be delayed by a year.
Cheers,
Mark.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/2IANQRLB4JR7RBNOSI2PLBQZP4HTMU54/
Code of Conduct: http://python.org/psf/codeofconduct/