[Python-Dev] Re: PEP 622 aspects

Koos Zevenhoven Sun, 19 Jul 2020 14:47:01 -0700

On Sun, Jul 19, 2020 at 3:00 PM Tobias Kohn <ko...@tobiaskohn.ch> wrote:


> Quoting Koos Zevenhoven <k7ho...@gmail.com>:
>
> > (1) Class pattern that does isinstance and nothing else.
> >
> > If I understand the proposed semantics correctly, `Class()` is
> equivalent to checking `isinstance(obj, Class)`, also when `__match_args__`
> is not present. However, if a future match protocol is allowed to override
> this behavior to mean something else, for example `Class() == obj`, then
> the plain isinstance checks won't work anymore! I do find `Class() == obj`
> to be a more intuitive and consistent meaning for `Class()` than plain
> `isinstance` is.
> >
> > Instead, the plain isinstance check would seem to be well described by a
> pattern like `Class(...)`. This would allow isinstance checks for any
> class, and there is even a workaround if you really want to refer to the
> Ellipsis object. This is also related to the following point.
> >
> > (2) The meaning of e.g. `Class(x=1, y=_)` versus `Class(x=1)`
> >
> > In the proposed semantics, cases like this are equivalent. I can see why
> that is desirable in many cases, although `Class(x=1, ...)` would make it
> more clear. A possible improvement might be to add an optional element to
> `__match_args__` that separates optional arguments from required ones
> (although "optional" is not the same as "don't care").
>
>
> Please let me answer these two questions in reverse order, as I think it
> makes more sense to tackle the second one first.
>
Possibly. Although I do find (1) a more serious issue than (2). To not have
isinstance available by default in a consistent manner would definitely be
a problem in my opinion. But the way I proposed to solve (1) may affect the
user interpretations of (2).

> ***2. Attributes***
>
> There actually is an important difference between `Class(x=1, y=_)` and
> `Class(x=1)` and it won't do to just write `Class(x=1,...)` instead.  The
> form `Class(x=1, y=_)` ensures that the object has an attribute `y`.  In
> a way, this is where the "duck typing" is coming in.
>
Ok, that is indeed how the current class pattern match algorithm works
according to the current PEP 622. Let me rephrase the title of problem (2)
slightly to accommodate this:

"(2) The meaning of e.g. `Class(x=1, y=_)` versus `Class(x=1)` (when the
object has attributes x, y and "x", "y" are in __match_args__)"

> The class of an object and its actual shape (i.e. the set of attributes it
> has) are rather loosely coupled in Python: there is usually nothing in the
> class itself that specifies what attributes an object has (other than the
> good sense to add these attributes in `__init__`).
>
Usually, it is bad practice to define classes whose interface is not or
cannot be specified. Python does, however, even allow hacks such as
tacking an extra attribute onto an object where it doesn't really "belong".

> Conceptually, it therefore makes sense to not only support `isinstance`
> but also `hasattr`/`getattr` as a means to specify the shape/structure of
> an object.
>
Here we agree (although not necessarily regarding "therefore").

> Let me give a very simple example from Python's `AST` module.  We know
> that compound statements have a field `body` (for the suite) and possibly
> even a field `orelse` (for the `else` part).  But there is no common
> superclass for compound statements.  Hence, although it is shared by
> several objects, you cannot detect this structure through `isinstance`
> alone.  By allowing you to explicitly specify attributes in patterns, you
> can still use pattern matching notwithstanding:
> ```
> match node:
>     case ast.stmt(body=suite, orelse=else_suite) if else_suite:
>         # a statement with a non-empty else-part
>         ...
>     case ast.stmt(body=suite):
>         # a compound statement without else-part
>         ...
>     case ast.stmt():
>         # a simple statement
>         ...
> ```
>
So this is an example of a combination of duck-typing and a class type. I
agree it's good to be able to have this type of matching available. I can
only imagine the thought process that led you to bring up this example, but
I feel that we got stuck on whether an attribute is present or not, which
is a side track regarding the issues I pointed out.

Python can be written in many ways, but I'm not sure that the above example
is representative of how duck typing usually works. I see a lot more
situations where you either care about isinstance or about some duck typing
pattern – usually not both.
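
For reference, the three cases in the quoted example can be approximated today with plain `isinstance` and `getattr`. This is only a rough sketch of what the patterns would check, not the PEP's algorithm; the helper names `classify` and `first_stmt` and the returned strings are made up for illustration.

```python
import ast

_MISSING = object()  # sentinel meaning "the attribute is not present"


def classify(node):
    """Approximate the three quoted cases with isinstance + getattr."""
    if not isinstance(node, ast.stmt):
        return "not a statement"
    body = getattr(node, "body", _MISSING)
    orelse = getattr(node, "orelse", _MISSING)
    # ast.stmt(body=suite, orelse=else_suite) if else_suite
    if body is not _MISSING and orelse is not _MISSING and orelse:
        return "compound statement with non-empty else-part"
    # ast.stmt(body=suite)
    if body is not _MISSING:
        return "compound statement without else-part"
    # ast.stmt()
    return "simple statement"


def first_stmt(source):
    """Hypothetical helper: parse source and return its first statement."""
    return ast.parse(source).body[0]


for src in ("if x:\n    pass\nelse:\n    pass", "with a:\n    pass", "pass"):
    print(repr(src), "->", classify(first_stmt(src)))
```

Note that both the class check and the attribute checks are needed here, which is the combination discussed above.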

> The very basic form of class patterns could be described as `C(a_1=P_1,
> a_2=P_2, ...)`, where `C` is a class to be checked through `isinstance`,
> and the `a_i` are attribute names to be extracted by means of `getattr`
> to then be matched against the subpatterns `P_i`.  In short: you specify
> the structure not only by class, but also by its actual structure in form
> of required attributes.
>
Ok, back on track now.  But this won't do, if we want to be able to access
isinstance for all classes by default. If this form is applied to all
classes, then no class will have anything different from that.  My version
was a bit different: to introduce the *very basic* form that is spelled
Class(...), and this would have the same meaning (isinstance) for ALL
classes.
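
To make the difference concrete, here is a rough sketch of the quoted "very basic form" as an ordinary function. It assumes that subpatterns are either literals compared with `==` or a wildcard; the names `matches_class_pattern`, `ANY` and `Point` are hypothetical and not part of the PEP.

```python
_MISSING = object()  # sentinel meaning "the attribute is not present"
ANY = object()       # stands in for the wildcard pattern `_`


def matches_class_pattern(obj, cls, **attr_patterns):
    """Sketch of C(a_1=P_1, ...): isinstance first, then getattr per name."""
    if not isinstance(obj, cls):
        return False
    for name, pattern in attr_patterns.items():
        value = getattr(obj, name, _MISSING)
        if value is _MISSING:
            return False  # a named attribute must exist, even for a wildcard
        if pattern is not ANY and value != pattern:
            return False
    return True


class Point:
    """Hypothetical class whose instances may or may not have `y`."""

    def __init__(self, x, **extra):
        self.x = x
        self.__dict__.update(extra)


print(matches_class_pattern(Point(1, y=2), Point, x=1, y=ANY))  # True
print(matches_class_pattern(Point(1), Point, x=1, y=ANY))       # False
print(matches_class_pattern(Point(1), Point, x=1))              # True
print(matches_class_pattern(Point(1), Point))                   # True
```

With no attribute subpatterns, the sketch reduces to a bare isinstance check. That is the behaviour I would like to have available for ALL classes, regardless of what a future match protocol does with `Class()`.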

> Particularly for very simple objects, it becomes annoying to specify the
> attribute names each time.  Take, for instance, the `Num`-expression from
> the AST.  It has just a single field `n` to hold the actual number.  But
> the AST objects also contain an attribute `_fields = ('n',)` that not
> only lists the *relevant* attributes, but also specifies an order.  It thus
> makes sense to introduce a convention that in `Num(x)` without argument
> name, the `x` corresponds to the first field `n`.  Likewise, you write
> `UnaryOp('+', item)` without the attribute names because
> `_fields=('op', 'operand')`
> already tells you what attributes are meant.  That is essentially the
> principle we adopted through introduction of `__match_args__`.
>
Makes sense (at least up to the last sentence – if that is the purpose, it
is not obvious to me that it should be called __match_args__).
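
As a sketch of the convention described in the quote: positional subpatterns are paired, in order, with the names in a tuple such as `_fields`. The function `bind_positional` and its `names_attr` parameter are invented for illustration; the PEP itself uses `__match_args__` for this.

```python
import ast


def bind_positional(cls, *subpatterns, names_attr="_fields"):
    """Pair positional subpatterns with attribute names, in order."""
    names = getattr(cls, names_attr)
    if len(subpatterns) > len(names):
        raise TypeError(
            f"{cls.__name__} accepts at most {len(names)} positional subpatterns"
        )
    return dict(zip(names, subpatterns))


# UnaryOp(P1, P2) is read as UnaryOp(op=P1, operand=P2).
binding = bind_positional(ast.UnaryOp, "P1", "P2")
print(binding)

# The same names are then used to pull the values out of a real node.
node = ast.parse("-x", mode="eval").body
values = {name: getattr(node, name) for name in binding}
print(type(values["op"]).__name__, type(values["operand"]).__name__)
```

In the real AST the operator is itself a node (here `ast.USub`) and not a string.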

>
> ***1. Match Protocol***
>
> I am not entirely sure what you mean by `C() == obj`.  In most cases you
> could not actually create an instance of `C` without some meaningful
> arguments for the constructor.
>
I mean exactly that – the case where, to match, the object needs to be
equal to C(). Constructing objects with no arguments is not uncommon at
all. Often it is an empty container, or in some sense the most basic form
of the object. Already in builtins, there are many examples: str(),
bytes(), dict(), int(), list(), tuple(), ...
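
A small sketch of the two readings of `Class()` side by side, using only builtins. The two function names are made up for illustration.

```python
def matches_by_isinstance(obj, cls):
    """Reading 1: Class() means isinstance(obj, Class)."""
    return isinstance(obj, cls)


def matches_by_equality(obj, cls):
    """Reading 2: Class() means obj == Class()."""
    return obj == cls()


samples = [("", str), ("abc", str), ([], list), ([1], list), (0, int), (0.0, int)]
for obj, cls in samples:
    print(
        f"{obj!r:>6} vs {cls.__name__}():",
        "isinstance =", matches_by_isinstance(obj, cls),
        "| equality =", matches_by_equality(obj, cls),
    )
```

The two readings disagree in both directions: `"abc"` is an instance of `str` but not equal to `str()`, while `0.0` equals `int()` without being an `int`.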

> The idea of the match-protocol is very similar to how you can already
> override the behaviour of `isinstance`.  It is not meant to completely
> change the semantics of what is already there, but to allow you to
> customise it (in some exciting ways ^_^).  Of course, as with everything
> customisable, you could go off and do something funny with it, but if it
> then breaks, that's quite on you.
>
Agreed.
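
For readers who have not used it, this is the existing hook being referred to: a metaclass can define `__instancecheck__` to customise `isinstance`. The classes `EvenMeta` and `Even` are toy examples invented here, not anything from the PEP.

```python
class EvenMeta(type):
    """Metaclass that customises isinstance() for its classes."""

    def __instancecheck__(cls, obj):
        # Treat every even int as an "instance" of the class.
        return isinstance(obj, int) and obj % 2 == 0


class Even(metaclass=EvenMeta):
    pass


print(isinstance(4, Even))    # True
print(isinstance(3, Even))    # False
print(isinstance("4", Even))  # False
```

If a class pattern is defined in terms of `isinstance`, a hook like this already changes what the pattern matches.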

> On the caveat that this is ***not part of this PEP (!)***, let me try and
> explain why we would consider a match protocol in the first place.  The
> standard example to consider is complex numbers.
>
[snipped complex numbers example – in short, I agree that a general match
protocol for "class patterns" should not enforce an isinstance check.]

[...]

––Koos

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BHAGN7CJULG6C3H44QTH54OOBDPQNCIA/
Code of Conduct: http://python.org/psf/codeofconduct/
