Good proposal! I have a few questions.

On Mon, Mar 15, 2021 at 2:22 PM Eric V. Smith <e...@trueblade.com> wrote:

> [I'm sort of loose with the terms field, parameter, and argument here.
> Forgive me: I think it's still understandable. Also I'm not specifying
> types here, I'm using Any everywhere. Use your imagination and
> substitute real types if it helps you.]
>
> Here's version 2 of my proposal:
>
> There have been many requests to add keyword-only fields to dataclasses.
> These fields would result in __init__ parameters that are keyword-only.
>
> In a previous proposal, I suggested also including positional arguments
> for dataclasses. That proposal is at
>
> https://mail.python.org/archives/list/python-ideas@python.org/message/I3RKK4VINZUBCGF2TBJN6HTDV3PVUEUQ/
> . After some discussion, I think it's clear that positional arguments
> aren't going to work well with dataclasses. The deal breaker for me is
> that the generated repr would either not work with eval(), or it would
> contain fields without names (since they're positional). There are
> additional concerns mentioned in that thread. Accordingly, I'm going to
> drop positional arguments from this proposal.
>
> Basically, I want to add a flag to each field, stating whether the field
> results in a normal parameter or a keyword-only parameter to __init__.
> Then when I'm generating __init__, I'll examine those flags and put the
> normal arguments first, followed by the keyword-only ones.
>
> The trick becomes: how do you specify what type of parameter each field
> represents?
>
>
> What attrs does
> ---------------
>
> First, here's what attrs does. There's a parameter to their attr.ib()
> function (the moral equivalent of dataclasses.field()) named kw_only,
> which if set, marks the field as being keyword-only. From
> https://www.attrs.org/en/stable/examples.html#keyword-only-attributes :
>
>  >>> @attr.s
> ... class A:
> ...     a = attr.ib(kw_only=True)
>  >>> A()
> Traceback (most recent call last):
>    ...
> TypeError: A() missing 1 required keyword-only argument: 'a'
>  >>> A(a=1)
> A(a=1)
>
> There's also a parameter to attr.s (the equivalent of
> dataclasses.dataclass), also named kw_only, which if true marks every
> field as being keyword-only:
>
>  >>> @attr.s(kw_only=True)
> ... class A:
> ...     a = attr.ib()
> ...     b = attr.ib()
>  >>> A(1, 2)
> Traceback (most recent call last):
>    ...
> TypeError: __init__() takes 1 positional argument but 3 were given
>  >>> A(a=1, b=2)
> A(a=1, b=2)
>
>
> dataclasses proposal
> --------------------
>
> I propose to adopt both of these methods (dataclass(kw_only=True) and
> field(kw_only=True)) in dataclasses. The above example would become:
>
>  >>> @dataclasses.dataclass
> ... class A:
> ...     a: Any = field(kw_only=True)
>
>  >>> @dataclasses.dataclass(kw_only=True)
> ... class A:
> ...     a: Any
> ...     b: Any
>
> But, I'd also like to make this a little easier to use, especially in
> the case where you're defining a dataclass that has some normal fields
> and some keyword-only fields. Using the attrs approach, you'd need to
> declare the keyword-only fields using the "=field(kw_only=True)" syntax,
> which I think is needlessly verbose, especially when you have many
> keyword-only fields.
>
> The problem is that if you have 1 normal parameter and 10 keyword-only
> ones, you'd be forced to say:
>
> @dataclasses.dataclass
> class LotsOfFields:
>      a: Any
>      b: Any = field(kw_only=True, default=0)
>      c: Any = field(kw_only=True, default='foo')
>      d: Any = field(kw_only=True)
>      e: Any = field(kw_only=True, default=0.0)
>      f: Any = field(kw_only=True)
>      g: Any = field(kw_only=True, default=())
>      h: Any = field(kw_only=True, default='bar')
>      i: Any = field(kw_only=True, default=3+4j)
>      j: Any = field(kw_only=True, default=10)
>      k: Any = field(kw_only=True)
>
> That's way too verbose for me.
>
> Ideally, I'd like something like this example:
>
> @dataclasses.dataclass
> class A:
>      a: Any
>      # pragma: KW_ONLY
>      b: Any
>
> And then b would become a keyword-only field, while a is a normal field.
> But we need some way of telling dataclasses.dataclass what's going on,
> since obviously pragmas are out.
>
> I propose the following. I'll add a singleton to the dataclasses module:
> KW_ONLY. When scanning the __annotations__ that define the fields, a
> field with this type would be ignored, except for assigning the kw_only
> flag to fields declared after these singletons are used. So you'd get:
>
> @dataclasses.dataclass
> class B:
>      a: Any
>      _: dataclasses.KW_ONLY
>      b: Any
>
> This would generate:
>
> def __init__(self, a, *, b):
>
> This example is equivalent to:
>
> @dataclasses.dataclass
> class B:
>      a: Any
>      b: Any = field(kw_only=True)
>
> The name of the KW_ONLY field doesn't matter, since it's discarded. I
> think _ is a fine name, and '_: dataclasses.KW_ONLY' would be the
> pythonic way of saying "the following fields are keyword-only".
>
> My example above would become:
>
> @dataclasses.dataclass
> class LotsOfFields:
>      a: Any
>      _: dataclasses.KW_ONLY
>      b: Any = 0
>      c: Any = 'foo'
>      d: Any
>      e: Any = 0.0
>      f: Any
>      g: Any = ()
>      h: Any = 'bar'
>      i: Any = 3+4j
>      j: Any = 10
>      k: Any
>
> Which I think is a lot clearer.
>
> The generated __init__ would look like:
>
> def __init__(self, a, *, b=0, c='foo', d, e=0.0, f, g=(), h='bar',
> i=3+4j, j=10, k):
>
> The idea is that all normal argument fields would appear first in the
> class definition, then all keyword-only argument fields. This is the same
> requirement as in a function definition. There would be no switching
> back and forth between the two types of fields: once you use KW_ONLY,
> all subsequent fields are keyword-only. A field of type KW_ONLY can
> appear only once in a particular dataclass (but see the discussion below
> about inheritance).
>
>
> Re-ordering args in __init__
> ----------------------------
>
> If, using field(kw_only=True), you specify keyword-only fields before
> non-keyword-only fields, all of the keyword-only fields will be moved to
> the end of the __init__ argument list. Within the list of
> non-keyword-only arguments, all arguments will keep the same relative
> order as in the class definition. Ditto for within keyword-only arguments.
>
> So:
>
> @dataclasses.dataclass
> class C:
>      a: Any
>      b: Any = field(kw_only=True)
>      c: Any
>      d: Any = field(kw_only=True)
>
> Then the generated __init__ will look like:
>
> def __init__(self, a, c, *, b, d):
>
> __init__ is the only place where this rearranging will take place.
> Everywhere else, and importantly in __repr__ and any dunder comparison
> methods, the order will be the same as it is now: in field declaration
> order.
>

Can you be specific and show what the repr() would be? E.g. if I create
C(1, 2, b=3, d=4), the repr() would be C(a=1, b=3, c=2, d=4), right?
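
To make the question concrete, here is a minimal sketch of the rule as
described above: __init__ puts normal fields first and keyword-only fields
after the '*', while the repr keeps declaration order. The (name, kw_only)
tuples and the helper names init_params and repr_of are hypothetical
stand-ins for illustration, not part of dataclasses.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a dataclass field: (name, kw_only flag).
FieldSpec = Tuple[str, bool]


def init_params(fields: List[FieldSpec]) -> str:
    """Build the __init__ parameter list: normal fields, then kw-only ones."""
    normal = [name for name, kw_only in fields if not kw_only]
    kw = [name for name, kw_only in fields if kw_only]
    params = ["self"] + normal
    if kw:
        params += ["*"] + kw
    return ", ".join(params)


def repr_of(cls_name: str, fields: List[FieldSpec], values: Dict[str, object]) -> str:
    """Build the repr in declaration order, ignoring the kw_only flag."""
    body = ", ".join(f"{name}={values[name]!r}" for name, _ in fields)
    return f"{cls_name}({body})"


# Class C from the quoted example: a, b (kw-only), c, d (kw-only).
C_FIELDS = [("a", False), ("b", True), ("c", False), ("d", True)]

print(init_params(C_FIELDS))  # self, a, c, *, b, d
# C(1, 2, b=3, d=4) binds a=1, c=2, b=3, d=4.
print(repr_of("C", C_FIELDS, {"a": 1, "c": 2, "b": 3, "d": 4}))  # C(a=1, b=3, c=2, d=4)
```

The positional call order (a, c) and the repr order (a, b, c, d) differ,
which is exactly where it is easy to slip.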


> This is the same behavior that attrs uses.
>

Nevertheless, I made several typos while trying to get the example in my
sentence above right. Perhaps we could instead disallow mixing kw-only
and regular args? Do you know why attrs does it this way?
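
As a sketch of what disallowing the mix could look like: a check that
rejects any normal field declared after a keyword-only one. The function
name check_no_mixing and the (name, kw_only) tuples are hypothetical and
only illustrate the stricter rule; nothing like this is in the proposal.

```python
from typing import List, Tuple


def check_no_mixing(fields: List[Tuple[str, bool]]) -> None:
    """Raise TypeError if a normal field is declared after a kw-only one."""
    last_kw_only = None
    for name, kw_only in fields:
        if kw_only:
            last_kw_only = name
        elif last_kw_only is not None:
            raise TypeError(
                f"normal field {name!r} follows keyword-only field {last_kw_only!r}"
            )


# Accepted: all normal fields come first.
check_no_mixing([("a", False), ("c", False), ("b", True), ("d", True)])

# Rejected: the layout of class C from the quoted example.
try:
    check_no_mixing([("a", False), ("b", True), ("c", False), ("d", True)])
except TypeError as exc:
    print(exc)
```

Under this rule the declaration order, the __init__ order and the repr
order would always agree.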


> Inheritance
> -----------
>
> There are a few additional quirks involving inheritance, but the
> behavior would follow naturally from how dataclasses already handles
> fields via inheritance and the __init__ argument re-ordering discussed
> above. Basically, all fields in a derived class are computed like they
> are today. Then any __init__ argument re-ordering will take place, as
> discussed above.
>
> Consider:
>
> @dataclasses.dataclass(kw_only=True)
> class D:
>      a: Any
>
> @dataclasses.dataclass
> class E(D):
>      b: Any
>
> @dataclasses.dataclass(kw_only=True)
> class F(E):
>      c: Any
>
> This will result in the __init__ signature of:
>
> def __init__(self, b, *, a, c):
>
> However, the repr() will still produce the fields in order a, b, c.
> Comparisons will also use the same order.
>

This can be simulated by flattening the inheritance tree and adding
explicit field(kw_only=True) to all fields of classes using kw_only=True in
the class decorator as well as all fields affected by _: KW_ONLY, right? So
the above would behave like this:

@dataclasses.dataclass
class F:
    a: Any = field(kw_only=True)
    b: Any
    c: Any = field(kw_only=True)

which IIUC indeed gives the same __init__ signature and repr().


> Conclusion
> ----------
>
> Remember, the only point of all of these hoops is to add a flag to each
> field saying what type of __init__ argument it becomes: normal or
> keyword-only. Any of the 3 methods discussed above (kw_only flag to
> @dataclass(), kw_only flag to field(), or the KW_ONLY marker) all have
> the same result: setting the kw_only flag on one or more fields.
>
> The value of that flag, on a per-field basis, is used to re-order
> __init__ arguments, and is used in generating the __init__ signature.
> It's not used anywhere else.
>
> I expect the two most common use cases to be the kw_only flag to
> @dataclass() and the KW_ONLY marker. I would expect the usage of the
> kw_only flag on field() to be rare, but since it's the underlying
> mechanism and it's needed for more complex field layouts, it is included
> in this proposal.
>
> So, what do you think? Is this a horrible idea? Should it be a PEP, or
> just a 'simple' feature addition to dataclasses? I'm worried that if I
> have to do a full blown PEP I won't get to this for 3.10.
>

I don't think it is very controversial, do you? Then again, maybe you should
ask an SC member if they would object.

> mypy and other type checkers would need to be taught about all of this.
>

Yeah, that's true. But the type checkers have bigger fish to fry (e.g.
pattern matching).

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/AZN66ZACAH6BGX2OGDQI7GV3X6SETRUP/
Code of Conduct: http://python.org/psf/codeofconduct/