Good proposal! I have a few questions.

On Mon, Mar 15, 2021 at 2:22 PM Eric V. Smith <e...@trueblade.com> wrote:

> [I'm sort of loose with the terms field, parameter, and argument here.
> Forgive me: I think it's still understandable. Also I'm not specifying
> types here, I'm using Any everywhere. Use your imagination and
> substitute real types if it helps you.]
>
> Here's version 2 of my proposal:
>
> There have been many requests to add keyword-only fields to dataclasses.
> These fields would result in __init__ parameters that are keyword-only.
>
> In a previous proposal, I suggested also including positional arguments
> for dataclasses. That proposal is at
> https://mail.python.org/archives/list/python-ideas@python.org/message/I3RKK4VINZUBCGF2TBJN6HTDV3PVUEUQ/ .
> After some discussion, I think it's clear that positional arguments
> aren't going to work well with dataclasses. The deal breaker for me is
> that the generated repr would either not work with eval(), or it would
> contain fields without names (since they're positional). There are
> additional concerns mentioned in that thread. Accordingly, I'm going to
> drop positional arguments from this proposal.
>
> Basically, I want to add a flag to each field, stating whether the field
> results in a normal parameter or a keyword-only parameter to __init__.
> Then when I'm generating __init__, I'll examine those flags and put the
> normal arguments first, followed by the keyword-only ones.
>
> The trick becomes: how do you specify what type of parameter each field
> represents?
>
>
> What attrs does
> ---------------
>
> First, here's what attrs does. There's a parameter to their attr.ib()
> function (the moral equivalent of dataclasses.field()) named kw_only,
> which if set, marks the field as being keyword-only. From
> https://www.attrs.org/en/stable/examples.html#keyword-only-attributes :
>
>     >>> @attr.s
>     ... class A:
>     ...     a = attr.ib(kw_only=True)
>     >>> A()
>     Traceback (most recent call last):
>       ...
>     TypeError: A() missing 1 required keyword-only argument: 'a'
>     >>> A(a=1)
>     A(a=1)
>
> There's also a parameter to attr.s (the equivalent of
> dataclasses.dataclass), also named kw_only, which if true marks every
> field as being keyword-only:
>
>     >>> @attr.s(kw_only=True)
>     ... class A:
>     ...     a = attr.ib()
>     ...     b = attr.ib()
>     >>> A(1, 2)
>     Traceback (most recent call last):
>       ...
>     TypeError: __init__() takes 1 positional argument but 3 were given
>     >>> A(a=1, b=2)
>     A(a=1, b=2)
>
>
> dataclasses proposal
> --------------------
>
> I propose to adopt both of these methods (dataclass(kw_only=True) and
> field(kw_only=True)) in dataclasses. The above example would become:
>
>     >>> @dataclasses.dataclass
>     ... class A:
>     ...     a: Any = field(kw_only=True)
>
>     >>> @dataclasses.dataclass(kw_only=True)
>     ... class A:
>     ...     a: Any
>     ...     b: Any
>
> But, I'd also like to make this a little easier to use, especially in
> the case where you're defining a dataclass that has some normal fields
> and some keyword-only fields. Using the attrs approach, you'd need to
> declare the keyword-only fields using the "=field(kw_only=True)" syntax,
> which I think is needlessly verbose, especially when you have many
> keyword-only fields.
>
> The problem is that if you have 1 normal parameter and 10 keyword-only
> ones, you'd be forced to say:
>
>     @dataclasses.dataclass
>     class LotsOfFields:
>         a: Any
>         b: Any = field(kw_only=True, default=0)
>         c: Any = field(kw_only=True, default='foo')
>         d: Any = field(kw_only=True)
>         e: Any = field(kw_only=True, default=0.0)
>         f: Any = field(kw_only=True)
>         g: Any = field(kw_only=True, default=())
>         h: Any = field(kw_only=True, default='bar')
>         i: Any = field(kw_only=True, default=3+4j)
>         j: Any = field(kw_only=True, default=10)
>         k: Any = field(kw_only=True)
>
> That's way too verbose for me.
>
> Ideally, I'd like something like this example:
>
>     @dataclasses.dataclass
>     class A:
>         a: Any
>         # pragma: KW_ONLY
>         b: Any
>
> And then b would become a keyword-only field, while a is a normal field.
> But we need some way of telling dataclasses.dataclass what's going on,
> since obviously pragmas are out.
>
> I propose the following. I'll add a singleton to the dataclasses module:
> KW_ONLY. When scanning the __annotations__ that define the fields, a
> field with this type would be ignored, except for assigning the kw_only
> flag to fields declared after these singletons are used. So you'd get:
>
>     @dataclasses.dataclass
>     class B:
>         a: Any
>         _: dataclasses.KW_ONLY
>         b: Any
>
> This would generate:
>
>     def __init__(self, a, *, b):
>
> This example is equivalent to:
>
>     @dataclasses.dataclass
>     class B:
>         a: Any
>         b: Any = field(kw_only=True)
>
> The name of the KW_ONLY field doesn't matter, since it's discarded. I
> think _ is a fine name, and '_: dataclasses.KW_ONLY' would be the
> pythonic way of saying "the following fields are keyword-only".
>
> My example above would become:
>
>     @dataclasses.dataclass
>     class LotsOfFields:
>         a: Any
>         _: dataclasses.KW_ONLY
>         b: Any = 0
>         c: Any = 'foo'
>         d: Any
>         e: Any = 0.0
>         f: Any
>         g: Any = ()
>         h: Any = 'bar'
>         i: Any = 3+4j
>         j: Any = 10
>         k: Any
>
> Which I think is a lot clearer.
>
> The generated __init__ would look like:
>
>     def __init__(self, a, *, b=0, c='foo', d, e=0.0, f, g=(), h='bar',
>                  i=3+4j, j=10, k):
>
> The idea is that all normal argument fields would appear first in the
> class definition, then all keyword argument fields. This is the same
> requirement as in a function definition. There would be no switching
> back and forth between the two types of fields: once you use KW_ONLY,
> all subsequent fields are keyword-only. A field of type KW_ONLY can
> appear only once in a particular dataclass (but see the discussion below
> about inheritance).
>
>
> Re-ordering args in __init__
> ----------------------------
>
> If, using field(kw_only=True), you specify keyword-only fields before
> non-keyword-only fields, all of the keyword-only fields will be moved to
> the end of the __init__ argument list. Within the list of
> non-keyword-only arguments, all arguments will keep the same relative
> order as in the class definition. Ditto for within keyword-only arguments.
>
> So:
>
>     @dataclasses.dataclass
>     class C:
>         a: Any
>         b: Any = field(kw_only=True)
>         c: Any
>         d: Any = field(kw_only=True)
>
> Then the generated __init__ will look like:
>
>     def __init__(self, a, c, *, b, d):
>
> __init__ is the only place where this rearranging will take place.
> Everywhere else, and importantly in __repr__ and any dunder comparison
> methods, the order will be the same as it is now: in field declaration
> order.
>

Can you be specific and show what the repr() would be? E.g. if I create
C(1, 2, b=3, d=4), the repr() would be C(a=1, b=3, c=2, d=4), right?

> This is the same behavior that attrs uses.
>

Nevertheless, I made several typos trying to get the examples in my sentence
above correct. Perhaps we could instead disallow mixing kw-only and regular
args? Do you know why attrs does it this way?

> Inheritance
> -----------
>
> There are a few additional quirks involving inheritance, but the
> behavior would follow naturally from how dataclasses already handles
> fields via inheritance and the __init__ argument re-ordering discussed
> above. Basically, all fields in a derived class are computed like they
> are today. Then any __init__ argument re-ordering will take place, as
> discussed above.
>
> Consider:
>
>     @dataclasses.dataclass(kw_only=True)
>     class D:
>         a: Any
>
>     @dataclasses.dataclass
>     class E(D):
>         b: Any
>
>     @dataclasses.dataclass(kw_only=True)
>     class F(E):
>         c: Any
>
> This will result in the __init__ signature of:
>
>     def __init__(self, b, *, a, c):
>
> However, the repr() will still produce the fields in order a, b, c.
> Comparisons will also use the same order.
>

This can be simulated by flattening the inheritance tree and adding an
explicit field(kw_only=True) to all fields of classes that use kw_only=True
in the class decorator, as well as to all fields affected by _: KW_ONLY,
right? So the above would behave like this:

    @dataclasses.dataclass
    class F:
        a: Any = field(kw_only=True)
        b: Any
        c: Any = field(kw_only=True)

which IIUC indeed gives the same __init__ signature and repr().

> Conclusion
> ----------
>
> Remember, the only point of all of these hoops is to add a flag to each
> field saying what type of __init__ argument it becomes: normal or
> keyword-only. Any of the 3 methods discussed above (kw_only flag to
> @dataclass(), kw_only flag to field(), or the KW_ONLY marker) all have
> the same result: setting the kw_only flag on one or more fields.
>
> The value of that flag, on a per-field basis, is used to re-order
> __init__ arguments, and is used in generating the __init__ signature.
> It's not used anywhere else.
>
> I expect the two most common use cases to be the kw_only flag to
> @dataclass() and the KW_ONLY marker. I would expect the usage of the
> kw_only flag on field() to be rare, but since it's the underlying
> mechanism and it's needed for more complex field layouts, it is included
> in this proposal.
>
> So, what do you think? Is this a horrible idea? Should it be a PEP, or
> just a 'simple' feature addition to dataclasses? I'm worried that if I
> have to do a full blown PEP I won't get to this for 3.10.
>

I don't think it is very controversial, do you? Then again, maybe you should
ask an SC member if they would object.

> mypy and other type checkers would need to be taught about all of this.
>

Yeah, that's true. But the type checkers have bigger fish to fry (e.g.
pattern matching).

--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>