[Python-ideas] Re: Auto assignment of attributes

2022-04-30 Thread Christopher Barker
On Sat, Apr 30, 2022 at 6:40 PM Steven D'Aprano wrote:

> On Sat, Apr 23, 2022 at 12:11:07PM -0700, Christopher Barker wrote:
> > Absolutely. However, this is not an "all Classes" question.
>
> Isn't it? I thought this was a proposal to allow any class to partake in
> the dataclass autoassignment feature.
>

no -- it's about only a small part of that.


> > I don't think of dataclasses as "mutable namedtuples with defaults" at
> > all.
> What do you think of them as?
>

I answered that in the next line, that you quote.

>
> > But do think they are for classes that are primarily about storing a
> > defined set of data.
>
> Ah, mutable named tuples, with or without defaults? :-)
>

Well, no -- the key is that you can add other methods to them, and produce
all sorts of varyingly complex functionality. I have done that myself.
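
To make that concrete, here is a minimal sketch of a dataclass that is mostly
about storing data but still carries its own behaviour. The class and method
names (Track, add_point, length) are hypothetical and only for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    """Hypothetical example: stores a name and points, plus some behaviour."""

    name: str
    points: list = field(default_factory=list)

    def add_point(self, x, y):
        # An ordinary method: a dataclass is a regular class underneath.
        self.points.append((x, y))

    def length(self):
        # Sum of straight-line distances between consecutive points.
        return sum(
            ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
            for (x1, y1), (x2, y2) in zip(self.points, self.points[1:])
        )
```

The generated __init__, __eq__ and __repr__ still come for free; the methods
are just added on top.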


> Or possibly records/structs.
>

nope, nope, and nope.

But anyway, the rest of my post was the real point, and we're busy arguing
semantics here.

-CHB

-- 
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FASJHCH5OV453YGZAFAJ5GHGKS45MVII/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Auto assignment of attributes

2022-04-30 Thread Steven D'Aprano
On Sat, Apr 23, 2022 at 12:11:07PM -0700, Christopher Barker wrote:
> On Sat, Apr 23, 2022 at 10:53 AM Pablo Alcain wrote:
> 
> > Overall, I think that not all Classes can be thought of as Dataclasses
> > and, even though dataclasses solutions have their merits, they probably
> > cannot be extended to most of the other classes.
> >
> 
> Absolutely. However, this is not an "all Classes" question.

Isn't it? I thought this was a proposal to allow any class to partake in 
the dataclass autoassignment feature.

(Not necessarily the implementation.)


> I don't think of dataclasses as "mutable namedtuples with defaults" at all.

What do you think of them as?


> But do think they are for classes that are primarily about storing a
> defined set of data.

Ah, mutable named tuples, with or without defaults? :-)

Or possibly records/structs.


-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/T2AWTW54AW5SNJSNDCZ6YNK2T6QWNLQT/


[Python-ideas] Re: Auto assignment of attributes

2022-04-30 Thread Pablo Alcain
On Sat, Apr 23, 2022, 1:11 PM Christopher Barker wrote:

> On Sat, Apr 23, 2022 at 10:53 AM Pablo Alcain wrote:
>
>> Overall, I think that not all Classes can be thought of as Dataclasses
>> and, even though dataclasses solutions have their merits, they probably
>> cannot be extended to most of the other classes.
>>
>
> Absolutely. However, this is not an "all Classes" question.
>
> I don't think of dataclasses as "mutable namedtuples with defaults" at all.
>

Although I agree that dataclasses have definitely grown beyond this scope,
the definition of “mutable namedtuples with defaults” comes from the
original PEP (https://peps.python.org/pep-0557/#abstract). The main point
here is that there are several use cases for classes that do not
conceptually fit the “dataclass” goal.


> But do think they are for classes that are primarily about storing a
> defined set of data.
>
> I make heavy use of them for this, when I am adding quite a bit of
> functionality, but their core function is still to store a collection of
> data. To put it less abstractly:
>
> Dataclasses are good for classes in which the collection of fields is a
> primary focus -- so the auto-generated __init__, __eq__ etc are appropriate.
>
> It's kind of a recursive definition: dataclasses work well for those
> things that data classes' auto generated methods work well for :-)
>
> If, indeed, you need a lot of custom behavior for the __init__, and
> __eq__, and ... then dataclasses are not for you.
>

I agree 100%. This proposal, at its core, is not related to dataclasses.
There are some cases in which dataclasses are the solution, but there are
many, many times in which you will want to use just classes.

>
> And the current Python class system is great for fully customized
> behaviour. It's quite purposeful that parameters of the __init__ have no
> special behavior, and that "self" is explicit -- it gives you full
> flexibility, and everything is explicit. That's a good thing.
>
> But, of course, the reason this proposal is on the table (and it's not the
> first time by any means) is that it's a common pattern to assign (at least
> some of) the __init__ parameters to instance attributes as is.
>
> So we have two extremes -- on one hand:
>
> A) Most __init__ params are assigned as instance attributes as is, and
> these are primarily needed for __eq__ and __repr__
>
> and on the other extreme:
>
> B) Most __init__ params need specialized behavior, and are quite distinct
> from what's needed by __eq__ and __repr__
>
> (A) is, of course, the entire point of dataclasses, so that's covered.
>
> (B) is well covered by the current, you-need-to-specify-everything
> approach.
>

I don’t see B as an “extreme”. I think that comparing Python classes with
the specific case of dataclasses is not helpful. The B scenario is
simply the general case for class usage. Scenario A, I agree, is a very
common one and fortunately we have dataclasses for it.
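
As a rough illustration of the two scenarios, here is a minimal sketch. The
class names (Point, Connection) and the default timeout are hypothetical,
chosen only for the example:

```python
from dataclasses import dataclass


# Scenario A: the fields are the point of the class, so the generated
# __init__, __eq__ and __repr__ are all appropriate.
@dataclass
class Point:
    x: float
    y: float


# General case: some parameters are stored as-is, one needs processing,
# and the default identity-based __eq__ is the desired behaviour.
class Connection:
    def __init__(self, host, port, timeout=None):
        self.host = host  # stored as-is
        self.port = port  # stored as-is
        # Needs custom handling, so it is not a plain assignment.
        self.timeout = 30.0 if timeout is None else float(timeout)
```

In the second class, two of the three assignments are the plain
"parameter to attribute" pattern, while nothing else about the class looks
like a dataclass.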


> So the question is -- how common is it that you have code that's far
> enough toward the (A) extreme as far as __init__ params being instance
> attributes that we want special syntax, when we don't want most of the
> __eq__ and __repr__ behaviour.
>

I agree that this is the main question. For what it’s worth, a quick grep
on the stdlib (it’s an overestimate) gives:

$ grep -Ie "self\.\(\w\+\) = \1" -r cpython/Lib | wc -l
2095

I did the same in two libraries that I use regularly: pandas and
scikit-learn:

$ grep -Ie "self\.\(\w\+\) = \1" -r sklearn | wc -l
1786

$ grep -Ie "self\.\(\w\+\) = \1" -r pandas | wc -l
650

That’s a total of ~4.5k lines of code (again, this is an overestimate,
but it gives us a ballpark figure).
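
For anyone without GNU grep at hand, the same count can be sketched in
Python. This is only an approximation of the grep above; the STRICT pattern
and the sample lines are my own additions for illustration, and the missing
word boundary is one plausible reason why the grep overestimates:

```python
import re

# Same idea as the grep pattern: "self.<word> = " followed by the same word.
# Like the grep, there is no word boundary after the back-reference, so
# "self.y = y_raw" and "self.z = z.strip()" also match.
LOOSE = re.compile(r"self\.(\w+) = \1")
# Stricter variant (an assumption, not part of the grep): the right-hand
# side must be exactly the same name and nothing else.
STRICT = re.compile(r"self\.(\w+) = \1\s*$")


def count_matches(lines, pattern):
    """Count the lines that contain at least one match of `pattern`."""
    return sum(1 for line in lines if pattern.search(line))


sample = [
    "        self.x = x",
    "        self.y = y_raw",
    "        self.z = z.strip()",
    "        self.w = compute(w)",
]
print(count_matches(sample, LOOSE), count_matches(sample, STRICT))
```

On these four sample lines the loose pattern counts three and the strict
one counts a single line.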

For a better and more fine-grained analysis, Quimey wrote this small
library (https://github.com/quimeyps/analize_autoassign) that uses the
Abstract Syntax Tree to analyze a bunch of libraries and identify when the
“autoassign” could work. It shows that out of 20k analyzed classes in the
selected libraries (including black, pandas, numpy, etc), ~17% of them
could benefit from the usage of auto-assign syntax.
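
To give an idea of what an AST-based check can look like, here is a minimal
sketch using the standard ast module. It is not the code from the library
linked above; the function names and the exact criteria (only top-level
statements of __init__, only plain `self.<param> = <param>`) are assumptions
made for this example:

```python
import ast


def autoassign_candidates(source):
    """Map each class name to the __init__ params assigned to self as-is."""
    result = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for item in node.body:
            if isinstance(item, ast.FunctionDef) and item.name == "__init__":
                result[node.name] = _assigned_params(item)
    return result


def _assigned_params(init):
    positional = init.args.posonlyargs + init.args.args
    if not positional:
        return []
    self_name = positional[0].arg  # usually "self"
    params = {a.arg for a in positional[1:] + init.args.kwonlyargs}
    found = []
    for stmt in init.body:  # top-level statements only
        if not (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1):
            continue
        target, value = stmt.targets[0], stmt.value
        if (
            isinstance(target, ast.Attribute)
            and isinstance(target.value, ast.Name)
            and target.value.id == self_name
            and isinstance(value, ast.Name)
            and value.id == target.attr
            and value.id in params
        ):
            found.append(value.id)
    return found


sample = '''
class A:
    def __init__(self, x, y, z=None):
        self.x = x
        self.y = y
        self.z = z or []
'''
print(autoassign_candidates(sample))
```

Unlike the grep, this only counts assignments where the right-hand side is
exactly a parameter of __init__, so `self.z = z or []` is not included.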

So it looks like the isolated pattern of `self.<name> = <name>`
is used a lot. I don’t think that moving all of these cases to dataclasses
would be a meaningful solution. When I look at these numbers (and
reflect on my own experience and that of my colleagues), it looks like
there is a use case for this feature. The syntax change looks small and
fairly clean, and it adds no boilerplate. But, obviously, it needs
further discussion of whether it makes sense to add new syntax for this,
considering the maintenance that it implies.


> In my experience, not all that much -- my code tends to be on one extreme
> or the other.
>
> But I think that's the case that needs to be made -- that there's a lot of
> use cases for auto-assigning instance attributes, that also need
> highly customized behaviour for other attributes and __eq__ and __repr__.
>
> NOTE: another key