On 1/31/14 10:42 PM, Steven D'Aprano wrote:
On Fri, 31 Jan 2014 14:52:15 -0500, Ned Batchelder wrote:

Why can't we call __init__ the constructor and __new__ the allocator?

__new__ constructs the object, and __init__ initialises it. What's wrong
with calling them the constructor and initialiser? Is this such a
difficult concept that the average programmer can't learn it?

I've met people who have difficulty with OOP principles, at least at
first. But once you understand the idea of objects, it isn't that hard to
understand the idea that:

- first, the object has to be created, or constructed, or allocated
   if you will;

- only then can it be initialised.

Thus, two methods. __new__ constructs (creates, allocates) a new object;
__init__ initialises it after the event.
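
A minimal sketch of that two-step sequence, for anyone who wants to watch it happen. The class name Point and the calls list are made up for illustration; only __new__ and __init__ come from the discussion above.

```python
# Record the order in which Python invokes the two methods.
calls = []

class Point:
    def __new__(cls, x, y):
        calls.append("__new__")
        # object.__new__ creates the bare instance; it has no x or y yet.
        return super().__new__(cls)

    def __init__(self, x, y):
        calls.append("__init__")
        # The instance already exists here; we only fill it in.
        self.x = x
        self.y = y

p = Point(1, 2)
print(calls)
```

Calling Point(1, 2) runs __new__ first and then hands the object it returned to __init__ with the same arguments.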

(In hindsight, it was probably a mistake for Python to define two create-
an-object methods, although I expect it was deemed necessary for
historical reasons. Most other languages make do with a single method,
Objective-C being an exception with "alloc" and "init" methods.)



Earlier in this post, you wrote:

But that distinction [between __new__ and __init__] isn't useful in
most programs.

Well, I don't know about that. I guess it depends on what sort of objects
you're creating. If you're creating immutable objects, then the
distinction is vital. If you're subclassing from immutable built-ins, of
which there are a few, the distinction may be important. If you're using
the object-pool design pattern, the distinction is also vital. It's not
*rare* to care about these things.
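
Two of those cases can be sketched roughly as follows. Both classes (Magnitude and Colour) are hypothetical examples, not taken from any real library, and the pool here is deliberately simplistic.

```python
class Magnitude(float):
    """A float subclass that always stores a non-negative value."""
    def __new__(cls, value):
        # A float's value is fixed at creation, so it has to be
        # adjusted here; __init__ would be too late to change it.
        return super().__new__(cls, abs(value))

class Colour:
    """A very small object pool: one shared instance per name."""
    _pool = {}

    def __new__(cls, name):
        try:
            # Reuse the existing instance if there is one.
            return cls._pool[name]
        except KeyError:
            obj = super().__new__(cls)
            obj.name = name
            cls._pool[name] = obj
            return obj

m = Magnitude(-2.5)
red_a = Colour("red")
red_b = Colour("red")
print(m, red_a is red_b)
```

Neither effect can be had from __init__ alone: by the time it runs, the float's value is already fixed and a fresh instance has already been created.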


The thing most people mean by "constructor" is "the method that gets
invoked right at the beginning of the object's lifetime, where you can
add code to initialize it properly."  That describes __init__.

"Most people". I presume you've done a statistically valid survey then
*wink*

It *better* describes __new__, because it is *not true* that __init__
gets invoked "right at the beginning of the object's lifetime". Before
__init__ is invoked, the object's lifetime has already begun, inside the
call to __new__. Excluding metaclass shenanigans, the object lifetime
goes:


Prior to the object existing:
- static method __new__ called on the class[1]
- __new__ creates the object[2]  <=== start of object lifetime

Within the object's lifetime:
- the rest of the __new__ method runs, which may perform arbitrarily
   complex manipulations of the object;
- __new__ exits, returning the object
- __init__ runs


So __init__ does not occur *right at the beginning*, and it is completely
legitimate to write your classes using only __new__. You must use __new__
for immutable objects, and you may use __new__ for mutable ones. __init__
may be used by convention, but it is entirely redundant.
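
As a rough illustration of a mutable class written with __new__ alone (the Account class is invented for this example, and this is a sketch rather than a recommended style):

```python
class Account:
    # No __init__ at all: everything is set up inside __new__.
    def __new__(cls, owner, balance=0):
        self = super().__new__(cls)   # the object's lifetime starts here
        self.owner = owner            # the rest of __new__ manipulates it
        self.balance = balance
        return self

    def deposit(self, amount):
        # The instance is an ordinary mutable object afterwards.
        self.balance += amount

acct = Account("alice", 10)
acct.deposit(5)
print(acct.owner, acct.balance)
```

The inherited object.__init__ still gets called after __new__ returns, but it does nothing.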

I do not buy the argument made by some people that Python ought to follow
whatever (possibly inaccurate or misleading) terminology other languages
use. Java and Ruby have the exact same argument passing conventions as
Python, but one calls it "call by value" and the other "call by
reference", and neither matches the meaning of "call by value/reference"
as used by Pascal, C, Visual Basic, or other languages. So which
terminology should Python use? Both C++ and Haskell have "functors", but
they are completely different things. What Python calls a class method,
Java calls a static method. We could go on for days, just listing
differences in terminology.

In Python circles, using "constructor" for __new__ and "initialiser" for
__init__ is well established. In the context of Python, they make good
sense: __new__ creates ("constructs") the object, and __init__
_init_ialises it. Missing the opportunity to link the method name
__init__ to *initialise* would be a mistake.

We can decry the fact that computer science has not standardised on a
sensible set of names for concepts, but on the other hand since the
semantics of languages differ slightly, it would be more confusing to try
to force all languages to use the same words for slightly different
concepts.

The reality is, if you're coming to Python from another language, you're
going to have to learn a whole lot of new stuff anyway, so having to
learn a few language-specific terms is just a small incremental cost. And
if you have no idea about other languages, then it is no harder to learn
that __new__ / __init__ are the constructor/initialiser than it would be
to learn that they are the allocator/constructor or preformulator/
postformulator.

I care about using the right terminology that will cause the least amount
of cognitive dissonance to users' understanding of Python, not whether
they have to learn new terminology, and in the context of Python's object
model, "constructor" and "initialiser" best describe what __new__ and
__init__ do.


My summary of our two views is this: I am trying to look at things from a typical programmer's point of view. The existence of __new__ is an advanced topic that many programmers never encounter. Taking a quick scan through some large projects (Django, edX, SQLAlchemy, mako), the ratio of __new__ implementations to __init__ implementations ranges from 0% to 1.5%, which falls into "rare" territory for me. Among programs less than 5000 lines long, I'm sure the number is indistinguishable from 0, though I'm sure someone will question my methodology here as well! :)
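
For anyone who wants to repeat that kind of scan, one possible approach is to count method definitions with the ast module. This is only a sketch of one way to do it, not necessarily the method used for the numbers above; the helper count_methods and the sample source are made up for illustration.

```python
import ast

def count_methods(source):
    """Count definitions of __new__ and __init__ in Python source text."""
    counts = {"__new__": 0, "__init__": 0}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name in counts:
            counts[node.name] += 1
    return counts

# A tiny stand-in for a real code base.
sample = '''
class A:
    def __init__(self):
        pass

class B(tuple):
    def __new__(cls, *args):
        return super().__new__(cls, args)
    def __init__(self, *args):
        pass
'''

counts = count_methods(sample)
print(counts)
```

Running the same function over every .py file in a project and summing the results would give a ratio comparable to the ones quoted.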

You are looking at things from an accurate-down-to-the-last-footnote detailed point of view (and have provided some footnotes!). That's a very valuable and important point of view. It's just not how most programmers approach the language.

We are also both trying to reduce cognitive dissonance, but again, you are addressing language mavens who understand the footnotes, and I am trying to help the in-the-trenches people who have never encountered __new__ and are wondering why people are using funny words for the code they are writing.

Another difference in our approach: do you name things based on how they work under the hood, or how they are used? I hope we can all agree that when writing a user-defined class, the code that in C++ or Java would go into the constructor, in Python typically goes in __init__. When I say that __init__ plays the role of constructor, again, I mean from the typical programmer's point of view when writing typical user-defined classes.

Finding names for things is hard, and it's impossible to please both ends of this spectrum.

--
Ned Batchelder, http://nedbatchelder.com

--
https://mail.python.org/mailman/listinfo/python-list
