On Nov 6, 2008, at 10:35 PM, Steve Holden wrote:

That's good to hear. Your arguments are sometimes pretty good, and
usually well made, but there's been far too much insistence on all sides
about being right and not enough on reaching agreement about how
Python's well-defined semantics for assignment and function calling
should best be described.

In other words, it's a classic communication problem.

That's a fair point.  I'll try to do better.

I must say I find it strange when people try to contradict my assertion that Python names are references to objects, when the (no pun intended)
reference implementation of the language uses "reference counting" to
track how many assignments have been made.

I agree.  It seems like we should be able to take that as a given.
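
As a rough illustration of that point, here is a minimal sketch using sys.getrefcount, which is specific to the CPython reference implementation. The names a and b are made up for the example, and the exact counts are an implementation detail, so only the change is checked:

```python
import sys

a = []                        # a fresh list object
before = sys.getrefcount(a)   # exact value is a CPython implementation detail
b = a                         # bind a second name to the same object
after = sys.getrefcount(a)

print(a is b)                 # both names refer to one object
print(after > before)         # the extra name shows up in the count on CPython
```

The count going up when a second name is bound is at least consistent with the view that names hold references to objects.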

So any argument that the language "doesn't have the concept of object
reference (in the sense of e.g. C++ reference)" is simply stating the
obvious: that Python has no way to declare reference variables. I would
argue myself that it has no need of such a mechanism precisely because
names are object references, and I'd like to hear counter-arguments.

Right. I think of it this way: every variable is an object reference; no special syntax needed for it because that's the only type of variable there is. (Just as with Java or .NET, when dealing with any class type; Python is just a little more extreme in that even simple things like numbers are wrapped in objects.)

Note: I tried to say "name" above instead of "variable" but I couldn't bring myself to do it -- "name" seems too generic to do that job. Lots of things have names that are not variables: modules have names, classes have names, methods have names, and so do variables. If I say "name," an astute listener would reasonably say "name of what" -- and I don't want to have to say "name of some thing in a name space which can be flexibly associated with an object" when the simple term "variable" seems to work as well.

Well that's not true either. If I remember all the way back to my
computational science degree I seem to remember being taught that there was call by *simple reference*, which is what I understand you to mean.
Suppose I write the following in some not-quite-Python language:

lst = ['one', 'two', 'three']

index = 1

def foo(item, i):
  i = 2
  item = "ouch"

foo(lst[index], index)
...
With call by simple reference, after the call I would expect the
following conditions to be true:

index == 2
lst == ['one', 'ouch', 'three']

Yes, I guess so, though it would require that lst[index] evaluate to an lvalue to which the 'item' parameter could be an alias. (With the second parameter, 'i', the situation is more straightforward because you're passing in a simple variable rather than a more complex expression.)

With full call by reference, however, arguably the change to the value
of index would induce the post-conditions

index == 2
lst == ['one', 'two', 'ouch']

because the reference made by the first argument depends on the value of
a variable mutated inside the function call.

I confess that I've never heard of "call by simple reference" or "call by full reference" before. What you're describing in the second case sounds more like call by name to me.

But I think we can agree that neither of these behaviors describes Python.
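
For comparison, here is a minimal sketch of what the same snippet does when run as actual Python. The code is the example quoted above; only the two print calls at the end are added. Rebinding the parameters inside foo leaves the caller's variables untouched, so neither set of post-conditions holds:

```python
lst = ['one', 'two', 'three']
index = 1

def foo(item, i):
    # These assignments rebind the local names only.
    i = 2
    item = "ouch"

foo(lst[index], index)

print(index)  # 1
print(lst)    # ['one', 'two', 'three']
```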

Why the resistance to these simple and basic terms that apply to any OOP
language?

Ideally I'd like to see this discussion concluded without resorting to
democratic appeals. Otherwise, after all, we should all eat shit: sixty
billion flies can't possibly be wrong.

I think I could make a good argument that the nutritional needs of flies are different from those of humans. On the other hand, what argument is there that the Python community should use its own unique terminology for concepts that apply equally well to other languages? Wouldn't communication be easier and smoother if we adopted standard terms for standard behavior?

What does "give a new name to an object" mean? I submit that it means
exactly the same thing as "assigns the name to refer to the object".

I normally internalize "x = 3" as meaning "store a reference to the
object 3 in the slot named x", and when I see "x" in an expression I
understand it to be a reference to some object, and that the value will
be used after dereferencing has taken place.

Works for me.

I've seen various descriptions of Python's name binding behavior in
terms of attaching Post-it notes bearing names to the objects referenced
by the names, and I have never found them convincing. The reason for
this is that names live in namespaces, whereas values live in some other
universe altogether (that I normally describe as "object space" to
beginners, though this is not a term you will come across in the Python
literature).

Agreed. That model implies that all names are global, and completely fails to explain how one object might be named "x" and a completely different object might also be "x" (albeit in a different namespace). I suppose your post-its could be color-coded by namespace, and then you could add additional warts and caveats and addendums to explain recursion, or explain why you don't have to search all objects in existence to find the right one every time a name is dereferenced, but the whole thing seems like a house of cards to me.
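
A small sketch of the namespace point, assuming nothing beyond ordinary function scoping (the function f and the strings are made up for the example). The same name x refers to two different objects, one per namespace:

```python
x = "global object"

def f():
    x = "local object"   # a different x, living in f's local namespace
    return x

local = f()

print(x)       # global object
print(local)   # local object
```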

So I see the Post-it as being attached to a portion of some
namespace, and that little fixed-size piece of object space being
attached by a piece of string to a specific object. Of course any object
can have many pieces of string attached, and not all of them come from
names -- some of them come from container elements, for example.

Right.
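
The "many pieces of string" picture can be sketched like this (obj, alias, and box are hypothetical names chosen for the example). Two names and two container elements all lead to the same object, so a change made through one is visible through the others:

```python
obj = ['shared']
alias = obj          # a second name attached to the same object
box = [obj, obj]     # container elements hold references to it too

obj.append('changed')

print(alias)                            # ['shared', 'changed']
print(box[0] is obj and box[1] is obj)  # True
```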

There certainly is no difference in behavior that anyone has been able to point out between what assignment does in Python, and what assignment does in RB, VB.NET, Java, or C++ (in the context of object pointers, of
course).  If the behavior is the same, why should we make up our own
unique and different terminology for it?

One reason would be that in the other languages you have other choices
as well, so you need to distinguish between them. Python is simpler, and so I don't see us needing the terminological complexity required in the
other contexts you name, for a start.

OK, that's a fair argument, and I do suspect this is a big part of it -- when your language clearly supports passing object references and other types by-ref and by-val, and you can easily demonstrate the difference, then there is little temptation to claim that it doesn't do either one. But if your language supports only one of these, and you have no choices about it and can't (within the language itself) compare and contrast that one against another, then it is easy to make all sorts of claims about what that one is.

But getting back to your point: is the standard terminology really more complex than whatever else we can come up with?

Java messed up the whole deal by
having different kinds of objects as a sacrifice to run-time speed,
thereby breeding a whole generation of programmers with little clue
about these matters, and the .NET environment also has to resort to
"boxing" and "unboxing" from time to time. I say away with comparisons
to such horrendously complex issues. One of the reasons for Python's
continued march towards world domination (allow me my fantasies) is its
consistent simplicity. Those last two words would be my candidate for
the definition of "Pythonicity".

I'm with you there. To me, the consistent simplicity is exactly this: all variables are object references, and these are always passed by value.

- the parameters of a function are local names for the call arguments

Agreed; they're not aliases of the call arguments.

They are actually names local to the function namespace, containing
references to the arguments. Some of those arguments were provided as
names, in which case the local name contains a copy of the reference
bound to the name provided as an argument. This is, however, merely a
degenerate case of the general instance, in which an expression is
provided as an argument and evaluated, yielding (a reference to) an
object which is then bound to the parameter name in the local namespace.

Quite right.
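
Here is a minimal sketch of that description (show and data are invented for the example). The parameter starts out referring to the very object that was passed in, rebinding it affects only the local name, and an expression argument is simply evaluated first and its result bound to the parameter:

```python
def show(param):
    seen = param            # keep hold of whatever object was passed in
    param = ['rebound']     # rebinding touches only the local name
    return seen

data = ['original']
returned = show(data)

print(returned is data)          # True
print(data)                      # ['original']
print(show(data + ['extra']))    # ['original', 'extra']
```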

  (I guess 'pass by object' is a good name).

Well, I'm not sure why that would be.  What you've just described is
called "pass by value" in every other language.

Sigh. This surely can only be true if you insist that references are
themselves values. I hold that they are not.

Here's an example of the above, I guess. In a language that supports integers and doubles as simple types, stored directly in a variable, then it is an obvious generalization that in the case of an object type, the value is a reference to an object. (Then you can "dereference" such a value to get to the values stored within the object.) It is the only simple and consistent description of such a language (which includes Java, RB, and .NET, as well as C++ if you consider an object pointer equivalent to a reference in more modern languages.)

But Python doesn't have those simple types, so there is a temptation to try to skip this generalization and say that references are not values, but rather the values are the objects themselves (despite the dereferencing step that is still required to get any data out of them). Well, and of course in the case of immutable objects, there is very little observable difference between references and values.

However, it seems to me that when you start denying that the value of an object reference is a reference to an object, this is when you get led into a quagmire of contradictions. Perhaps I'm wrong and I just haven't explored that path far enough, because it appears dark and cobwebby to my eyes. I will try to give it a chance.

It seems so transparent to me that the parameters are copies of the references passed as arguments I find it difficult to understand how, or why, anyone would conceptualize it differently.

Now you seem to be saying the same thing I've been saying all along. But this really is called "pass by value" in at least RB, VB.NET, and Java. And that makes sense to me.

OK, so above you argue quite cogently that Python uses a reference-passing mechanism.

Yes, of course.

This makes your insistence in the preceding paragraph on calling it "pass by value" a little stubborn.

Why?  Are you really meaning to insist that the RB/VB.NET example:

  Function GetAgeInDogYears(ByVal whom As Person) As Integer
    return whom.age * 7
  End Function

is not actually using a by-value parameter? Or that it's not passing an object reference?

Sigh again. You appear to want to have your cake and eat it. You are, in
effect, saying "there are no values in Python, only references",
completely ignoring the fact that it is semantically impossible to have
a reference without having something to *refer to*

Well of course. I'm pretty sure I've said repeatedly that Python variables refer to objects on the heap. (Please replace "heap" with "object space" if you prefer.) I'm only saying that Python variables don't contain any other type of value than references -- no integers or doubles, for example. This is unlike the other languages under discussion (and may be at the root of the confusion).
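
A quick sketch of the claim that even numbers are objects reached through references (n and m are just example names):

```python
n = 3
m = n                          # copies the reference; m and n refer to one object

print(isinstance(n, object))   # True
print(n.bit_length())          # 2, since ints have methods like any other object
print(m is n)                  # True
```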

(which we in the Python world, in our usual sloppy way, often call "a value").

Yes, and as long as we're agreed that this is only a sloppy shorthand, I'm OK with it (especially in the case of immutable objects, where the distinction is irrelevant).

I suspect this may be at the root of our equally stubborn insistence
that calling this mechanism "pass by value" is inviting
misunderstanding. If we didn't want to eliminate misunderstanding we
would all have stopped replying to you long ago.

Ditto right back at you. :) So maybe here's the trouble: since all Python variables are references, there is no need to distinguish reference types from any other types (there aren't any other types). So, with the distinction gone, there is a strong temptation to gloss over the fact that they are references at all, and try to say that the variables directly contain their objects.

But it seems to me that this claim quickly breaks down -- even as you said yourself; you need instead some mental model that shows the variables as pointing to (tied to via strings, associated via a lookup table, or whatever) the objects, which exist in object space. In other words, they're references.

But continuing to attempt to gloss over that fact, when you come to parameter passing, you're then stuck trying to avoid describing it as call by value, since if you claim that what a variable contains is the object itself, then that doesn't fit (since clearly the object itself is not copied). You also have to describe the assignment operator as different from all other languages, since clearly that's not copying the object either.

So you end up in this (to me, very strange) state where you're making up new terms to describe the parameter behavior, and the assignment behavior, which behavior is exactly the same as any other modern OOP language. It makes it (again, IMHO) all seem very much more complex and mysterious than it really is. And this all results inevitably from trying to gloss over the fact that Python variables are references.

So, while I'm trying this path on for size (and will continue to mull it over further), please try on this approach: boldly admit that they're references, and embrace that fact. An assignment copies the RHS reference into the LHS variable, nothing more or less. A parameter copies the argument reference into the formal parameter, nothing more or less. And all this is exactly the same as in any other OOP language the reader is likely to know. Isn't that simple, clear, and far easier to explain?
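
To make that concrete, here is a minimal sketch under the "copy the reference" reading (the names a, b, mutate, and rebind are made up for the example). Mutating through a copied reference is visible to everyone holding that reference, while rebinding a name, whether a plain variable or a formal parameter, affects only that name:

```python
a = [1, 2, 3]
b = a              # copies the reference, not the list
b.append(4)        # mutation is visible through both names
print(a)           # [1, 2, 3, 4]

b = [0]            # rebinding b leaves a alone
print(a)           # [1, 2, 3, 4]

def mutate(seq):
    seq.append(5)  # changes the object the caller passed in

def rebind(seq):
    seq = []       # changes only the local name

mutate(a)
rebind(a)
print(a)           # [1, 2, 3, 4, 5]
```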

Well, I started with Simula and Smalltalk back in 1973, so my experience may be a bit light. Sorry about that. This terminology wasn't made up by
Python beginners, but by the people who invented Python.

Was it? Has our BDFL weighed in on this terminology issue anywhere? So far, the only "official" words I've found related to this discussion are the ones plainly admitting that Python uses references (which some in this thread seem to want to deny, though not you Steve).

I believe they did so on the grounds that it's easier for beginners to understand
Python's semantics without having to reference too many similar in
theory but confusingly different in practice other environments.

I wonder if that could be tested systematically. Perhaps we could round up 20 newbies, divide them into two groups of 10, give each one a 1-page explanation either based on passing object references by-value, or passing values sort-of-kind-of-by-reference, and then check their comprehension by predicting the output of some code snippets. That'd be very interesting. It's hard for me to believe that the glossing-over-references approach really is easier for anybody, but maybe I'm wrong.

I would even argue that your confusion supports this argument. Your
understanding of Python is perfectly adequate, so get with the program
for Pete's sake!

In my case, my understanding of Python became clear only once I stopped listening to all the confusing descriptions here, and realized that Python is no different from other OOP languages I already knew.

Best,
- Joe

