On Fri, Dec 20, 2013 at 02:04:49AM -0500, Keith Winston wrote:

> I am a little confused about class variables: I feel like I've repeatedly
> seen statements like this:

I don't like the terms "class variable" and "instance variable". In the
Python community, these are usually called class and instance attributes
rather than variables or members. (Sometimes, people will call them
"members", especially if they are used to C#. The meaning here is member
as in an arm or leg, as in "dismember", not member in the sense of
belonging to a group.)

Normally, we say that a string variable is a variable holding a string, a
float variable is a variable holding a float, an integer variable is a
variable holding an integer. So a class variable ought to be a variable
holding a class, and an instance variable ought to be a variable holding
an instance. In Python we can have both of those things!

Unlike in Java, classes in Python are "first-class citizens" and can be
treated exactly the same as strings, floats, ints and other values. So a
"class variable" would be something like this:

    for C in list_of_classes:
        # Here, C holds a class, and so we might call
        # it a "class variable", not a string variable
        do_something_with(C)

> There is only one copy of the class variable and when any one object makes a
> change to a class variable, that change will be seen by all the other
> instances.
> Object variables are owned by each individual object/instance of the class.
> In this case, each object has its own copy

Talking about copies is not a good way to understand this. It might make
sense to talk about copies in some other languages, but not in Python.
(Or any of many languages with similar behaviour, like Ruby or Java.)

I'm going to give you a simple example demonstrating why thinking about
copies is completely the wrong thing to do here. If you already
understand why "copies" is wrong, you can skip ahead, but otherwise you
need to understand this even though it doesn't directly answer your
question.

Given a simple class, we can set an attribute on a couple of instances
and see what happens.
Copy and paste these lines into a Python interactive session, and see if
you can guess what output the print will give:

    class Test:
        pass

    spam = Test()
    eggs = Test()
    obj = []
    spam.attribute = obj
    eggs.attribute = obj
    spam.attribute.append("Surprise!")
    print(eggs.attribute)

If you think about *copies*, you might think that spam and eggs have
their own independent copies of the empty list. But that's not what
Python does. You don't have two copies of the list, you have a single
list, and two independent references to it. (Actually, there are three:
obj, spam.attribute, eggs.attribute.) But only one list, with three
different names.

This is similar to people. For instance, the President of the USA is
known as "Mr President" to his staff, "POTUS" to the military, "Barack"
to his wife Michelle, "Mr Obama" to historians and journalists, "Dad" to
his children, and so forth. But they all refer to the same person. In a
few years, Barack Obama will stand down as president, and somebody else
will be known as "Mr President" and "POTUS", but he'll still be "Barack"
to Michelle.

Python treats objects in exactly the same way. You can have lots of names
for the same object. Some objects, like lists, can be modified in place.
Other objects, like strings and ints, cannot be.

In Python, we refer to this system as "name binding". You have things
which are names, like "obj", and we associate an object with that name.
Another term for this is a "reference", in the generic sense that we
"refer" to things.

So we can bind an object to a name:

    obj = []

We can *unbind* the name as well:

    del obj

In Python, assignment with = is name binding, and not copying:

    spam.attribute = obj

does not make a copy of the list, it just makes "spam.attribute" and
"obj" two different names for the same list. And likewise for
"eggs.attribute".

Hopefully now you can understand why it is wrong to talk about "copies"
here.
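
To make the difference between binding and copying concrete, here is a minimal sketch. It reuses the Test class from above; the use of the "is" operator and of copy.copy() is my own illustration and was not part of the example above.

```python
import copy

class Test:
    pass

spam = Test()
eggs = Test()
obj = []

# Plain assignment binds names to the one list; no copy is made.
spam.attribute = obj
eggs.attribute = obj
same_object = spam.attribute is eggs.attribute is obj

# An explicit copy creates a second, independent list.
eggs.attribute = copy.copy(obj)
spam.attribute.append("Surprise!")

print(same_object)      # True
print(spam.attribute)   # ['Surprise!']
print(obj)              # ['Surprise!']
print(eggs.attribute)   # []
```

Only the explicitly copied list is unaffected by the append; every name bound with plain = still refers to the original list.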
In Python, you only get copies when you explicitly call a function which
makes a copy, and never from = assignment (name binding).

Now let me get back to your original question:

> But when I test, I see some interesting things: first (and this is
> consistent with above) the class variables are created when the class is
> defined, and can be used even without any instances of the class being
> created.

Correct. Not only that, but class attributes will show up from instances
as well:

py> class Parrot:
...     colour = "green"
...     def description(self):
...         return "You see a %s coloured bird." % self.colour
...
py> polly = Parrot()
py> polly.description()
'You see a green coloured bird.'

> Second, initially confusing but maybe I understand... there are pointers to
> the class variables associated with every instance of the object,

Don't think about pointers. That's too specific. It just so happens that
the version of Python you are using *does* use pointers under the hood,
but that's not always the case. For instance, Jython is written in Java,
and IronPython is written for .NET's CLR. Neither of those has pointers,
but they have *something* that will do the same job as a pointer. This
is why we talk about references. The nature of the reference remains the
same no matter what version of Python you use, regardless of how it works
under the hood.

Putting that aside, you're actually mistaken here about there being an
association between the instance and the class attribute. There is no
association between the instance and the class attribute. (Or rather, no
*direct* association. Of course there is an indirect association.) What
actually happens is something rather like this:

Suppose we ask Python for "polly.colour". Python looks at the instance
polly, and checks to see if it has an instance attribute called "colour".
If it does, we're done. But if it doesn't, Python doesn't give up
straight away, it next checks the class of polly, which is Parrot.
Does Parrot have an attribute called "colour"? Yes it does, so that gets
returned.

The actual process is quite complicated, but to drastically
over-simplify, Python will check:

- the instance
- the class
- any super-classes of the class

and only raise an exception if none of these have an attribute of the
right name.

> but if I
> assign THOSE variables new values, it creates new, "local"/instance
> variables.

When you ask for the polly.colour attribute, Python will search the
instance, the class, and any super-classes for a match. What happens when
you try to assign an attribute?

py> polly.colour = 'red'
py> polly.description()
'You see a red coloured bird.'
py> Parrot.colour
'green'

The assignment has created a new name-binding, creating the instance
attribute "colour" which is specific to that one instance, polly. The
class attribute remains untouched, as would any other instances (if we
had any). No copies are made.

So unlike *getting* an attribute, which searches both the instance and
the class, *setting* or *deleting* an attribute stops at the instance.

I like to describe this by saying that class attributes are *shared*.
Unless shadowed by an instance attribute of the same name, a class
attribute is seen by all instances and its content is shared by all.
Instance attributes, on the other hand, are distinct.

> So:
> Class.pi == 3.14  # defined/set in the class def
> instance.pi == 3.14  # initially
> instance.pi = 4  # oops, changed it
> Class.pi == 3.14  # still
> Class.pi = "rhubarb"  # oops, there I go again
> instance.pi == 4  # still
>
> Sorry if I'm beating this to a pulp, I think I've got it... I'm just
> confused because the way they are described feels a little confusing, but
> maybe that's because I'm not taking into account how easy it is to create
> local variables...

Local variables are a whole different thing again. That's another reason
why I dislike the habit of calling these things "instance variables",
borrowed from languages like Java.

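The lookup and shadowing rules described above can be sketched roughly as follows. This is only an illustration under a few assumptions of mine: "percy" is a hypothetical second instance that does not appear in the session above, and vars() is used just to peek at what is actually stored on the instance.

```python
class Parrot:
    colour = "green"  # class attribute, shared by all instances

polly = Parrot()
percy = Parrot()  # hypothetical second instance, for comparison

# Nothing is stored on the instances yet; lookup falls through to the class.
before = (dict(vars(polly)), polly.colour, percy.colour)

# Assignment binds a name on the instance only, shadowing the class attribute.
polly.colour = "red"
shadowed = (dict(vars(polly)), polly.colour, percy.colour, Parrot.colour)

# Rebinding the class attribute is seen by instances that do not shadow it.
Parrot.colour = "blue"
rebound = (polly.colour, percy.colour)

# Deleting the instance attribute uncovers the class attribute again.
del polly.colour
uncovered = polly.colour

print(before)     # ({}, 'green', 'green')
print(shadowed)   # ({'colour': 'red'}, 'red', 'green', 'green')
print(rebound)    # ('red', 'blue')
print(uncovered)  # blue
```

Getting searches the instance and then the class, while setting and deleting only ever touch the instance, which is why polly goes back to seeing the class attribute after the del.
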
--
Steven