On Apr 25, 8:01 pm, Steven D'Aprano <steve +comp.lang.pyt...@pearwood.info> wrote:
> On Wed, 25 Apr 2012 13:49:24 -0700, Adam Skutt wrote:
> > Though, maybe it's better to use a different keyword than 'is' though,
> > due to the plain English connotations of the term; I like 'sameobj'
> > personally, for whatever little it matters. Really, I think taking
> > away the 'is' operator altogether is better, so the only way to test
> > identity is:
> > id(x) == id(y)
>
> Four reasons why that's a bad idea:
>
> 1) The "is" operator is fast, because it can be implemented directly by
> the interpreter as a simple pointer comparison (or equivalent). The id()
> idiom is slow, because it involves two global lookups and an equality
> comparison. Inside a tight loop, that can make a big difference in speed.

The runtime can optimize the two operations to be equivalent, since
they are logically equivalent operations. If you removed 'is',
there's little reason to believe it would do otherwise.

> 2) The "is" operator always has the exact same semantics and cannot be
> overridden. The id() function can be monkey-patched.

I can't see how that's useful at all. Identity is a fundamental
property of an object; hence retrieval of it must be a language
operation. The fact Python chooses to do otherwise is unfortunate,
but also irrelevant to my position.

> 3) The "is" idiom semantics is direct: "a is b" directly tests the thing
> you want to test, namely whether a is b. The id() idiom is indirect:
> "id(a) == id(b)" only indirectly tests whether a is b.

The two expressions are logically equivalent, so I don't see how this
matters, nor how it is true.

> 4) The id() idiom already breaks if you replace names a, b with
> expressions:
>
> >>> id([1,2]) == id([3,4])
> True

It's not broken at all. The lifetime of temporary objects is
intentionally undefined, and that's a /good/ thing. What's
unfortunate is that CPython optimizes temporaries differently between
the two logically equivalent expressions. As long as this holds:

>>> class A(object):
...     def __del__(self):
...         print "Farewell to: %d" % id(self)
...
>>> A() is A()
Farewell to: 4146953292
Farewell to: 4146953260
False
>>> id(A()) == id(A())
Farewell to: 4146953420
Farewell to: 4146953420
True

then there's nothing "broken" about the behavior of either
expression. I personally think logically equivalent expressions
should give the same results, but since both operations follow the
rules of object identity correctly, it's not the end of the world.
It's only surprising to the programmer if:
1) They don't understand identity.
2) They don't understand what objects are and are not temporaries.

Code that relies on the identity of a temporary object is generally
incorrect.
This is why C++ explicitly forbids taking the address (identity) of
temporaries. As such, the language behavior in your case is
inconsequential. Making demons fly out of the programmer's nose
would be equally appropriate.

The other solution is to do what Java and C# do: banish id() entirely
and only provide 'is' (== in Java, Object.ReferenceEquals() in C#).
That seems just as fine, really. Practically, it's also probably the
better solution for CPython, which is fine by me. My preference for
keeping id() and removing 'is' probably comes from my background as a
C++ programmer, and I already said it matters very little.

> But that's absolutely wrong. id(x) returns an ID, not an address.
> It just happens that, as an accident of implementation, the CPython
> interpreter uses the object address as an ID, because objects can't
> move. That's not the case for all implementations. In Jython, objects
> can move and the address is not static, and so IDs are assigned on
> demand starting with 1:
>
> steve@runes:~$ jython
> Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19)
> [OpenJDK Client VM (Sun Microsystems Inc.)] on java1.6.0_18
> Type "help", "copyright", "credits" or "license" for more information.
> >>> id(42)
> 1
> >>> id("Hello World!")
> 2
> >>> id(None)
> 3

An address is an identifier: a number that I can use to access a
value[1]. I never said that id() must return an address the host CPU
understands (virtual, physical, or otherwise). Most languages use
addresses that the host CPU cannot understand without assistance at
least sometimes, including C on some platforms.

> Other implementations may make other choices. I don't believe that the
> language even defines the id as a number, although I could be wrong about
> that.

http://docs.python.org/library/functions.html#id says it must be an
integer of some sort. Even if it didn't say that, it hardly seems
like a practical imposition.

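To make the "logically equivalent" claim concrete, here is a minimal
sketch for objects whose lifetimes overlap. It uses Python 3 syntax,
which is an assumption (the sessions quoted in this thread are Python
2), and the helper name same_object is hypothetical, borrowed loosely
from the 'sameobj' suggestion above.

```python
# Sketch only: Python 3 syntax; same_object is a hypothetical helper name.

def same_object(a, b):
    """Identity test spelled with id().

    The caller holds references to both arguments for the duration of
    the call, so their lifetimes overlap and their ids cannot collide.
    """
    return id(a) == id(b)


x = [1, 2]
y = x        # a second name bound to the same list
z = [1, 2]   # an equal but distinct list

print(isinstance(id(x), int))      # True
print(x is y, same_object(x, y))   # True True
print(x is z, same_object(x, z))   # False False
```

The id([1,2]) == id([3,4]) case differs only because neither list is
kept alive, so the implementation is free to hand out the same id
twice.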
> Personally, I prefer the Jython approach, because it avoids those
> annoying questions like "How do I dereference the address of an
> object?" (answer: Python is not C, you can't do that),

The right way to solve that question isn't to fix the runtime, but to
teach people what pointer semantics actually mean, much like the
identity problem we're discussing now.

Adam

[1] I'd be more willing to accept a more general definition that
allows for non-numeric addresses, but such things are rare.
-- 
http://mail.python.org/mailman/listinfo/python-list