Re: is implemented with id ?
On Wednesday, September 5, 2012 10:41:19 PM UTC+8, Steven D'Aprano wrote: On Wed, 05 Sep 2012 10:00:09 -0400, Dave Angel wrote: On 09/05/2012 09:19 AM, Franck Ditter wrote: Thanks to all, but : - I should have said that I work with Python 3. Does that matter ? - May I reformulate the queston : a is b and id(a) == id(b) both mean : a et b share the same physical address. Is that True ? Thanks, No, id() has nothing to do with physical address. The Python language does not specify anything about physical addresses. Some implementations may happen to use physical addresses, others arbitrary integers. And they may reuse such integers, or not. Up to the implementation. True. In principle, some day there might be a version of Python that runs on some exotic quantum computer where the very concept of physical address is meaningless. Or some sort of peptide or DNA computer, where the calculations are performed via molecular interactions rather than by flipping bits in fixed memory locations. But less exotically, Frank isn't entirely wrong. With current day computers, it is reasonable to say that any object has exactly one physical location at any time. In Jython, objects can move around; in CPython, they can't. But at any moment, any object has a specific location, and no other object can have that same location. Two objects cannot both be at the same memory address at the same time. So, for current day computers at least, it is reasonable to say that a is b implies that a and b are the same object at a single location. The second half of the question is more complex: id(a) == id(b) *only* implies that a and b are the same object at the same location if they exist at the same time. If they don't exist at the same time, then you can't conclude anything. -- Steven The function id(x) might not be implemented as an address in the user space. Do we need to distinguish archived objets and objects in the memory? -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
On 4/11/12 06:09:24, Aahz wrote: In article mailman.3250.1351999198.27098.python-l...@python.org, Chris Angelico ros...@gmail.com wrote: On Sun, Nov 4, 2012 at 2:10 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: /* Shortcut for empty or interned objects */ if (v == u) { Py_DECREF(u); Py_DECREF(v); return 0; } result = unicode_compare(u, v); where v and u are pointers to the unicode object. There's a shortcut if they're the same. There's no shortcut if they're both interned and have different pointers, which is a guarantee that they're distinct strings. They'll still be compared char-for-char until there's a difference. Without looking at the code, I'm pretty sure there's a hash check first. In 3.3, there is no such check. It was recently proposed on python-dev to add such a check, but AFAIK, no action was taken. -- HansM -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
[got some free time, catching up to threads two months old] In article 50475822$0$6867$e4fe5...@news2.news.xs4all.nl, Hans Mulder han...@xs4all.nl wrote: On 5/09/12 15:19:47, Franck Ditter wrote: - I should have said that I work with Python 3. Does that matter ? - May I reformulate the queston : a is b and id(a) == id(b) both mean : a et b share the same physical address. Is that True ? Yes. Keep in mind, though, that in some implementation (e.g. Jython), the physical address may change during the life time of an object. It's usually phrased as a and b are the same object. If the object is mutable, then changing a will also change b. If a and b aren't mutable, then it doesn't really matter whether they share a physical address. That last sentence is not quite true. intern() is used to ensure that strings share a physical address to save memory. -- Aahz (a...@pythoncraft.com) * http://www.pythoncraft.com/ Normal is what cuts off your sixth finger and your tail... --Siobhan -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
On 3/11/12 20:41:28, Aahz wrote: [got some free time, catching up to threads two months old] In article 50475822$0$6867$e4fe5...@news2.news.xs4all.nl, Hans Mulder han...@xs4all.nl wrote: On 5/09/12 15:19:47, Franck Ditter wrote: - I should have said that I work with Python 3. Does that matter ? - May I reformulate the queston : a is b and id(a) == id(b) both mean : a et b share the same physical address. Is that True ? Yes. Keep in mind, though, that in some implementation (e.g. Jython), the physical address may change during the life time of an object. It's usually phrased as a and b are the same object. If the object is mutable, then changing a will also change b. If a and b aren't mutable, then it doesn't really matter whether they share a physical address. That last sentence is not quite true. intern() is used to ensure that strings share a physical address to save memory. That's a matter of perspective: in my book, the primary advantage of working with interned strings is that I can use 'is' rather than '==' to test for equality if I know my strings are interned. The space savings are minor; the time savings may be significant. -- HansM -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
On Sat, 03 Nov 2012 22:49:07 +0100, Hans Mulder wrote:
On 3/11/12 20:41:28, Aahz wrote:
[got some free time, catching up to threads two months old]
In article 50475822$0$6867$e4fe5...@news2.news.xs4all.nl, Hans Mulder han...@xs4all.nl wrote:
On 5/09/12 15:19:47, Franck Ditter wrote:
- I should have said that I work with Python 3. Does that matter ?
- May I reformulate the question : a is b and id(a) == id(b) both mean : a and b share the same physical address. Is that True ?

Yes. Keep in mind, though, that in some implementations (e.g. Jython), the physical address may change during the lifetime of an object. It's usually phrased as "a and b are the same object". If the object is mutable, then changing a will also change b. If a and b aren't mutable, then it doesn't really matter whether they share a physical address.

That last sentence is not quite true. intern() is used to ensure that strings share a physical address to save memory.

That's a matter of perspective: in my book, the primary advantage of working with interned strings is that I can use 'is' rather than '==' to test for equality if I know my strings are interned. The space savings are minor; the time savings may be significant.

Actually, for many applications, the space savings may actually be *costs*, since interning forces Python to hold onto strings even after they would normally be garbage collected. CPython interns strings that look like identifiers. It really wouldn't be a good idea for it to automatically intern every string.

You can make your own intern system with a simple dict:

    interned_strings = {}

Then, for every string you care about, do:

    s = interned_strings.setdefault(s, s)

to ensure you are always working with a single string object for each unique value. In some applications that will save time at the expense of space.

And there is no need to write 'is' instead of '==', because string equality already optimizes the "strings are identical" case. By using '==', you don't get into bad habits, you defend against the odd un-interned string sneaking in, and you still have high speed equality tests.

-- Steven
Re: is implemented with id ?
In article 50959154$0$6880$e4fe5...@news2.news.xs4all.nl, Hans Mulder han...@xs4all.nl wrote: That's a matter of perspective: in my book, the primary advantage of working with interned strings is that I can use 'is' rather than '==' to test for equality if I know my strings are interned. The space savings are minor; the time savings may be significant. Depending on your problem domain, the space savings may be considerable. -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
On Sun, Nov 4, 2012 at 9:18 AM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:
On Sat, 03 Nov 2012 22:49:07 +0100, Hans Mulder wrote:
Actually, for many applications, the space savings may actually be *costs*, since interning forces Python to hold onto strings even after they would normally be garbage collected. CPython interns strings that look like identifiers. It really wouldn't be a good idea for it to automatically intern every string.

I don't know about that.

    /* This dictionary holds all interned unicode strings. Note that references
       to strings in this dictionary are *not* counted in the string's ob_refcnt.
       When the interned string reaches a refcnt of 0 the string deallocation
       function will delete the reference from this dictionary.

       Another way to look at this is that to say that the actual reference
       count of a string is: s->ob_refcnt + (s->state ? 2 : 0)
    */
    static PyObject *interned;

Empirical testing (on a Linux 3.3a0 that I had lying around) showed the process's memory usage drop, but I closed the terminal before copying and pasting (oops). Attempting to recreate in IDLE on 3.2 on Windows.

    >>> a="$"*1024*1024*256  # Make $$$$$$ fast!
    >>> import sys
    >>> sys.getsizeof(a)  # Clearly this is a narrow build
    536870942
    >>> a="$"*1024*1024*256

-- MemoryError. Blah. This is what I get for only having a gig and a half in this laptop. And I was working with 1024*1024*1024 on the other box. Start over...

    >>> import sys
    >>> a="$"*1024*1024*128
    >>> b="$"*1024*1024*128
    >>> a is b
    False
    >>> a=sys.intern(a)
    >>> b=sys.intern(b)
    >>> c="$"*1024*1024*128
    >>> c=sys.intern(c)

Memory usage (according to Task Mangler) goes up to ~512MB when I create a new string (like c), then back down to ~256MB when I intern it. So far so good.

    >>> del a,b,c

Memory usage has dropped to 12MB. Unnecessarily-interned strings don't cost anything. (The source does refer to immortal interned strings, but AFAIK you can't create them in user-level code. At least, I didn't find it in help(sys.intern) which is the obvious place to look.)
You can make your own intern system with a simple dict:

    interned_strings = {}

Then, for every string you care about, do:

    s = interned_strings.setdefault(s, s)

to ensure you are always working with a single string object for each unique value. In some applications that will save time at the expense of space.

Doing it manually like this _will_ leak like that, though, unless you periodically check sys.getrefcount and dispose of unreferenced entries.

And there is no need to write 'is' instead of '==', because string equality already optimizes the "strings are identical" case. By using '==', you don't get into bad habits, you defend against the odd un-interned string sneaking in, and you still have high speed equality tests.

This one I haven't checked the source for, but ISTR discussions on this list about comparison of two unequal interned strings not being optimized, so they'll end up being compared char-for-char. Using 'is' guarantees that the check stops with identity. This may or may not be significant, and as you say, defending against an uninterned string slipping through is potentially critical.

ChrisA
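[Editorial sketch] The leak mentioned above for a hand-rolled table can be seen directly: the dict keeps its own references, so an entry survives after the caller drops the string. The names table and intern_via_dict are hypothetical, chosen for this illustration only.

```python
# Hypothetical hand-rolled intern table.
table = {}


def intern_via_dict(s):
    """Return the stored string equal to s, adding s to the table if needed."""
    return table.setdefault(s, s)


s = intern_via_dict("".join(["temporary", "-", "value"]))
del s  # the caller is finished with the string...

# ...but the table still refers to it as both key and value.
entries_after_del = len(table)
print(entries_after_del)

# Disposal has to be explicit, for example by clearing the table.
table.clear()
entries_after_clear = len(table)
print(entries_after_clear)
```

How and when to prune entries is a design decision for the application; clearing the whole table is only the bluntest option.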
Re: is implemented with id ?
On 3 November 2012 22:50, Chris Angelico ros...@gmail.com wrote: This one I haven't checked the source for, but ISTR discussions on this list about comparison of two unequal interned strings not being optimized, so they'll end up being compared char-for-char. Using 'is' guarantees that the check stops with identity. This may or may not be significant, and as you say, defending against an uninterned string slipping through is potentially critical. The source is here (and it shows what you suggest): http://hg.python.org/cpython/file/6c639a1ff53d/Objects/unicodeobject.c#l6128 Comparing strings char for char is really not that big a deal though. This has been discussed before: you don't need to compare very many characters to conclude that strings are unequal (if I remember correctly you were part of that discussion). I can imagine cases where I might consider using intern on lots of strings to speed up comparisons but I would have to be involved in some seriously heavy and obscure string processing problem before I considered using 'is' to compare those interned strings. That is confusing to anyone who reads the code, prone to bugs and unlikely to achieve the desired outcome of speeding things up (noticeably). Oscar -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
On Sun, Nov 4, 2012 at 12:14 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 3 November 2012 22:50, Chris Angelico ros...@gmail.com wrote: This one I haven't checked the source for, but ISTR discussions on this list about comparison of two unequal interned strings not being optimized, so they'll end up being compared char-for-char. Using 'is' guarantees that the check stops with identity. This may or may not be significant, and as you say, defending against an uninterned string slipping through is potentially critical. The source is here (and it shows what you suggest): http://hg.python.org/cpython/file/6c639a1ff53d/Objects/unicodeobject.c#l6128 Comparing strings char for char is really not that big a deal though. This has been discussed before: you don't need to compare very many characters to conclude that strings are unequal (if I remember correctly you were part of that discussion). Yes, and a quite wide-ranging discussion it was too! What color did we end up whitewashing that bikeshed? *whistles innocently* I can imagine cases where I might consider using intern on lots of strings to speed up comparisons but I would have to be involved in some seriously heavy and obscure string processing problem before I considered using 'is' to compare those interned strings. That is confusing to anyone who reads the code, prone to bugs and unlikely to achieve the desired outcome of speeding things up (noticeably). Good point. It's still true that 'is' will be faster, it's just not worth it. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
On Sun, 04 Nov 2012 01:14:29 +, Oscar Benjamin wrote:
On 3 November 2012 22:50, Chris Angelico ros...@gmail.com wrote:
This one I haven't checked the source for, but ISTR discussions on this list about comparison of two unequal interned strings not being optimized, so they'll end up being compared char-for-char. Using 'is' guarantees that the check stops with identity. This may or may not be significant, and as you say, defending against an uninterned string slipping through is potentially critical.

The source is here (and it shows what you suggest): http://hg.python.org/cpython/file/6c639a1ff53d/Objects/unicodeobject.c#l6128

I don't think it does, although I could be wrong, I find reading C to be quite difficult. The unicode_compare function compares character by character, true, but it doesn't get called directly. The public interface is PyUnicode_Compare, which includes this test before calling unicode_compare:

    /* Shortcut for empty or interned objects */
    if (v == u) {
        Py_DECREF(u);
        Py_DECREF(v);
        return 0;
    }
    result = unicode_compare(u, v);

where v and u are pointers to the unicode object. So it appears that the test for strings being of equal length has been dropped, but the identity test is still present.

Comparing strings char for char is really not that big a deal though.

Depends on how big the string is and where the first difference is.

This has been discussed before: you don't need to compare very many characters to conclude that strings are unequal (if I remember correctly you were part of that discussion).

On average. Worst case, you have to look at every character.

-- Steven
Re: is implemented with id ?
On Sun, Nov 4, 2012 at 2:10 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: /* Shortcut for empty or interned objects */ if (v == u) { Py_DECREF(u); Py_DECREF(v); return 0; } result = unicode_compare(u, v); where v and u are pointers to the unicode object. There's a shortcut if they're the same. There's no shortcut if they're both interned and have different pointers, which is a guarantee that they're distinct strings. They'll still be compared char-for-char until there's a difference. But it probably isn't enough of a performance penalty to be concerned with. It's enough to technically prove the point that 'is' is faster than '==' and is still safe if both strings are interned; it's not enough to make 'is' better than '==', except in very specific situations. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
In article mailman.3250.1351999198.27098.python-l...@python.org, Chris Angelico ros...@gmail.com wrote: On Sun, Nov 4, 2012 at 2:10 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: /* Shortcut for empty or interned objects */ if (v == u) { Py_DECREF(u); Py_DECREF(v); return 0; } result = unicode_compare(u, v); where v and u are pointers to the unicode object. There's a shortcut if they're the same. There's no shortcut if they're both interned and have different pointers, which is a guarantee that they're distinct strings. They'll still be compared char-for-char until there's a difference. Without looking at the code, I'm pretty sure there's a hash check first. -- Aahz (a...@pythoncraft.com) * http://www.pythoncraft.com/ Normal is what cuts off your sixth finger and your tail... --Siobhan -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
In article 50959827$0$29967$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: Actually, for many applications, the space savings may actually be *costs*, since interning forces Python to hold onto strings even after they would normally be garbage collected. That's old news, fixed in 2.5 or 2.6 IIRC -- interned strings now get collected by refcounting like everything else. -- Aahz (a...@pythoncraft.com) * http://www.pythoncraft.com/ Normal is what cuts off your sixth finger and your tail... --Siobhan -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
In article 50959154$0$6880$e4fe5...@news2.news.xs4all.nl, Hans Mulder han...@xs4all.nl wrote: On 3/11/12 20:41:28, Aahz wrote: In article 50475822$0$6867$e4fe5...@news2.news.xs4all.nl, Hans Mulder han...@xs4all.nl wrote: On 5/09/12 15:19:47, Franck Ditter wrote: - I should have said that I work with Python 3. Does that matter ? - May I reformulate the queston : a is b and id(a) == id(b) both mean : a et b share the same physical address. Is that True ? Yes. Keep in mind, though, that in some implementation (e.g. Jython), the physical address may change during the life time of an object. It's usually phrased as a and b are the same object. If the object is mutable, then changing a will also change b. If a and b aren't mutable, then it doesn't really matter whether they share a physical address. That last sentence is not quite true. intern() is used to ensure that strings share a physical address to save memory. That's a matter of perspective: in my book, the primary advantage of working with interned strings is that I can use 'is' rather than '==' to test for equality if I know my strings are interned. The space savings are minor; the time savings may be significant. As others have pointed out, using ``is`` with strings is a Bad Habit likely leading to nasty, hard-to-find bugs. intern() costs time, but saves considerable space in any application with lots of duplicate computed strings (hundreds of megabytes in some cases). -- Aahz (a...@pythoncraft.com) * http://www.pythoncraft.com/ Normal is what cuts off your sixth finger and your tail... --Siobhan -- http://mail.python.org/mailman/listinfo/python-list
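[Editorial sketch] The space argument for interning duplicate computed strings can be illustrated by counting distinct objects. This assumes Python 3, where the function lives at sys.intern; the "key-%d" values are made up for the example. How many separate objects the uninterned list holds is implementation-dependent, so only the interned count is examined.

```python
import sys

# 300 computed strings, but only three distinct values among them.
raw = ["key-%d" % (i % 3) for i in range(300)]

# After interning, equal values are represented by one shared object each.
shared = [sys.intern(s) for s in raw]

distinct_values = len(set(shared))
# Every string in `shared` is alive here, so comparing ids is meaningful.
distinct_objects = len({id(s) for s in shared})
print(distinct_values, distinct_objects)
```

Once `raw` is dropped, the 300 list slots in `shared` refer to just three string objects.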
Re: is implemented with id ?
On Wed, 05 Sep 2012 14:27:44 -0400, Terry Reedy wrote: On 9/5/2012 8:48 AM, Ramchandra Apte wrote: and a==True should be automatically changed into memory comparison. I have no idea what that means. I interpret this as meaning that a == True should be special-cased by the interpreter as a is True instead of calling a.__eq__. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
On Thursday, 6 September 2012 12:14:19 UTC+5:30, Steven D'Aprano wrote: On Wed, 05 Sep 2012 14:27:44 -0400, Terry Reedy wrote: On 9/5/2012 8:48 AM, Ramchandra Apte wrote: and a==True should be automatically changed into memory comparison. I have no idea what that means. I interpret this as meaning that a == True should be special-cased by the interpreter as a is True instead of calling a.__eq__. -- Steven Steven you are right. -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
On Wednesday, 5 September 2012 19:43:30 UTC+5:30, Steven D'Aprano wrote:
On Wed, 05 Sep 2012 05:48:26 -0700, Ramchandra Apte wrote:
Seeing this thread, I think the is statement should be removed. It has a replacement syntax of id(x) == id(y)

A terrible idea. Because 'is' is a keyword, it is implemented as a fast object comparison directly in C (for CPython) or Java (for Jython). In the C implementation "x is y" is *extremely* fast because it is just a pointer comparison performed directly by the interpreter.

Because id() is a function, it is much slower. And because it is not a keyword, Python needs to do a name look-up for it, then push the argument on the stack, call the function (which may not even be the built-in id() any more!) and then pop back to the caller.

And worst, *it doesn't even do what you think it does*. In some Python implementations, IDs can be reused. That leads to code like this, from CPython 2.7:

    py> id("spam ham"[1:]) == id("foo bar"[1:])
    True

You *cannot* replace 'is' with id() except when the objects are guaranteed to both be alive at the same time, and even then you *shouldn't* replace 'is' with id() because that is a pessimation (the opposite of an optimization -- something that makes code run slower, not faster).

and a==True should be automatically changed into memory comparison.

Absolutely not. That would be a backward-incompatible change that would break existing programs:

    py> 1.0 == True
    True
    py> from decimal import Decimal
    py> Decimal(1.) == True
    True

-- Steven

the is statement could be made into a function
Re: is implemented with id ?
On Thu, Sep 6, 2012 at 6:26 PM, Ramchandra Apte maniandra...@gmail.com wrote: the is statement could be made into a function It's not a statement, it's an operator; and functions have far more overhead than direct operators. There's little benefit in making 'is' into a function, and high cost; unlike 'print', whose cost is dominated by the cost of producing output to a console or similar device, 'is' would be dominated by the cost of name lookups and function call overhead. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
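[Editorial sketch] For readers who do want identity comparison in function form, for example to pass it as a callback, the standard library's operator module already provides operator.is_. The snippet below only shows that it agrees with the 'is' operator; it makes no claim about relative speed.

```python
import operator

a = [1, 2, 3]
b = a          # second name for the same list
c = list(a)    # equal contents, but a separate object

print(operator.is_(a, b), a is b)
print(operator.is_(a, c), a is c)
```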
Re: is implemented with id ?
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:
But less exotically, Franck isn't entirely wrong. With current day computers, it is reasonable to say that any object has exactly one physical location at any time. In Jython, objects can move around; in CPython, they can't. But at any moment, any object has a specific location, and no other object can have that same location. Two objects cannot both be at the same memory address at the same time.

It is however perfectly possible for one object to be at two or more memory addresses at the same time. In fact some work being done in PyPy right now is doing exactly that as part of Armin Rigo's software transactional memory implementation: when a global object is mutated a new copy is made and some threads may see the new version while other threads continue to see the old version until their transactions are committed (or aborted). This means that global objects can be safely read from multiple threads without any semaphore locking.

See http://mail.python.org/pipermail/pypy-dev/2012-September/010513.html

-- Duncan Booth http://kupuguy.blogspot.com
Re: is implemented with id ?
On Thu, Sep 6, 2012 at 7:34 PM, Duncan Booth duncan.booth@invalid.invalid wrote: Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: But at any moment, any object has a specific location, and no other object can have that same location. Two objects cannot both be at the same memory address at the same time. It is however perfectly possible for one object to be at two or more memory addresses at the same time. And of course, memory addresses have to be taken as per-process, since it's entirely possible for two processes to reuse addresses. But I think all these considerations of object identity are made with the assumption that we're working within a single Python process. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
In article 50484643$0$29977$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: On Wed, 05 Sep 2012 14:27:44 -0400, Terry Reedy wrote: On 9/5/2012 8:48 AM, Ramchandra Apte wrote: and a==True should be automatically changed into memory comparison. I have no idea what that means. I interpret this as meaning that a == True should be special-cased by the interpreter as a is True instead of calling a.__eq__. That would break classes which provide their own __eq__() method. -- http://mail.python.org/mailman/listinfo/python-list
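[Editorial sketch] To make the objection concrete, here is a small class with its own __eq__. The class name Flag and its behaviour are invented for illustration. Today "flag == True" consults Flag.__eq__; if the interpreter rewrote it to "flag is True", the answer would change.

```python
class Flag:
    """Hypothetical wrapper that defines its own notion of equality."""

    def __init__(self, value):
        self.value = bool(value)

    def __eq__(self, other):
        # Compare by wrapped truth value instead of by identity.
        if isinstance(other, Flag):
            return self.value == other.value
        return self.value == other

    def __hash__(self):
        # Keep instances hashable, consistent with __eq__.
        return hash(self.value)


on = Flag(1)
equal_result = (on == True)      # calls Flag.__eq__
identity_result = (on is True)   # identity: a Flag is not the True singleton
print(equal_result, identity_result)
```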
Re: is implemented with id ?
On Thursday, 6 September 2012 17:46:38 UTC+5:30, Roy Smith wrote: In article 50484643$0$29977$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: On Wed, 05 Sep 2012 14:27:44 -0400, Terry Reedy wrote: On 9/5/2012 8:48 AM, Ramchandra Apte wrote: and a==True should be automatically changed into memory comparison. I have no idea what that means. I interpret this as meaning that a == True should be special-cased by the interpreter as a is True instead of calling a.__eq__. That would break classes which provide their own __eq__() method. There is a way of doing this: make True.__req__ = lambda other: self is other -- http://mail.python.org/mailman/listinfo/python-list
is implemented with id ?
Hi !
a is b <==> id(a) == id(b) in builtin classes.
Is that true ?
Thanks,
franck
Re: is implemented with id ?
On Tue, Sep 4, 2012 at 11:30 PM, Franck Ditter fra...@ditter.org wrote: Hi ! a is b == id(a) == id(b) in builtin classes. Is that true ? Thanks, franck No. It is true that if a is b then id(a) == id(b) but the reverse is not necessarily true. id is only guaranteed to be unique among objects alive at the same time. If objects are discarded, their ids may be reused even though the objects are not the same. -- http://mail.python.org/mailman/listinfo/python-list
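[Editorial sketch] The safe half of the equivalence can be checked with objects that are all alive at the same time. The variable names are arbitrary. Whether an id is reused after an object is discarded depends on the implementation, so that half is deliberately not demonstrated here.

```python
a = ["some", "object"]
b = a                    # same object under a second name
c = ["some", "object"]   # equal contents, but a separate list

print(a is b, id(a) == id(b))
print(a is c, id(a) == id(c))
```

Because a, b and c are all bound for the whole snippet, identity and id comparison agree in both lines.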
Re: is implemented with id ?
On Wed, 05 Sep 2012 08:30:31 +0200, Franck Ditter wrote:
Hi ! a is b <==> id(a) == id(b) in builtin classes. Is that true ?

Not just for builtin classes, for any objects, provided that they are alive at the same time.

There is no guarantee whether IDs will be re-used. Some versions of Python do re-use IDs, e.g. CPython:

    steve@runes:~$ python
    Python 2.6.6 (r266:84292, Dec 27 2010, 00:02:40)
    [GCC 4.4.5] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a = ["some", "object"]
    >>> id(a)
    3074285228L
    >>> del a
    >>> b = [100, 200]
    >>> id(b)
    3074285228L

but others do not, e.g. Jython and IronPython:

    steve@runes:~$ jython
    Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19)
    [OpenJDK Client VM (Sun Microsystems Inc.)] on java1.6.0_18
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a = ["some", "object"]
    >>> id(a)
    1
    >>> del a
    >>> b = [100, 200]
    >>> id(b)
    2

    steve@runes:~$ ipy
    IronPython 2.6 Beta 2 DEBUG (2.6.0.20) on .NET 2.0.50727.1433
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a = ["some", "object"]
    >>> id(a)
    43
    >>> del a
    >>> b = [100, 200]
    >>> id(b)
    44

CPython especially has the most complicated behaviour with IDs and object identity:

    >>> a = 99.99
    >>> b = 99.99
    >>> a is b
    False
    >>> a = 99.99; b = 99.99; a is b
    True

In general, you almost never need to care about IDs and object identity. The main exception is testing for None, which should always be written as:

    if x is None

-- Steven
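[Editorial sketch] One reason the None test is written with 'is' is that '==' can be overridden by the object being tested. The class below is a contrived assumption for illustration, not something from the thread.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


x = AlwaysEqual()

looks_like_none = (x == None)   # answered by AlwaysEqual.__eq__
really_is_none = (x is None)    # identity test, cannot be overridden
print(looks_like_none, really_is_none)
```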
Re: is implemented with id ?
On Wednesday, 5 September 2012 14:44:23 UTC+5:30, Steven D'Aprano wrote: On Wed, 05 Sep 2012 08:30:31 +0200, Franck Ditter wrote: Hi ! a is b == id(a) == id(b) in builtin classes. Is that true ? Not just for builtin classes, for any objects, provided that they are Seeing this thread, I think the is statment should be removed. It has a replacement syntax of id(x) == id(y) and a==True should be automatically changed into memory comparison. alive at the same time. There is no guarantee whether IDs will be re-used. Some versions of Python do re-use IDs, e.g. CPython: steve@runes:~$ python Python 2.6.6 (r266:84292, Dec 27 2010, 00:02:40) [GCC 4.4.5] on linux2 Type help, copyright, credits or license for more information. a = [some, object] id(a) 3074285228L del a b = [100, 200] id(b) 3074285228L but others do not, e.g. Jython and IronPython: steve@runes:~$ jython Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19) [OpenJDK Client VM (Sun Microsystems Inc.)] on java1.6.0_18 Type help, copyright, credits or license for more information. a = [some, object] id(a) 1 del a b = [100, 200] id(b) 2 steve@runes:~$ ipy IronPython 2.6 Beta 2 DEBUG (2.6.0.20) on .NET 2.0.50727.1433 Type help, copyright, credits or license for more information. a = [some, object] id(a) 43 del a b = [100, 200] id(b) 44 CPython especially has the most complicated behaviour with IDs and object identity: a = 99.99 b = 99.99 a is b False a = 99.99; b = 99.99; a is b True In general, you almost never need to care about IDs and object identity. The main exception is testing for None, which should always be written as: if x is None -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
Thanks to all, but : - I should have said that I work with Python 3. Does that matter ? - May I reformulate the queston : a is b and id(a) == id(b) both mean : a et b share the same physical address. Is that True ? Thanks, franck In article mailman.213.1346827305.27098.python-l...@python.org, Benjamin Kaplan benjamin.kap...@case.edu wrote: On Tue, Sep 4, 2012 at 11:30 PM, Franck Ditter fra...@ditter.org wrote: Hi ! a is b == id(a) == id(b) in builtin classes. Is that true ? Thanks, franck No. It is true that if a is b then id(a) == id(b) but the reverse is not necessarily true. id is only guaranteed to be unique among objects alive at the same time. If objects are discarded, their ids may be reused even though the objects are not the same. -- http://mail.python.org/mailman/listinfo/python-list
Re: is implemented with id ?
On 09/05/2012 08:48 AM, Ramchandra Apte wrote:
On Wednesday, 5 September 2012 14:44:23 UTC+5:30, Steven D'Aprano wrote:
snip
Seeing this thread, I think the is statement should be removed. It has a replacement syntax of id(x) == id(y) and a==True should be automatically changed into memory comparison.

You didn't read the whole message carefully enough. IDs can be reused, so there are many ways to mess up comparing IDs. One is if the two items x and y are expressions (e.g. function calls). You call a function, and say it returns a new object, you call id() on that object, and then the object gets discarded. You now have a stale id, and you haven't even evaluated the second expression yet.

It's id() which is superfluous. But it's useful for debugging, and for understanding.

-- DaveA
Re: is implemented with id ?
On 5/09/12 15:19:47, Franck Ditter wrote:
Thanks to all, but :
- I should have said that I work with Python 3. Does that matter ?
- May I reformulate the question : a is b and id(a) == id(b) both mean : a and b share the same physical address. Is that True ?

Yes. Keep in mind, though, that in some implementations (e.g. Jython), the physical address may change during the lifetime of an object.

It's usually phrased as "a and b are the same object". If the object is mutable, then changing a will also change b. If a and b aren't mutable, then it doesn't really matter whether they share a physical address.

Keep in mind that physical addresses can be reused when an object is destroyed. For example, in my Python3, id(math.sqrt(17)) == id(math.cos(17)) returns True, even though the floats involved are different, because the floats have non-overlapping lifetimes and the physical address happens to be reused.

Hope this helps,

-- HansM
Re: is implemented with id ?
Please don't top-post. Now your message is out of order, and I have to delete the part Benjamin said.

On 09/05/2012 09:19 AM, Franck Ditter wrote:
> Thanks to all, but:
> - I should have said that I work with Python 3. Does that matter?
> - May I reformulate the question: "a is b" and "id(a) == id(b)" both mean that a and b share the same physical address. Is that true?
> Thanks,

No, id() has nothing to do with physical address. The Python language does not specify anything about physical addresses. Some implementations may happen to use physical addresses, others arbitrary integers. And they may reuse such integers, or not. Up to the implementation.

And as others have pointed out, when you compare two ids, you're risking that one of them may no longer be valid. For example, the following expression:

    flag = id(func1()) == id(func2())

could very well evaluate to True, even if func1() always returns a string, and func2() always returns an int. On the other hand, the 'is' expression makes sure the two expressions are bound to the same object.

If a and b are simple names, and not placeholders for arbitrary expressions, then I THINK the following would be true:

"a is b" and "id(a) == id(b)" both mean that the names a and b are bound to the same object at the time the statement is executed.

-- DaveA
Re: is implemented with id ?
On Wed, 05 Sep 2012 05:48:26 -0700, Ramchandra Apte wrote:
> Seeing this thread, I think the "is" statement should be removed. It has a replacement syntax of id(x) == id(y)

A terrible idea.

Because "is" is a keyword, it is implemented as a fast object comparison directly in C (for CPython) or Java (for Jython). In the C implementation "x is y" is *extremely* fast because it is just a pointer comparison performed directly by the interpreter.

Because id() is a function, it is much slower. And because it is not a keyword, Python needs to do a name look-up for it, then push the argument on the stack, call the function (which may not even be the built-in id() any more!) and then pop back to the caller.

And worst, *it doesn't even do what you think it does*. In some Python implementations, IDs can be reused. That leads to code like this, from CPython 2.7:

    py> id("spam ham"[1:]) == id("foo bar"[1:])
    True

You *cannot* replace "is" with id() except when the objects are guaranteed to both be alive at the same time, and even then you *shouldn't* replace "is" with id() because that is a pessimation (the opposite of an optimization -- something that makes code run slower, not faster).

> and a==True should be automatically changed into memory comparison.

Absolutely not. That would be a backward-incompatible change that would break existing programs:

    py> 1.0 == True
    True
    py> from decimal import Decimal
    py> Decimal(1.) == True
    True

-- Steven
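To make the last point concrete, here is a small sketch, assuming Python 3, of how value comparison and identity comparison against True differ. The list names values, equal_to_true and identical_to_true are made up for this illustration.

```python
from decimal import Decimal

values = [1, 1.0, Decimal(1), True]

# == compares values: all of these are numerically equal to True.
equal_to_true = [v == True for v in values]
# "is" compares identity: only the True singleton itself qualifies.
identical_to_true = [v is True for v in values]

print(equal_to_true)      # [True, True, True, True]
print(identical_to_true)  # [False, False, False, True]
```

Rewriting a == True as an identity test would therefore change the result for the first three values.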
Re: is implemented with id ?
On Wed, 05 Sep 2012 10:00:09 -0400, Dave Angel wrote:
> On 09/05/2012 09:19 AM, Franck Ditter wrote:
>> Thanks to all, but:
>> - I should have said that I work with Python 3. Does that matter?
>> - May I reformulate the question: "a is b" and "id(a) == id(b)" both mean that a and b share the same physical address. Is that true?
>> Thanks,
>
> No, id() has nothing to do with physical address. The Python language does not specify anything about physical addresses. Some implementations may happen to use physical addresses, others arbitrary integers. And they may reuse such integers, or not. Up to the implementation.

True. In principle, some day there might be a version of Python that runs on some exotic quantum computer where the very concept of "physical address" is meaningless. Or some sort of peptide or DNA computer, where the calculations are performed via molecular interactions rather than by flipping bits in fixed memory locations.

But less exotically, Franck isn't entirely wrong. With current day computers, it is reasonable to say that any object has exactly one physical location at any time. In Jython, objects can move around; in CPython, they can't. But at any moment, any object has a specific location, and no other object can have that same location. Two objects cannot both be at the same memory address at the same time.

So, for current day computers at least, it is reasonable to say that "a is b" implies that a and b are the same object at a single location.

The second half of the question is more complex: "id(a) == id(b)" *only* implies that a and b are the same object at the same location if they exist at the same time. If they don't exist at the same time, then you can't conclude anything.

-- Steven
Re: is implemented with id ?
On 09/05/2012 10:41 AM, Steven D'Aprano wrote:
> On Wed, 05 Sep 2012 10:00:09 -0400, Dave Angel wrote:
>> On 09/05/2012 09:19 AM, Franck Ditter wrote:
>>> Thanks to all, but:
>>> - I should have said that I work with Python 3. Does that matter?
>>> - May I reformulate the question: "a is b" and "id(a) == id(b)" both mean that a and b share the same physical address. Is that true?
>>> Thanks,
>>
>> No, id() has nothing to do with physical address. The Python language does not specify anything about physical addresses. Some implementations may happen to use physical addresses, others arbitrary integers. And they may reuse such integers, or not. Up to the implementation.
>
> True. In principle, some day there might be a version of Python that runs on some exotic quantum computer where the very concept of "physical address" is meaningless. Or some sort of peptide or DNA computer, where the calculations are performed via molecular interactions rather than by flipping bits in fixed memory locations.
>
> But less exotically, Franck isn't entirely wrong. With current day computers, it is reasonable to say that any object has exactly one physical location at any time. In Jython, objects can move around; in CPython, they can't. But at any moment, any object has a specific location, and no other object can have that same location. Two objects cannot both be at the same memory address at the same time.
>
> So, for current day computers at least, it is reasonable to say that "a is b" implies that a and b are the same object at a single location.

You're arguing against something I didn't say. I only said that id() doesn't promise to be a memory address. I said nothing about what it might mean if the "is" operator considers them the same.

> The second half of the question is more complex: "id(a) == id(b)" *only* implies that a and b are the same object at the same location if they exist at the same time. If they don't exist at the same time, then you can't conclude anything.

But by claiming that id() really means address, and that those addresses might move during the lifetime of an object, then the fact that the id() functions are not called simultaneously implies that one object might move to where the other one used to be before the "move."

I don't claim to know the Jython implementation. But you're claiming that id() means the address of the object, even in Jython. So if a garbage collection can occur during the evaluation of the expression

    id(a) == id(b)

then the comparing of ids would be useless in Jython. Two distinct objects could each be moved during evaluation, (very) coincidentally causing the two to have the same addresses at the two times of evaluation. Or more likely, a single object could move to a new location, rendering the comparison false. Thus you have both a false positive and a false negative possible.

I think it much more likely that Jython uses integer values for the id() function, and not physical addresses. I doubt they'd want a race condition.

-- DaveA
Re: is implemented with id ?
On Wed, 05 Sep 2012 11:09:30 -0400, Dave Angel wrote:
> On 09/05/2012 10:41 AM, Steven D'Aprano wrote:
> [...]
>> So, for current day computers at least, it is reasonable to say that "a is b" implies that a and b are the same object at a single location.
>
> You're arguing against something I didn't say. I only said that id() doesn't promise to be a memory address. I said nothing about what it might mean if the "is" operator considers them the same.

I'm not arguing at all. I'm agreeing with you, but going into more detail.

>> The second half of the question is more complex: "id(a) == id(b)" *only* implies that a and b are the same object at the same location if they exist at the same time. If they don't exist at the same time, then you can't conclude anything.
>
> But by claiming that id() really means address,

I didn't actually say that. If you re-read Franck Ditter's previous post, he doesn't actually say that either.

> and that those addresses might move during the lifetime of an object, then the fact that the id() functions are not called simultaneously implies that one object might move to where the other one used to be before the "move."

Well, yes, but I expect that implementations where objects can move will not use memory addresses as IDs. They will do what Jython and IronPython do and use arbitrary numbers as IDs. (Oh how I wish CPython hadn't used memory addresses as IDs.)

> I don't claim to know the Jython implementation. But you're claiming that id() means the address of the object, even in Jython.

Good god no! I'm saying that, *if* a and b exist at the same time, *and* if id(a) == id(b), *then* a and b must be the same object and therefore at the same address. That doesn't mean that the ID is the address!

> I think it much more likely that Jython uses integer values for the id() function, and not physical addresses.

That's exactly what it does. It appears to be a simple counter: each time you ask for an object's ID, it gets allocated the next value starting from 1, and values are never re-used.

-- Steven
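The counter scheme described above can be sketched in a few lines. This is a hypothetical illustration, not Jython's actual code: the class CounterIds and its method ident are invented here, and the sketch keeps every object alive forever, which a real implementation would presumably avoid.

```python
import itertools


class CounterIds:
    """Hypothetical sketch: hand out 1, 2, 3, ... the first time an object is seen."""

    def __init__(self):
        self._next = itertools.count(1)
        # Maps the built-in id() to (object, assigned number). Storing the
        # object keeps it alive, so the built-in id used as key is never recycled.
        self._seen = {}

    def ident(self, obj):
        key = id(obj)
        if key not in self._seen:
            self._seen[key] = (obj, next(self._next))
        return self._seen[key][1]


ids = CounterIds()
a, b = [], []
print(ids.ident(a), ids.ident(b), ids.ident(a))  # 1 2 1
```

Asking again for an object that has already been numbered returns the same number, and no number is ever handed out twice.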
Re: is implemented with id ?
On 09/05/12 08:46, Dave Angel wrote:
> It's id() which is superfluous. But it's useful for debugging, and for understanding.

While I assiduously work to eschew shadowing most built-in names such as "list" or "str", I do make an exception for "id" because it's *so* useful in code, and the built-in nets me almost nothing (well, nothing that I generally care about) that "is" doesn't already provide me.

-tkc
Re: is implemented with id ?
On 5/09/12 17:09:30, Dave Angel wrote:
> But by claiming that id() really means address, and that those addresses might move during the lifetime of an object, then the fact that the id() functions are not called simultaneously implies that one object might move to where the other one used to be before the "move."

Whoa! Not so fast! The id() of an object is guaranteed to not change during the object's lifetime. So if an implementation moves objects around (e.g. Jython), then it cannot use memory addresses for the id() function.

> I think it much more likely that Jython uses integer values for the id() function, and not physical addresses.

The id() function is guaranteed to return some flavour of integer.

In Jython, the return values are 1, 2, 3, 4, etc., except, of course, if you invoke id() on an object you've id'd before, you get the same number as before.

In current versions of CPython, you do get the (virtual) memory address, converted to an int (or a long). But then, CPython does not move objects.

Maybe the next version of CPython should shift id values two or three bits to the right, just to make sure people don't misinterpret ids as memory addresses.

Hope this helps,

-- HansM
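The two guarantees mentioned above (the id is an integer, and it does not change during the object's lifetime) can be checked with a short sketch. The variable names are illustrative only.

```python
items = []
before = id(items)

items.append("changed")  # mutate the object in place; it is still the same object
after = id(items)

print(before == after)          # True
print(isinstance(before, int))  # True
```

Nothing here says what the integer means; only that it is stable while the list exists.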
Re: is implemented with id ?
On Wed, Sep 5, 2012 at 8:13 AM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:
> You *cannot* replace "is" with id() except when the objects are guaranteed to both be alive at the same time, and even then you *shouldn't* replace "is" with id() because that is a pessimation (the opposite of an optimization -- something that makes code run slower, not faster).

Shouldn't that be "pessimization" for symmetry?
Re: is implemented with id ?
On 09/05/2012 12:47 PM, Hans Mulder wrote:
> On 5/09/12 17:09:30, Dave Angel wrote:
>> But by claiming that id() really means address, and that those addresses might move during the lifetime of an object, then the fact that the id() functions are not called simultaneously implies that one object might move to where the other one used to be before the "move."
>
> Whoa! Not so fast! The id() of an object is guaranteed to not change during the object's lifetime. So if an implementation moves objects around (e.g. Jython), then it cannot use memory addresses for the id() function.

Which is equivalent to my point. I had mistakenly thought that Steven was claiming it was always an address, and was disproving that claim by what amounts to reductio ad absurdum.

>> I think it much more likely that Jython uses integer values for the id() function, and not physical addresses.
>
> The id() function is guaranteed to return some flavour of integer. In Jython, the return values are 1, 2, 3, 4, etc., except, of course, if you invoke id() on an object you've id'd before, you get the same number as before. In current versions of CPython, you do get the (virtual) memory address, converted to an int (or a long). But then, CPython does not move objects. Maybe the next version of CPython should shift id values two or three bits to the right, just to make sure people don't misinterpret ids as memory addresses.

I think I'd prefer if it would put it through one step of a CRC32.

-- DaveA
Re: is implemented with id ?
On 9/5/2012 8:48 AM, Ramchandra Apte wrote:
> Seeing this thread, I think the "is" statement should be removed. It has a replacement syntax of id(x) == id(y)

The thread is wrong then. If the implementation reuses ids, which CPython does,

    expression-1 is expression-2

must be implemented as

    internal-tem1 = expression-1
    internal-tem2 = expression-2
    id(internal-tem1) == id(internal-tem2)

in order to ensure that the two objects exist simultaneously, so that the id comparison is valid.

> and a==True should be automatically changed into memory comparison.

I have no idea what that means.

-- Terry Jan Reedy
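The temporaries above can be written as a small runnable sketch. The helper same_object and its parameters are hypothetical names; the two expressions are passed as callables so that the helper controls when each one is evaluated.

```python
def same_object(make_first, make_second):
    """Evaluate both callables, keep both results alive, then compare ids."""
    first = make_first()    # plays the role of internal-tem1
    second = make_second()  # plays the role of internal-tem2
    # Both results are still referenced here, so equal ids mean one object.
    return id(first) == id(second)


shared = object()
print(same_object(lambda: shared, lambda: shared))                 # True
print(same_object(lambda: "spam ham"[1:], lambda: "foo bar"[1:]))  # False
```

Because both results are bound to local names before the comparison, the second call cannot report a recycled id as a match. In practice "first is second" does the same job directly.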
Re: is implemented with id ?
On 9/5/2012 10:41 AM, Steven D'Aprano wrote:
> True. In principle, some day there might be a version of Python that runs on some exotic quantum computer where the very concept of "physical address" is meaningless.

You mean like the human brain? When people execute Python code, does 0 have an address?

-- Terry Jan Reedy
Re: is implemented with id ?
On 09/05/2012 02:27 PM, Terry Reedy wrote:
> On 9/5/2012 8:48 AM, Ramchandra Apte wrote:
>> Seeing this thread, I think the "is" statement should be removed. It has a replacement syntax of id(x) == id(y)
>
> The thread is wrong then. If the implementation reuses ids, which CPython does,
>
>     expression-1 is expression-2
>
> must be implemented as
>
>     internal-tem1 = expression-1
>     internal-tem2 = expression-2
>     id(internal-tem1) == id(internal-tem2)
>
> in order to ensure that the two objects exist simultaneously, so that the id comparison is valid.
>
>> and a==True should be automatically changed into memory comparison.
>
> I have no idea what that means.

It's probably a response to Steve's comment:

    In general, you almost never need to care about IDs and object identity. The main exception is testing for None, which should always be written as:

        if x is None

Somehow he substituted True for None. Anyway, if one eliminates "is" then Steve's comment wouldn't apply.

-- DaveA
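One reason the None test is written with "is" can be shown with a short sketch. The class AlwaysEqual is a made-up example of an object whose == cannot be trusted for this purpose.

```python
class AlwaysEqual:
    """Hypothetical class whose == claims equality with everything."""

    def __eq__(self, other):
        return True


x = AlwaysEqual()

print(x == None)  # True: the result is whatever __eq__ chooses to return
print(x is None)  # False: identity cannot be overridden by the class
```

Since "is" ignores __eq__, the identity test gives the right answer no matter how the class defines equality.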