Re: is implemented with id ?

2012-11-04 Thread 88888 Dihedral
On Wednesday, September 5, 2012 10:41:19 PM UTC+8, Steven D'Aprano wrote:
 On Wed, 05 Sep 2012 10:00:09 -0400, Dave Angel wrote:

  On 09/05/2012 09:19 AM, Franck Ditter wrote:

   Thanks to all, but :
   - I should have said that I work with Python 3. Does that matter ?
   - May I reformulate the question : a is b and id(a) == id(b)
     both mean : a and b share the same physical address. Is that True ?
   Thanks,

  No, id() has nothing to do with physical address.  The Python language
  does not specify anything about physical addresses.  Some
  implementations may happen to use physical addresses, others arbitrary
  integers.  And they may reuse such integers, or not.  Up to the
  implementation.

 True. In principle, some day there might be a version of Python that runs
 on some exotic quantum computer where the very concept of physical
 address is meaningless. Or some sort of peptide or DNA computer, where
 the calculations are performed via molecular interactions rather than by
 flipping bits in fixed memory locations.

 But less exotically, Franck isn't entirely wrong. With current day
 computers, it is reasonable to say that any object has exactly one
 physical location at any time. In Jython, objects can move around; in
 CPython, they can't. But at any moment, any object has a specific
 location, and no other object can have that same location. Two objects
 cannot both be at the same memory address at the same time.

 So, for current day computers at least, it is reasonable to say that
 "a is b" implies that a and b are the same object at a single location.

 The second half of the question is more complex:

 "id(a) == id(b)" *only* implies that a and b are the same object at the
 same location if they exist at the same time. If they don't exist at the
 same time, then you can't conclude anything.

 --
 Steven

The function id(x) might not be implemented 
as an address in the user space. 

Do we need to distinguish archived objects from
objects in memory?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: is implemented with id ?

2012-11-04 Thread Hans Mulder
On 4/11/12 06:09:24, Aahz wrote:
 In article mailman.3250.1351999198.27098.python-l...@python.org,
 Chris Angelico  ros...@gmail.com wrote:
 On Sun, Nov 4, 2012 at 2:10 PM, Steven D'Aprano
 steve+comp.lang.pyt...@pearwood.info wrote:

     /* Shortcut for empty or interned objects */
     if (v == u) {
         Py_DECREF(u);
         Py_DECREF(v);
         return 0;
     }
     result = unicode_compare(u, v);

 where v and u are pointers to the unicode object.

 There's a shortcut if they're the same. There's no shortcut if they're
 both interned and have different pointers, which is a guarantee that
 they're distinct strings. They'll still be compared char-for-char
 until there's a difference.
 
 Without looking at the code, I'm pretty sure there's a hash check first.

In 3.3, there is no such check.

It was recently proposed on python-dev to add such a check,
but AFAIK, no action was taken.

-- HansM




Re: is implemented with id ?

2012-11-03 Thread Aahz
[got some free time, catching up to threads two months old]

In article 50475822$0$6867$e4fe5...@news2.news.xs4all.nl,
Hans Mulder  han...@xs4all.nl wrote:
On 5/09/12 15:19:47, Franck Ditter wrote:

 - I should have said that I work with Python 3. Does that matter ?
 - May I reformulate the question : a is b and id(a) == id(b)
   both mean : a and b share the same physical address. Is that True ?

Yes.

Keep in mind, though, that in some implementations (e.g.  Jython), the
physical address may change during the lifetime of an object.

It's usually phrased as "a and b are the same object".  If the object
is mutable, then changing a will also change b.  If a and b aren't
mutable, then it doesn't really matter whether they share a physical
address.

That last sentence is not quite true.  intern() is used to ensure that
strings share a physical address to save memory.
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

Normal is what cuts off your sixth finger and your tail...  --Siobhan


Re: is implemented with id ?

2012-11-03 Thread Hans Mulder
On 3/11/12 20:41:28, Aahz wrote:
 [got some free time, catching up to threads two months old]
 
 In article 50475822$0$6867$e4fe5...@news2.news.xs4all.nl,
 Hans Mulder  han...@xs4all.nl wrote:
 On 5/09/12 15:19:47, Franck Ditter wrote:

 - I should have said that I work with Python 3. Does that matter ?
 - May I reformulate the question : a is b and id(a) == id(b)
   both mean : a and b share the same physical address. Is that True ?

 Yes.

 Keep in mind, though, that in some implementations (e.g.  Jython), the
 physical address may change during the lifetime of an object.

 It's usually phrased as "a and b are the same object".  If the object
 is mutable, then changing a will also change b.  If a and b aren't
 mutable, then it doesn't really matter whether they share a physical
 address.
 
 That last sentence is not quite true.  intern() is used to ensure that
 strings share a physical address to save memory.

That's a matter of perspective: in my book, the primary advantage of
working with interned strings is that I can use 'is' rather than '=='
to test for equality if I know my strings are interned.  The space
savings are minor; the time savings may be significant.

-- HansM


Re: is implemented with id ?

2012-11-03 Thread Steven D'Aprano
On Sat, 03 Nov 2012 22:49:07 +0100, Hans Mulder wrote:

 On 3/11/12 20:41:28, Aahz wrote:
 [got some free time, catching up to threads two months old]
 
 In article 50475822$0$6867$e4fe5...@news2.news.xs4all.nl, Hans Mulder
  han...@xs4all.nl wrote:
 On 5/09/12 15:19:47, Franck Ditter wrote:

 - I should have said that I work with Python 3. Does that matter ?
 - May I reformulate the question : a is b and id(a) == id(b)
   both mean : a and b share the same physical address. Is that True ?

 Yes.

 Keep in mind, though, that in some implementations (e.g.  Jython), the
 physical address may change during the lifetime of an object.

 It's usually phrased as "a and b are the same object".  If the object
 is mutable, then changing a will also change b.  If a and b aren't
 mutable, then it doesn't really matter whether they share a physical
 address.
 
 That last sentence is not quite true.  intern() is used to ensure that
 strings share a physical address to save memory.
 
 That's a matter of perspective: in my book, the primary advantage of
 working with interned strings is that I can use 'is' rather than '==' to
 test for equality if I know my strings are interned.  The space savings
 are minor; the time savings may be significant.

Actually, for many applications, the space savings may actually be 
*costs*, since interning forces Python to hold onto strings even after 
they would normally be garbage collected. CPython interns strings that 
look like identifiers. It really wouldn't be a good idea for it to 
automatically intern every string.

You can make your own intern system with a simple dict:

interned_strings = {}

Then, for every string you care about, do:

s = interned_strings.setdefault(s, s)

to ensure you are always working with a single string object for each 
unique value. In some applications that will save time at the expense of 
space.
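
A minimal sketch of the dict-based scheme described above, assuming nothing beyond the real dict.setdefault method. The helper name intern_string is made up for illustration, and the two strings are built at run time so that they start out as separate objects with the same value.

```python
# Hypothetical helper; only dict.setdefault is a real API here.
interned_strings = {}

def intern_string(s):
    """Return the canonical object for the value of s, storing s if it is new."""
    return interned_strings.setdefault(s, s)

# Build two equal strings at run time rather than from one shared literal.
first = "".join(["spam", " ", "ham"])
second = "".join(["spam", " ", "ham"])

a = intern_string(first)
b = intern_string(second)

print(a == b)  # True: equal values
print(a is b)  # True: both names now refer to the object stored first
```

The dict keeps a reference to every string it has seen, so entries stay alive until they are removed from interned_strings explicitly.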

And there is no need to write "is" instead of "==", because string 
equality already optimizes the "strings are identical" case. By using "==", 
you don't get into bad habits, you defend against the odd un-interned 
string sneaking in, and you still have high speed equality tests.


-- 
Steven


Re: is implemented with id ?

2012-11-03 Thread Roy Smith
In article 50959154$0$6880$e4fe5...@news2.news.xs4all.nl,
 Hans Mulder han...@xs4all.nl wrote:

 That's a matter of perspective: in my book, the primary advantage of
 working with interned strings is that I can use 'is' rather than '=='
 to test for equality if I know my strings are interned.  The space
 savings are minor; the time savings may be significant.

Depending on your problem domain, the space savings may be considerable.


Re: is implemented with id ?

2012-11-03 Thread Chris Angelico
On Sun, Nov 4, 2012 at 9:18 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 On Sat, 03 Nov 2012 22:49:07 +0100, Hans Mulder wrote:
 Actually, for many applications, the space savings may actually be
 *costs*, since interning forces Python to hold onto strings even after
 they would normally be garbage collected. CPython interns strings that
 look like identifiers. It really wouldn't be a good idea for it to
 automatically intern every string.

I don't know about that.

/* This dictionary holds all interned unicode strings.  Note that references
   to strings in this dictionary are *not* counted in the string's ob_refcnt.
   When the interned string reaches a refcnt of 0 the string deallocation
   function will delete the reference from this dictionary.

   Another way to look at this is that to say that the actual reference
   count of a string is:  s->ob_refcnt + (s->state ? 2 : 0)
*/
static PyObject *interned;

Empirical testing (on a Linux 3.3a0 that I had lying around) showed
the process's memory usage drop, but I closed the terminal before
copying and pasting (oops). Attempting to recreate in IDLE on 3.2 on
Windows.

>>> a="$"*1024*1024*256    # Make $$$$$$ fast!
>>> import sys
>>> sys.getsizeof(a)       # Clearly this is a narrow build
536870942
>>> a="$"*1024*1024*256
-- MemoryError. Blah. This is what I get for only having a gig and a
half in this laptop. And I was working with 1024*1024*1024 on the
other box. Start over...

>>> import sys
>>> a="$"*1024*1024*128
>>> b="$"*1024*1024*128
>>> a is b
False
>>> a=sys.intern(a)
>>> b=sys.intern(b)
>>> c="$"*1024*1024*128
>>> c=sys.intern(c)

Memory usage (according to Task Mangler) goes up to ~512MB when I
create a new string (like c), then back down to ~256MB when I intern
it. So far so good.

>>> del a,b,c

Memory usage has dropped to 12MB. Unnecessarily-interned strings don't
cost anything. (The source does refer to immortal interned strings,
but AFAIK you can't create them in user-level code. At least, I didn't
find it in help(sys.intern) which is the obvious place to look.)
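
A small sketch of the same experiment without the memory measurements, assuming only the real sys.intern function. The strings are far smaller than the ones above and are built at run time; the variable names are illustrative.

```python
import sys

# Two separately built strings with the same value.
first = "".join(["spam"] * 50)
second = "".join(["spam"] * 50)

a = sys.intern(first)
b = sys.intern(second)

print(first == second)  # True: same value
print(a is b)           # True: sys.intern returned one shared object
```

Once second is no longer referenced, only one copy of the text remains, which matches the drop in memory usage reported above.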

 You can make your own intern system with a simple dict:

 interned_strings = {}

 Then, for every string you care about, do:

 s = interned_strings.setdefault(s, s)

 to ensure you are always working with a single string object for each
 unique value. In some applications that will save time at the expense of
 space.

Doing it manually like this _will_ leak like that, though, unless you
periodically check sys.getrefcount and dispose of unreferenced
entries.

 And there is no need to write "is" instead of "==", because string
 equality already optimizes the "strings are identical" case. By using "==",
 you don't get into bad habits, you defend against the odd un-interned
 string sneaking in, and you still have high speed equality tests.

This one I haven't checked the source for, but ISTR discussions on
this list about comparison of two unequal interned strings not being
optimized, so they'll end up being compared char-for-char. Using 'is'
guarantees that the check stops with identity. This may or may not be
significant, and as you say, defending against an uninterned string
slipping through is potentially critical.

ChrisA


Re: is implemented with id ?

2012-11-03 Thread Oscar Benjamin
On 3 November 2012 22:50, Chris Angelico ros...@gmail.com wrote:
 This one I haven't checked the source for, but ISTR discussions on
 this list about comparison of two unequal interned strings not being
 optimized, so they'll end up being compared char-for-char. Using 'is'
 guarantees that the check stops with identity. This may or may not be
 significant, and as you say, defending against an uninterned string
 slipping through is potentially critical.

The source is here (and it shows what you suggest):
http://hg.python.org/cpython/file/6c639a1ff53d/Objects/unicodeobject.c#l6128

Comparing strings char for char is really not that big a deal though.
This has been discussed before: you don't need to compare very many
characters to conclude that strings are unequal (if I remember
correctly you were part of that discussion).

I can imagine cases where I might consider using intern on lots of
strings to speed up comparisons but I would have to be involved in
some seriously heavy and obscure string processing problem before I
considered using 'is' to compare those interned strings. That is
confusing to anyone who reads the code, prone to bugs and unlikely to
achieve the desired outcome of speeding things up (noticeably).


Oscar


Re: is implemented with id ?

2012-11-03 Thread Chris Angelico
On Sun, Nov 4, 2012 at 12:14 PM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:
 On 3 November 2012 22:50, Chris Angelico ros...@gmail.com wrote:
 This one I haven't checked the source for, but ISTR discussions on
 this list about comparison of two unequal interned strings not being
 optimized, so they'll end up being compared char-for-char. Using 'is'
 guarantees that the check stops with identity. This may or may not be
 significant, and as you say, defending against an uninterned string
 slipping through is potentially critical.

 The source is here (and it shows what you suggest):
 http://hg.python.org/cpython/file/6c639a1ff53d/Objects/unicodeobject.c#l6128

 Comparing strings char for char is really not that big a deal though.
 This has been discussed before: you don't need to compare very many
 characters to conclude that strings are unequal (if I remember
 correctly you were part of that discussion).

Yes, and a quite wide-ranging discussion it was too! What color did we
end up whitewashing that bikeshed? *whistles innocently*

 I can imagine cases where I might consider using intern on lots of
 strings to speed up comparisons but I would have to be involved in
 some seriously heavy and obscure string processing problem before I
 considered using 'is' to compare those interned strings. That is
 confusing to anyone who reads the code, prone to bugs and unlikely to
 achieve the desired outcome of speeding things up (noticeably).

Good point. It's still true that 'is' will be faster, it's just not worth it.

ChrisA


Re: is implemented with id ?

2012-11-03 Thread Steven D'Aprano
On Sun, 04 Nov 2012 01:14:29 +0000, Oscar Benjamin wrote:

 On 3 November 2012 22:50, Chris Angelico ros...@gmail.com wrote:
 This one I haven't checked the source for, but ISTR discussions on this
 list about comparison of two unequal interned strings not being
 optimized, so they'll end up being compared char-for-char. Using 'is'
 guarantees that the check stops with identity. This may or may not be
 significant, and as you say, defending against an uninterned string
 slipping through is potentially critical.
 
 The source is here (and it shows what you suggest):
 http://hg.python.org/cpython/file/6c639a1ff53d/Objects/unicodeobject.c#l6128

I don't think it does, although I could be wrong; I find reading C to be 
quite difficult.

The unicode_compare function compares character by character, true, but 
it doesn't get called directly. The public interface is 
PyUnicode_Compare, which includes this test before calling 
unicode_compare:

    /* Shortcut for empty or interned objects */
    if (v == u) {
        Py_DECREF(u);
        Py_DECREF(v);
        return 0;
    }
    result = unicode_compare(u, v);

where v and u are pointers to the unicode object.

So it appears that the test for the strings being of equal length has been 
dropped, but the identity test is still present.

 Comparing strings char for char is really not that big a deal though.

Depends on how big the string and where the first difference is.

 This has been discussed before: you don't need to compare very many
 characters to conclude that strings are unequal (if I remember correctly
 you were part of that discussion).

On average. Worst case, you have to look at every character.



-- 
Steven


Re: is implemented with id ?

2012-11-03 Thread Chris Angelico
On Sun, Nov 4, 2012 at 2:10 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
     /* Shortcut for empty or interned objects */
     if (v == u) {
         Py_DECREF(u);
         Py_DECREF(v);
         return 0;
     }
     result = unicode_compare(u, v);

 where v and u are pointers to the unicode object.

There's a shortcut if they're the same. There's no shortcut if they're
both interned and have different pointers, which is a guarantee that
they're distinct strings. They'll still be compared char-for-char
until there's a difference.

But it probably isn't enough of a performance penalty to be concerned
with. It's enough to technically prove the point that 'is' is faster
than '==' and is still safe if both strings are interned; it's not
enough to make 'is' better than '==', except in very specific
situations.
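
To make the shape of that logic concrete, here is a rough Python model of it. This is a sketch under assumptions, not CPython's code: the function name compare_like_cpython is invented, and it only mirrors the two steps discussed here (identity shortcut first, then a character-by-character walk).

```python
def compare_like_cpython(u, v):
    """Return -1, 0 or 1; a toy model of the comparison discussed above."""
    if u is v:
        # Identity shortcut: the same object is trivially equal to itself.
        return 0
    # Different objects: walk both strings until the first difference,
    # even if both happen to be interned.
    for cu, cv in zip(u, v):
        if cu != cv:
            return -1 if cu < cv else 1
    # One string is a prefix of the other: the shorter one sorts first.
    return (len(u) > len(v)) - (len(u) < len(v))

s = "interned"
print(compare_like_cpython(s, s))          # 0, via the identity shortcut
print(compare_like_cpython("abc", "abd"))  # -1, found at the third character
print(compare_like_cpython("abcd", "abc")) # 1, "abc" is a prefix
```

Two unequal strings that differ only in their last character force the loop to visit every position, which is the worst case mentioned earlier in the thread.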

ChrisA


Re: is implemented with id ?

2012-11-03 Thread Aahz
In article mailman.3250.1351999198.27098.python-l...@python.org,
Chris Angelico  ros...@gmail.com wrote:
On Sun, Nov 4, 2012 at 2:10 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:

     /* Shortcut for empty or interned objects */
     if (v == u) {
         Py_DECREF(u);
         Py_DECREF(v);
         return 0;
     }
     result = unicode_compare(u, v);

 where v and u are pointers to the unicode object.

There's a shortcut if they're the same. There's no shortcut if they're
both interned and have different pointers, which is a guarantee that
they're distinct strings. They'll still be compared char-for-char
until there's a difference.

Without looking at the code, I'm pretty sure there's a hash check first.
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

Normal is what cuts off your sixth finger and your tail...  --Siobhan


Re: is implemented with id ?

2012-11-03 Thread Aahz
In article 50959827$0$29967$c3e8da3$54964...@news.astraweb.com,
Steven D'Aprano  steve+comp.lang.pyt...@pearwood.info wrote:

Actually, for many applications, the space savings may actually be 
*costs*, since interning forces Python to hold onto strings even after 
they would normally be garbage collected. 

That's old news, fixed in 2.5 or 2.6 IIRC -- interned strings now get
collected by refcounting like everything else.
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

Normal is what cuts off your sixth finger and your tail...  --Siobhan


Re: is implemented with id ?

2012-11-03 Thread Aahz
In article 50959154$0$6880$e4fe5...@news2.news.xs4all.nl,
Hans Mulder  han...@xs4all.nl wrote:
On 3/11/12 20:41:28, Aahz wrote:
 In article 50475822$0$6867$e4fe5...@news2.news.xs4all.nl,
 Hans Mulder  han...@xs4all.nl wrote:
 On 5/09/12 15:19:47, Franck Ditter wrote:

 - I should have said that I work with Python 3. Does that matter ?
 - May I reformulate the question : a is b and id(a) == id(b)
   both mean : a and b share the same physical address. Is that True ?

 Yes.

 Keep in mind, though, that in some implementations (e.g.  Jython), the
 physical address may change during the lifetime of an object.

 It's usually phrased as "a and b are the same object".  If the object
 is mutable, then changing a will also change b.  If a and b aren't
 mutable, then it doesn't really matter whether they share a physical
 address.
 
 That last sentence is not quite true.  intern() is used to ensure that
 strings share a physical address to save memory.

That's a matter of perspective: in my book, the primary advantage of
working with interned strings is that I can use 'is' rather than '=='
to test for equality if I know my strings are interned.  The space
savings are minor; the time savings may be significant.

As others have pointed out, using ``is`` with strings is a Bad Habit
likely leading to nasty, hard-to-find bugs.

intern() costs time, but saves considerable space in any application
with lots of duplicate computed strings (hundreds of megabytes in some
cases).
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

Normal is what cuts off your sixth finger and your tail...  --Siobhan


Re: is implemented with id ?

2012-09-06 Thread Steven D'Aprano
On Wed, 05 Sep 2012 14:27:44 -0400, Terry Reedy wrote:

 On 9/5/2012 8:48 AM, Ramchandra Apte wrote:
 
   and a==True should be automatically changed into memory comparison.
 
 I have no idea what that means.

I interpret this as meaning that "a == True" should be special-cased by 
the interpreter as "a is True" instead of calling a.__eq__.



-- 
Steven


Re: is implemented with id ?

2012-09-06 Thread Ramchandra Apte
On Thursday, 6 September 2012 12:14:19 UTC+5:30, Steven D'Aprano  wrote:
 On Wed, 05 Sep 2012 14:27:44 -0400, Terry Reedy wrote:

  On 9/5/2012 8:48 AM, Ramchandra Apte wrote:

   and a==True should be automatically changed into memory comparison.

  I have no idea what that means.

 I interpret this as meaning that "a == True" should be special-cased by
 the interpreter as "a is True" instead of calling a.__eq__.

 --
 Steven

Steven you are right.


Re: is implemented with id ?

2012-09-06 Thread Ramchandra Apte
On Wednesday, 5 September 2012 19:43:30 UTC+5:30, Steven D'Aprano  wrote:
 On Wed, 05 Sep 2012 05:48:26 -0700, Ramchandra Apte wrote:

  Seeing this thread, I think the is statement should be removed. It has a
  replacement syntax of id(x) == id(y)

 A terrible idea.

 Because "is" is a keyword, it is implemented as a fast object comparison
 directly in C (for CPython) or Java (for Jython). In the C implementation
 "x is y" is *extremely* fast because it is just a pointer comparison
 performed directly by the interpreter.

 Because id() is a function, it is much slower. And because it is not a
 keyword, Python needs to do a name look-up for it, then push the argument
 on the stack, call the function (which may not even be the built-in id()
 any more!) and then pop back to the caller.

 And worst, *it doesn't even do what you think it does*. In some Python
 implementations, IDs can be reused. That leads to code like this, from
 CPython 2.7:

 py> id("spam ham"[1:]) == id("foo bar"[1:])
 True

 You *cannot* replace "is" with id() except when the objects are guaranteed
 to both be alive at the same time, and even then you *shouldn't* replace
 "is" with id() because that is a pessimation (the opposite of an
 optimization -- something that makes code run slower, not faster).

  and a==True should be automatically changed into memory comparison.

 Absolutely not. That would be a backward-incompatible change that would
 break existing programs:

 py> 1.0 == True
 True
 py> from decimal import Decimal
 py> Decimal(1.) == True
 True

 --
 Steven
the is statement could be made into a function


Re: is implemented with id ?

2012-09-06 Thread Chris Angelico
On Thu, Sep 6, 2012 at 6:26 PM, Ramchandra Apte maniandra...@gmail.com wrote:
 the is statement could be made into a function

It's not a statement, it's an operator; and functions have far more
overhead than direct operators. There's little benefit in making 'is'
into a function, and high cost; unlike 'print', whose cost is
dominated by the cost of producing output to a console or similar
device, 'is' would be dominated by the cost of name lookups and
function call overhead.
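
One way to see the overhead for yourself is a quick timeit run. This is only a sketch: the variable names are illustrative, the absolute numbers depend on the machine and the Python version, and no particular ratio is claimed here.

```python
import timeit

a = [1, 2, 3]
b = a  # a second name for the same list

names = {"a": a, "b": b}

# Time the operator and the two-function-call spelling of the same test.
t_is = timeit.timeit("a is b", globals=names, number=200_000)
t_id = timeit.timeit("id(a) == id(b)", globals=names, number=200_000)

print("a is b         :", t_is)
print("id(a) == id(b) :", t_id)
```

Both expressions give the same answer here because a and b are alive for the whole measurement; only the cost of getting that answer differs.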

ChrisA


Re: is implemented with id ?

2012-09-06 Thread Duncan Booth
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:

 But less exotically, Franck isn't entirely wrong. With current day 
 computers, it is reasonable to say that any object has exactly one 
 physical location at any time. In Jython, objects can move around; in 
 CPython, they can't. But at any moment, any object has a specific 
 location, and no other object can have that same location. Two objects 
 cannot both be at the same memory address at the same time.
 

It is however perfectly possible for one object to be at two or more memory 
addresses at the same time.

In fact some work being done in PyPy right now is doing exactly that as 
part of Armin Rigo's software transactional memory implementation: when a 
global object is mutated a new copy is made and some threads may see the 
new version while other threads continue to see the old version until their 
transactions are committed (or aborted). This means that global objects can 
be safely read from multiple threads without any semaphore locking.

See http://mail.python.org/pipermail/pypy-dev/2012-September/010513.html

-- 
Duncan Booth http://kupuguy.blogspot.com


Re: is implemented with id ?

2012-09-06 Thread Chris Angelico
On Thu, Sep 6, 2012 at 7:34 PM, Duncan Booth
duncan.booth@invalid.invalid wrote:
 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:

 But at any moment, any object has a specific
 location, and no other object can have that same location. Two objects
 cannot both be at the same memory address at the same time.


 It is however perfectly possible for one object to be at two or more memory
 addresses at the same time.

And of course, memory addresses have to be taken as per-process, since
it's entirely possible for two processes to reuse addresses. But I
think all these considerations of object identity are made with the
assumption that we're working within a single Python process.

ChrisA


Re: is implemented with id ?

2012-09-06 Thread Roy Smith
In article 50484643$0$29977$c3e8da3$54964...@news.astraweb.com,
 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:

 On Wed, 05 Sep 2012 14:27:44 -0400, Terry Reedy wrote:
 
  On 9/5/2012 8:48 AM, Ramchandra Apte wrote:
  
and a==True should be automatically changed into memory comparison.
  
  I have no idea what that means.
 
 I interpret this as meaning that a == True should be special-cased by 
 the interpreter as a is True instead of calling a.__eq__.

That would break classes which provide their own __eq__() method.
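
A short illustration of the breakage, as a sketch only. The Flag class is hypothetical and exists just to show a user-defined __eq__; the numeric cases are the ones already mentioned in this thread.

```python
from decimal import Decimal

class Flag:
    """Hypothetical class that compares equal to True when its value is truthy."""
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return bool(self.value) == other

candidates = [1, 1.0, Decimal(1), Flag("yes")]

equal_to_true = [c == True for c in candidates]
identical_to_true = [c is True for c in candidates]

print(equal_to_true)      # [True, True, True, True]
print(identical_to_true)  # [False, False, False, False]
```

Rewriting "== True" as "is True" behind the programmer's back would flip every one of these results.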


Re: is implemented with id ?

2012-09-06 Thread Ramchandra Apte
On Thursday, 6 September 2012 17:46:38 UTC+5:30, Roy Smith  wrote:
 In article 50484643$0$29977$c3e8da3$54964...@news.astraweb.com,
  Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:

  On Wed, 05 Sep 2012 14:27:44 -0400, Terry Reedy wrote:

   On 9/5/2012 8:48 AM, Ramchandra Apte wrote:

    and a==True should be automatically changed into memory comparison.

   I have no idea what that means.

  I interpret this as meaning that "a == True" should be special-cased by
  the interpreter as "a is True" instead of calling a.__eq__.

 That would break classes which provide their own __eq__() method.

There is a way of doing this: make True.__req__ = lambda self, other: self is other


is implemented with id ?

2012-09-05 Thread Franck Ditter
Hi !
a is b <==> id(a) == id(b) in builtin classes.
Is that true ?
Thanks,

franck


Re: is implemented with id ?

2012-09-05 Thread Benjamin Kaplan
On Tue, Sep 4, 2012 at 11:30 PM, Franck Ditter fra...@ditter.org wrote:
 Hi !
 a is b <==> id(a) == id(b) in builtin classes.
 Is that true ?
 Thanks,

 franck

No. It is true that if a is b then id(a) == id(b) but the reverse is
not necessarily true. id is only guaranteed to be unique among objects
alive at the same time. If objects are discarded, their ids may be
reused even though the objects are not the same.


Re: is implemented with id ?

2012-09-05 Thread Steven D'Aprano
On Wed, 05 Sep 2012 08:30:31 +0200, Franck Ditter wrote:

 Hi !
 a is b <==> id(a) == id(b) in builtin classes. Is that true ?

Not just for builtin classes, for any objects, provided that they are 
alive at the same time.

There is no guarantee whether IDs will be re-used. Some versions of 
Python do re-use IDs, e.g. CPython:

steve@runes:~$ python
Python 2.6.6 (r266:84292, Dec 27 2010, 00:02:40) 
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = ["some", "object"]
>>> id(a)
3074285228L
>>> del a
>>> b = [100, 200]
>>> id(b)
3074285228L

but others do not, e.g. Jython and IronPython:

steve@runes:~$ jython
Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19) 
[OpenJDK Client VM (Sun Microsystems Inc.)] on java1.6.0_18
Type "help", "copyright", "credits" or "license" for more information.
>>> a = ["some", "object"]
>>> id(a)
1
>>> del a
>>> b = [100, 200]
>>> id(b)
2


steve@runes:~$ ipy
IronPython 2.6 Beta 2 DEBUG (2.6.0.20) on .NET 2.0.50727.1433
Type "help", "copyright", "credits" or "license" for more information.
>>> a = ["some", "object"]
>>> id(a)
43
>>> del a
>>> b = [100, 200]
>>> id(b)
44


CPython especially has the most complicated behaviour with IDs and object 
identity: 

>>> a = 99.99
>>> b = 99.99
>>> a is b
False
>>> a = 99.99; b = 99.99; a is b
True


In general, you almost never need to care about IDs and object identity. 
The main exception is testing for None, which should always be written as:

if x is None
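
A sketch of why the identity test is the safe spelling for None. The AlwaysEqual class is hypothetical; it stands in for any object whose __eq__ answers True too eagerly.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""
    def __eq__(self, other):
        return True

x = AlwaysEqual()

print(x == None)  # True: the answer comes from x.__eq__, and it is misleading
print(x is None)  # False: identity cannot be overridden by the class
```

Because "is" never calls into the object, the None test behaves the same no matter what class x belongs to.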


-- 
Steven


Re: is implemented with id ?

2012-09-05 Thread Ramchandra Apte
On Wednesday, 5 September 2012 14:44:23 UTC+5:30, Steven D'Aprano  wrote:
 On Wed, 05 Sep 2012 08:30:31 +0200, Franck Ditter wrote:

  Hi !
  a is b <==> id(a) == id(b) in builtin classes. Is that true ?

 Not just for builtin classes, for any objects, provided that they are
 alive at the same time.

Seeing this thread, I think the is statement should be removed.
It has a replacement syntax of id(x) == id(y) and a==True should be 
automatically changed into memory comparison.

 There is no guarantee whether IDs will be re-used. Some versions of
 Python do re-use IDs, e.g. CPython:

 steve@runes:~$ python
 Python 2.6.6 (r266:84292, Dec 27 2010, 00:02:40)
 [GCC 4.4.5] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> a = ["some", "object"]
 >>> id(a)
 3074285228L
 >>> del a
 >>> b = [100, 200]
 >>> id(b)
 3074285228L

 but others do not, e.g. Jython and IronPython:

 steve@runes:~$ jython
 Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19)
 [OpenJDK Client VM (Sun Microsystems Inc.)] on java1.6.0_18
 Type "help", "copyright", "credits" or "license" for more information.
 >>> a = ["some", "object"]
 >>> id(a)
 1
 >>> del a
 >>> b = [100, 200]
 >>> id(b)
 2

 steve@runes:~$ ipy
 IronPython 2.6 Beta 2 DEBUG (2.6.0.20) on .NET 2.0.50727.1433
 Type "help", "copyright", "credits" or "license" for more information.
 >>> a = ["some", "object"]
 >>> id(a)
 43
 >>> del a
 >>> b = [100, 200]
 >>> id(b)
 44

 CPython especially has the most complicated behaviour with IDs and object
 identity:

 >>> a = 99.99
 >>> b = 99.99
 >>> a is b
 False
 >>> a = 99.99; b = 99.99; a is b
 True

 In general, you almost never need to care about IDs and object identity.
 The main exception is testing for None, which should always be written as:

 if x is None

 --
 Steven



Re: is implemented with id ?

2012-09-05 Thread Franck Ditter
Thanks to all, but :
- I should have said that I work with Python 3. Does that matter ?
- May I reformulate the question : a is b and id(a) == id(b)
  both mean : a and b share the same physical address. Is that True ?
Thanks,

franck

In article mailman.213.1346827305.27098.python-l...@python.org,
 Benjamin Kaplan benjamin.kap...@case.edu wrote:

 On Tue, Sep 4, 2012 at 11:30 PM, Franck Ditter fra...@ditter.org wrote:
  Hi !
  a is b == id(a) == id(b) in builtin classes.
  Is that true ?
  Thanks,
 
  franck
 
 No. It is true that if a is b then id(a) == id(b) but the reverse is
 not necessarily true. id is only guaranteed to be unique among objects
 alive at the same time. If objects are discarded, their ids may be
 reused even though the objects are not the same.


Re: is implemented with id ?

2012-09-05 Thread Dave Angel
On 09/05/2012 08:48 AM, Ramchandra Apte wrote:
 On Wednesday, 5 September 2012 14:44:23 UTC+5:30, Steven D'Aprano  wrote:
 <snip>

 Seeing this thread, I think the "is" statement should be removed.
 It has a replacement syntax of id(x) == id(y), and a==True should be
 automatically changed into memory comparison.

You didn't read the whole message carefully enough.  Ids can be reused,
so there are many ways to mess up comparing ids.  One is if the two
items x and y are expressions (e.g. function calls).  Say you call a
function and it returns a new object; you call id() on that object,
and then the object gets discarded.  You now have a stale id, and you
haven't even evaluated the second expression yet.

It's id() which is superfluous.  But it's useful for debugging, and for
understanding.
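
A minimal sketch of the stale-id problem described above. The helper make() is hypothetical; whether the two ids in the first comparison actually coincide depends on the implementation and on what the allocator happens to do, so no particular result should be relied on.

```python
def make():
    # Hypothetical helper: returns a brand-new list on every call.
    return [1, 2, 3]


# Unsafe: the first temporary may be discarded before the second call runs,
# so the two ids are allowed to coincide even though the objects differ.
maybe_equal = id(make()) == id(make())

# Safe: both objects are bound to names and therefore alive together,
# so their ids are guaranteed to differ.
first, second = make(), make()
definitely_distinct = id(first) != id(second)

print(maybe_equal, definitely_distinct)
```

Only the second comparison says anything reliable about object identity.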



-- 

DaveA



Re: is implemented with id ?

2012-09-05 Thread Hans Mulder
On 5/09/12 15:19:47, Franck Ditter wrote:
 Thanks to all, but :
 - I should have said that I work with Python 3. Does that matter ?
 - May I reformulate the question : a is b and id(a) == id(b)
   both mean : a and b share the same physical address. Is that True ?

Yes.

Keep in mind, though, that in some implementations (e.g.
Jython), the physical address may change during the lifetime
of an object.

It's usually phrased as "a and b are the same object".
If the object is mutable, then changing a will also change b.
If a and b aren't mutable, then it doesn't really matter
whether they share a physical address.
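
The point about mutable objects can be illustrated with a short sketch (the variable names are chosen only for the example):

```python
a = [1, 2, 3]
b = a            # b is bound to the same list object as a
c = list(a)      # c is an equal but distinct copy

a.append(4)      # mutate the object through the name a

print(b)         # [1, 2, 3, 4]  (same object, so the change is visible)
print(c)         # [1, 2, 3]     (separate object, unaffected)
print(a is b, a is c)
```

For immutable values such as small tuples or strings there is no operation that could make the difference observable, which is why sharing rarely matters for them.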

Keep in mind that physical addresses can be reused when an
object is destroyed.  For example, in my Python 3,

import math
id(math.sqrt(17)) == id(math.cos(17))

returns True, even though the floats involved are different,
because the floats have non-overlapping lifetimes and the
physical address happens to be reused.


Hope this helps,

-- HansM


Re: is implemented with id ?

2012-09-05 Thread Dave Angel
Please don't top-post.  Now your message is out of order, and I have
to delete the part Benjamin said.

On 09/05/2012 09:19 AM, Franck Ditter wrote:
 Thanks to all, but :
 - I should have said that I work with Python 3. Does that matter ?
 - May I reformulate the question : a is b and id(a) == id(b)
   both mean : a and b share the same physical address. Is that True ?
 Thanks,

No, id() has nothing to do with physical address.  The Python language
does not specify anything about physical addresses.  Some
implementations may happen to use physical addresses, others arbitrary
integers.  And they may reuse such integers, or not.  Up to the
implementation.

And as others have pointed out, when you compare two ids, you're
risking that one of them may no longer be valid.  For example, the
following expression:

 flag = id(func1()) == id(func2())

could very well evaluate to True, even if func1() always returns a
string, and func2() always returns an int.  On the other hand, the "is"
expression compares the two results while both are still alive, so it
is True only if they are the same object.

If a and b are simple names, and not placeholders for arbitrary
expressions, then I  THINK  the following would be true:

"a is b"  and  "id(a) == id(b)"  both mean that the names a and b are
bound to the same object at the time the statement is executed.

-- 

DaveA



Re: is implemented with id ?

2012-09-05 Thread Steven D'Aprano
On Wed, 05 Sep 2012 05:48:26 -0700, Ramchandra Apte wrote:

 Seeing this thread, I think the "is" statement should be removed. It has a
 replacement syntax of id(x) == id(y)

A terrible idea.

Because "is" is a keyword, it is implemented as a fast object comparison
directly in C (for CPython) or Java (for Jython). In the C implementation
"x is y" is *extremely* fast because it is just a pointer comparison
performed directly by the interpreter.

Because id() is a function, it is much slower. And because it is not a
keyword, Python needs to do a name look-up for it, then push the argument
on the stack, call the function (which may not even be the built-in id()
any more!) and then pop back to the caller.

And worst, *it doesn't even do what you think it does*. In some Python
implementations, IDs can be reused. That leads to code like this, from
CPython 2.7:

py> id("spam ham"[1:]) == id("foo bar"[1:])
True

You *cannot* replace "is" with id() except when the objects are guaranteed
to both be alive at the same time, and even then you *shouldn't* replace
"is" with id() because that is a pessimation (the opposite of an
optimization -- something that makes code run slower, not faster).
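
The remark that id() "may not even be the built-in id() any more" can be sketched like this. The function always_zero is hypothetical and exists only to show that the name id can be rebound, whereas the "is" operator cannot.

```python
import builtins


def always_zero(obj):
    # Hypothetical stand-in that shadows the built-in id().
    return 0


x, y = object(), object()

id = always_zero                    # legal: id is an ordinary name, not a keyword
shadowed_result = id(x) == id(y)    # compares 0 with 0
id = builtins.id                    # restore the real built-in

identity_result = x is y            # unaffected by any rebinding

print(shadowed_result, identity_result)
```

Here the id-based comparison reports a match for two different objects, while "is" still gives the correct answer.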


 and a==True should be automatically changed into memory comparison.

Absolutely not. That would be a backward-incompatible change that would 
break existing programs:

py> 1.0 == True
True
py> from decimal import Decimal
py> Decimal(1.) == True
True
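
To make the difference concrete, here is a small sketch comparing equality with identity for a few values that equal True. The list of values is chosen for illustration only.

```python
from decimal import Decimal

values = [1, 1.0, Decimal("1"), True]

# For each value, record whether it equals True and whether it *is* True.
results = [(v == True, v is True) for v in values]

for v, (equal, identical) in zip(values, results):
    print(repr(v), equal, identical)
```

All four values compare equal to True, but only the last one is the True object itself, so turning == into an identity test would change the result for the first three.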



-- 
Steven


Re: is implemented with id ?

2012-09-05 Thread Steven D'Aprano
On Wed, 05 Sep 2012 10:00:09 -0400, Dave Angel wrote:

 On 09/05/2012 09:19 AM, Franck Ditter wrote:
 Thanks to all, but :
 - I should have said that I work with Python 3. Does that matter ?
 - May I reformulate the question : a is b and id(a) == id(b)
   both mean : a and b share the same physical address. Is that True ?
 Thanks,
 
 No, id() has nothing to do with physical address.  The Python language
 does not specify anything about physical addresses.  Some
 implementations may happen to use physical addresses, others arbitrary
 integers.  And they may reuse such integers, or not.  Up to the
 implementation.

True. In principle, some day there might be a version of Python that runs 
on some exotic quantum computer where the very concept of physical 
address is meaningless. Or some sort of peptide or DNA computer, where 
the calculations are performed via molecular interactions rather than by 
flipping bits in fixed memory locations.

But less exotically, Franck isn't entirely wrong. With current day 
computers, it is reasonable to say that any object has exactly one 
physical location at any time. In Jython, objects can move around; in 
CPython, they can't. But at any moment, any object has a specific 
location, and no other object can have that same location. Two objects 
cannot both be at the same memory address at the same time.

So, for current day computers at least, it is reasonable to say that 
"a is b" implies that a and b are the same object at a single location.

The second half of the question is more complex:

id(a) == id(b) *only* implies that a and b are the same object at the 
same location if they exist at the same time. If they don't exist at the 
same time, then you can't conclude anything.



-- 
Steven


Re: is implemented with id ?

2012-09-05 Thread Dave Angel
On 09/05/2012 10:41 AM, Steven D'Aprano wrote:
 On Wed, 05 Sep 2012 10:00:09 -0400, Dave Angel wrote:

 On 09/05/2012 09:19 AM, Franck Ditter wrote:
 Thanks to all, but :
 - I should have said that I work with Python 3. Does that matter ?
 - May I reformulate the question : a is b and id(a) == id(b)
   both mean : a and b share the same physical address. Is that True ?
 Thanks,
 No, id() has nothing to do with physical address.  The Python language
 does not specify anything about physical addresses.  Some
 implementations may happen to use physical addresses, others arbitrary
 integers.  And they may reuse such integers, or not.  Up to the
 implementation.
 True. In principle, some day there might be a version of Python that runs 
 on some exotic quantum computer where the very concept of physical 
 address is meaningless. Or some sort of peptide or DNA computer, where 
 the calculations are performed via molecular interactions rather than by 
 flipping bits in fixed memory locations.

 But less exotically, Franck isn't entirely wrong. With current day 
 computers, it is reasonable to say that any object has exactly one 
 physical location at any time. In Jython, objects can move around; in 
 CPython, they can't. But at any moment, any object has a specific 
 location, and no other object can have that same location. Two objects 
 cannot both be at the same memory address at the same time.

 So, for current day computers at least, it is reasonable to say that 
 a is b implies that a and b are the same object at a single location.

You're arguing against something I didn't say.  I only said that id()
doesn't promise to be a memory address.  I said nothing about what it
might mean if the "is" operator considers them the same.

 The second half of the question is more complex:

 id(a) == id(b) *only* implies that a and b are the same object at the 
 same location if they exist at the same time. If they don't exist at the 
 same time, then you can't conclude anything.


But by claiming that id() really means address, and that those addresses
might move during the lifetime of an object, then the fact that the id()
functions are not called simultaneously implies that one object might
move to where the other one used to be before the move.

I don't claim to know the jython implementation.  But you're claiming
that id() means the address of the object, even in jython.  So if a
garbage collection can occur during the evaluation of the expression
   id(a) == id(b)

then comparing ids would be useless in Jython.  Two distinct
objects could each be moved during evaluation, (very) coincidentally
causing the two to have the same addresses at the two times of
evaluation.  Or more likely, a single object could move to a new
location, rendering the comparison false.  Thus both false positives
and false negatives are possible.

I think it much more likely that Jython uses integer values for the id()
function, and not physical addresses.  I doubt they'd want a race condition.


-- 

DaveA



Re: is implemented with id ?

2012-09-05 Thread Steven D'Aprano
On Wed, 05 Sep 2012 11:09:30 -0400, Dave Angel wrote:

 On 09/05/2012 10:41 AM, Steven D'Aprano wrote:
[...]
 So, for current day computers at least, it is reasonable to say that a
 is b implies that a and b are the same object at a single location.
 
 You're arguing against something I didn't say. I only said that id()
 doesn't promise to be a memory address.  I said nothing about what it
 might mean if the "is" operator considers them the same.

I'm not arguing at all. I'm agreeing with you, but going into more detail.


 The second half of the question is more complex:

 id(a) == id(b) *only* implies that a and b are the same object at the
 same location if they exist at the same time. If they don't exist at
 the same time, then you can't conclude anything.


 But by claiming that id() really means address,

I didn't actually say that. If you re-read Franck Ditter's previous post, 
he doesn't actually say that either.


 and that those addresses
 might move during the lifetime of an object, then the fact that the id()
 functions are not called simultaneously implies that one object might
 move to where the other one used to be before the move.

Well, yes, but I expect that implementations where objects can move will 
not use memory addresses as IDs. They will do what Jython and IronPython 
do and use arbitrary numbers as IDs.

(Oh how I wish CPython hadn't used memory addresses as IDs.)


 I don't claim to know the jython implementation.  But you're claiming
 that id() means the address of the object, even in jython.

Good god no! I'm saying that, *if* a and b exist at the same time, *and* 
if id(a) == id(b), *then* a and b must be the same object and therefore 
at the same address. That doesn't mean that the ID is the address!


 I think it much more likely that jython uses integer values for the id()
 function, and not physical addresses.

That's exactly what it does. It appears to be a simple counter: each time 
you ask for an object's ID, it gets allocated the next value starting 
from 1, and values are never re-used.
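
A counter-based scheme of the kind described can be sketched in plain Python. This is only an illustration of the idea, not Jython's actual implementation; the class CounterIds and its method get are hypothetical names, and the sketch keeps every object alive, which a real implementation would presumably avoid.

```python
import itertools


class CounterIds:
    """Hypothetical registry handing out sequential ids on first request."""

    def __init__(self):
        self._next = itertools.count(1)
        self._seen = {}   # built-in id(obj) -> (obj, assigned number)

    def get(self, obj):
        key = id(obj)
        if key not in self._seen:
            # Storing obj keeps it alive, so its built-in id cannot be
            # recycled for another object while it is in the table.
            self._seen[key] = (obj, next(self._next))
        return self._seen[key][1]


reg = CounterIds()
a, b = ["some", "object"], [100, 200]
print(reg.get(a), reg.get(b), reg.get(a))
```

Asking again for an object that already has a number returns the same number, and a number is never handed to a second object.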



-- 
Steven


Re: is implemented with id ?

2012-09-05 Thread Tim Chase
On 09/05/12 08:46, Dave Angel wrote:
 It's id() which is superfluous.  But it's useful for debugging,
 and for understanding.

While I assiduously work to eschew shadowing most built-in names
such as "list" or "str", I do make an exception for "id" because
it's *so* useful in code, and the built-in nets me almost nothing
(well, nothing that I generally care about) that "is" doesn't
already provide me.

-tkc




Re: is implemented with id ?

2012-09-05 Thread Hans Mulder
On 5/09/12 17:09:30, Dave Angel wrote:
 But by claiming that id() really means address, and that those addresses
 might move during the lifetime of an object, then the fact that the id()
 functions are not called simultaneously implies that one object might
 move to where the other one used to be before the move.

Whoa!  Not so fast!  The id() of an object is guaranteed to not
change during the object's lifetime.  So if an implementation
moves objects around (e.g. Jython), then it cannot use memory
addresses for the id() function.
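
The lifetime guarantee can be checked with a short sketch (the names are chosen only for the example): the id of a list stays the same however much the list is mutated.

```python
items = []
before = id(items)

# Growing the list may reallocate its internal storage many times,
# but the list object itself keeps its identity.
for n in range(10_000):
    items.append(n)

after = id(items)
print(before == after)
```

This holds on any conforming implementation, whether or not the id happens to be derived from an address.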

 I think it much more likely that jython uses integer values for
 the id() function, and not physical addresses.

The id() function is guaranteed to return some flavour of integer.

In Jython, the return values are 1, 2, 3, 4, etc., except, of course,
if you invoke id() on an object you've id'd before, you get the same
number as before.

In current versions of CPython, you do get the (virtual) memory
address, converted to an int (or a long).  But then, CPython does
not move objects.

Maybe the next version of CPython should shift id values two or three
bits to the right, just to make sure people don't misinterpret ids as
memory addresses.


Hope this helps,

-- HansM


Re: is implemented with id ?

2012-09-05 Thread Ian Kelly
On Wed, Sep 5, 2012 at 8:13 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 You *cannot* replace is with id() except when the objects are guaranteed
 to both be alive at the same time, and even then you *shouldn't* replace
 is with id() because that is a pessimation (the opposite of an
 optimization -- something that makes code run slower, not faster).

Shouldn't that be "pessimization" for symmetry?


Re: is implemented with id ?

2012-09-05 Thread Dave Angel
On 09/05/2012 12:47 PM, Hans Mulder wrote:
 On 5/09/12 17:09:30, Dave Angel wrote:
 But by claiming that id() really means address, and that those addresses
 might move during the lifetime of an object, then the fact that the id()
 functions are not called simultaneously implies that one object might
 move to where the other one used to be before the move.
 Whoa!  Not so fast!  The id() of an object is guaranteed to not
 change during the object's lifetime.  So if an implementation
 moves objects around (e.g. Jython), then it cannot use memory
 addresses for the id() function.
Which is equivalent to my point.  I had mistakenly thought that Steven
was claiming it was always an address, and was disproving that claim by
what amounts to reductio ad absurdum.

 I think it much more likely that jython uses integer values for
 the id() function, and not physical addresses.
 The id() function is guaranteed to return some flavour of integer.

 In Jython, the return values are 1, 2, 3, 4, etc., except, of course,
 if you invoke id() on an object you've id'd before, you get the same
 number as before.

 In current versions of CPython, you do get the (virtual) memory
 address, converted to an int (or a long).  But then, CPython does
 not move objects.

 Maybe the next version of CPython should shift id values two or three
 bits to the right, just to make sure people don't misinterpret ids as
 memory addresses.



I think I'd prefer it if it put the value through one step of a CRC32.

-- 

DaveA



Re: is implemented with id ?

2012-09-05 Thread Terry Reedy

On 9/5/2012 8:48 AM, Ramchandra Apte wrote:


Seeing this thread, I think the "is" statement should be removed.
It has a replacement syntax of id(x) == id(y)


The thread is wrong then.

If the implementation reuses ids, which CPython does,
expression-1 is expression-2
must be implemented as

internal-tem1 = expression-1
internal-tem2 = expression-2
id(internal-tem1) == id(internal-tem2)

in order to ensure that the two objects exist simultaneously,
so that the id comparison is valid.
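
The three-step expansion above can be written as ordinary Python. The helper same_object and its parameter names are hypothetical; it takes two zero-argument callables so that the evaluation of each expression happens inside the function.

```python
def same_object(make_first, make_second):
    # Bind both results to names first, so that both objects are alive
    # at the moment their ids are compared.
    tem1 = make_first()
    tem2 = make_second()
    return id(tem1) == id(tem2)


shared = []
print(same_object(lambda: shared, lambda: shared))   # same list both times
print(same_object(lambda: [1], lambda: [1]))         # two distinct lists
```

Because both temporaries exist simultaneously, the id comparison here gives the same answer that "is" would.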

 and a==True should be automatically changed into memory comparison.

I have no idea what that means.

--
Terry Jan Reedy



Re: is implemented with id ?

2012-09-05 Thread Terry Reedy

On 9/5/2012 10:41 AM, Steven D'Aprano wrote:


True. In principle, some day there might be a version of Python that runs
on some exotic quantum computer where the very concept of physical
address is meaningless.


You mean like the human brain? When people execute Python code, does 0 
have an address?


--
Terry Jan Reedy



Re: is implemented with id ?

2012-09-05 Thread Dave Angel
On 09/05/2012 02:27 PM, Terry Reedy wrote:
 On 9/5/2012 8:48 AM, Ramchandra Apte wrote:

 Seeing this thread, I think the "is" statement should be removed.
 It has a replacement syntax of id(x) == id(y)

 The thread is wrong then.

 If the implementation reuses ids, which CPython does,
 expression-1 is expression-2
 must be implemented as

 internal-tem1 = expression-1
 internal-tem2 = expression-2
 id(internal-tem1) == id(internal-tem2)

 in order to ensure that the two objects exist simultaneously,
 so that the id comparison is valid.

  and a==True should be automatically changed into memory comparison.

 I have no idea what that means.


It's probably a response to Steve's comment



In general, you almost never need to care about IDs and object identity. 
The main exception is testing for None, which should always be written as:

if x is None


Somehow he substituted True for None.  Anyway, if one eliminates "is"
then Steve's comment wouldn't apply.

-- 

DaveA
