Terry J. Reedy tjre...@udel.edu added the comment:
Chema, thank you for bringing this semi-conscious assumption to light.
Intentionally breaking it significantly speeds up set and dict creation.
--
nosy: +tjreedy
resolution: -> invalid
status: open -> closed
superseder: -> Reduce hash
Jesús Cea Avión j...@jcea.es added the comment:
Mark, please post the bug ID for the hash >> 3 issue. It is interesting
enough to pursue.
--
resolution: invalid ->
status: closed -> open
___
Python tracker rep...@bugs.python.org
Mark Dickinson dicki...@gmail.com added the comment:
See issue 5186 for using id()/8 for the hash.
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5169
___
Raymond Hettinger rhettin...@users.sourceforge.net added the comment:
Tim, any thoughts?
--
assignee: -> tim_one
nosy: +tim_one
New submission from Chema Cortés dev.xt...@gmail.com:
Sometimes, the default hash for a user-defined object is not equal to the
id of the object:
In [1]: class A:
   ...:     pass
In [2]: a=A()
In [3]: id(a),hash(a)
Out[3]: (3082955212L, -1212012084)
The test box has an AMD Sempron, a 64-bit CPU.
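The two numbers in the session above differ by exactly 2**32, which is what one would expect if the object's address were reinterpreted as a signed 32-bit integer. The following is only a minimal sketch of that arithmetic, assuming a 32-bit C long; the helper name as_signed is hypothetical and not part of Python or CPython.

```python
def as_signed(value, bits=32):
    """Reinterpret an unsigned integer as a two's-complement signed one."""
    mask = (1 << bits) - 1
    value &= mask
    # Values with the top bit set represent negative numbers.
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


obj_id = 3082955212        # id(a) from the session above
print(as_signed(obj_id))   # -1212012084, the hash(a) reported above
```

Any address above the 2GB mark has its top bit set in 32 bits, so it would show up as a negative value under this reading.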
Antoine Pitrou pit...@free.fr added the comment:
I wouldn't qualify this as a bug. hash() doesn't need to be equal to the
id() even in the default case.
Actually, it may be better for hash() to be equal to id()/4 or id()/8,
depending on the standard alignment of the memory allocator.
--
Mark Dickinson dicki...@gmail.com added the comment:
It looks like this is a platform with sizeof(long) == 4 and sizeof(void *)
== 8. Is that right? As Antoine says, I can't see any problem here. Why
do you think that hash(a) should be equal to id(a) in this case?
Antoine, in what way
Antoine Pitrou pit...@free.fr added the comment:
Because with hash() == id() == address of the PyObject, the hash is
always a multiple of 4 or 8 (I think it's 8), so (hash() %
dict_or_set_table_size) is non-uniformly distributed in most cases.
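The effect described here can be illustrated with a rough sketch. It assumes simulated addresses aligned to 8 bytes and models only the first probe as hash % table_size; the real dict and set implementations probe further on collision, so this is not a measurement of their behaviour. The function name initial_slots and the chosen base address are made up for the illustration.

```python
def initial_slots(hashes, table_size):
    """Return the set of first-probe slots, modelled as hash % table_size."""
    return {h % table_size for h in hashes}


# Simulated object addresses: 1000 values aligned to 8 bytes (an assumption).
addresses = [0x10000 + 8 * i for i in range(1000)]
table_size = 2048  # power of two, large enough to hold all 1000 keys

raw = initial_slots(addresses, table_size)
shifted = initial_slots([a >> 3 for a in addresses], table_size)

# With raw addresses only every eighth slot can ever be chosen first.
print(len(raw), len(shifted))  # 256 1000
```

Under these assumptions the raw addresses land on only 256 distinct first slots, while the shifted values land on 1000, one per key.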
Mark Dickinson dicki...@gmail.com added the comment:
Hah. Good point. I'd forgotten about the taking-the-low-order-bits
thing. Should probably do some timings (and possibly also
number-of-collisions measurements) to find out whether using id() >> 3
actually makes any significant difference
Mark Dickinson dicki...@gmail.com added the comment:
Some preliminary timings indicate that it may well be worth replacing
'return (long)p' with 'return (long)p >> 3' in _Py_HashPointer (in
Objects/object.c): I'm getting a 10% speedup in dict-building and
dict-lookup for dicts of plain
Jesús Cea Avión j...@jcea.es added the comment:
The issue is trivially reproducible on any 32-bit platform, simply by
allocating objects until you cross the 2GB mark.
Since __hash__() wants to take advantage of every bit in a 32 bit
platform, and we don't have unsigned integers in python, I vote
Chema Cortés dev.xt...@gmail.com added the comment:
I also agree to close this bug as invalid. Indeed, there is no reason
for id(a) and hash(a) to be equal, apart from the description of
hashable objects in the documentation (but that is a different
issue).
'hash' and 'id' returns the