On 29 January 2010 16:54, Kent Johnson <ken...@tds.net> wrote: > On Fri, Jan 29, 2010 at 8:03 AM, spir <denis.s...@free.fr> wrote: > >> I recently discovered that Lua uses the data's address (read: id) as input >> to the hash func. This allows Lua tables (a kind of more versatile >> associative array) to use _anything_ as key, since the id is guaranteed not >> to change, per definition. >> [Aside this consideration, hashing addresses ensures a constant type as >> input (so allows a specific efficient hash method) and the simplest possible >> one.] > > This sounds like a bad idea to me. You generally want key lookup to be > based on value, not identity. For example, if I say > > d = dict() > d['key'] = value > > and later > print d['key'] > > I want this to print 'value' regardless of whether I use the same > instance of the string 'key' in both cases. > > Maybe Lua has some mechanism for ensuring that this can't happen? > > In general, for a hash table to work as expected, two keys that are > equal in value should have the same hash value. Here is an explanation > of why (for Java, but it pretty much applies to any hash table > implementation): > http://www.javaworld.com/javaqa/2002-06/01-qa-0621-hashtable.html > > Kent > _______________________________________________ > Tutor maillist - tu...@python.org > To unsubscribe or change subscription options: > http://mail.python.org/mailman/listinfo/tutor >
I've played with this a little. The following class was quite handy for this: class BrokenHash(object): def __init__(self, hashval): self.hashval = hashval def __hash__(self): return self.hashval It basically implements the python hashable interface, but in a manner that is "broken" - it allows you control over what the object points to. >From this, we can tell that chaning the value of an object's hash changes its "bucket": >>> b = BrokenHash(1) >>> d = {b:1} >>> b.hashval=2 >>> d[b] Traceback (most recent call last): File "<stdin>", line 1, in <module> KeyError: <__main__.BrokenHash object at 0x657d70> But it's more than just the objects hash: >>> b2 = BrokenHash(hash("foo")) >>> hash(b2) == hash("foo") True >>> d2 = {b2: "bar", "foo": "baz"} >>> print d2 {<__main__.BrokenHash object at 0x657e10>: 'bar', 'foo': 'baz'} (If they both fell into the same bucket, d2['foo'] would overwrite d2[b2] It appears to be some combination of identity and hash. It is not, however, dependant on type: >>> b3 = BrokenHash(hash("foo")) #same hash as b2 >>> d2[b3] Traceback (most recent call last): File "<stdin>", line 1, in <module> KeyError: <__main__.BrokenHash object at 0x657e90> Don't know if that's helped at all. -- Rich "Roadie Rich" Lovely Just because you CAN do something, doesn't necessarily mean you SHOULD. In fact, more often than not, you probably SHOULDN'T. Especially if I suggested it. _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor