spir wrote:
On Fri, 29 Jan 2010 08:23:37 -0800
Emile van Sebille <em...@fenx.com> wrote:

So, how does python do this?
Start here...

http://effbot.org/zone/python-hash.htm

Great, thank you!
From the above pointed page:

====For ordinary integers, the hash value is simply the integer itself (unless 
it’s -1).

class int:
    def __hash__(self):
        value =elf
        if value =-1:
            value =-2
        return value
====
I'm surprised of this, for this should create as many indexes (in the underlying array 
actually holding the values) as there are integer keys. With possibly huge holes in the 
array. Actually, there will certainly be a predefined number of indexes N, and the 
integers be further "modulo-ed" N. Or what?
I would love to know how to sensibly chose the number of indexes. Pointers 
welcome (my searches did not bring any clues on the topic).

Emile

Denis
________________________________

la vita e estrany

http://spir.wikidot.com/

I haven't seen the sources, so I'm making an educated guess based on things I have seen. The index size grows as the size of the dictionary grows, and the formula is not linear. Neither are the sizes obvious powers of two or suchlike. I doubt if you have any control over it, however. The hash() function returns an int (32 bits on 32bit python), which is then converted to the bucket number, probably by a simple modulo function.

In the case of integers, it's the modulo which distributes the integers among the buckets. If all the integer keys are consecutive, then modulo distributes them perfectly. If they're random, then it'll usually work pretty well, but you could hit a pattern which puts lots of values in one bucket, and not many in the others. If the index size is 22, and all your numbers are multiple of 22, then it might degenerate to effectively one bucket.

BTW, the referenced article does have a contradiction. For a long int whose value is between 16 and 31 bits, the described approach will not generate the same hash as the int of the same value. So that 15 bit shift algorithm must have some other subtlety to it, perhaps only starting with bit 31 or so.

DaveA


_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

Reply via email to