Brad Fitzpatrick wrote:
On Wed, 8 Aug 2007, Alex Stapleton wrote:

Brad Fitzpatrick wrote:

   -- what is the input to the digest algorithm for points 1-n?
         e.g.:
         "127.0.0.1:11211-1"
         "127.0.0.1:11211-2"
         "127.0.0.1:11211-3"
                 ....
         "127.0.0.1:11211-<n>"
It seems like running multiple hashes over the string seems like a bit
of a waste of cycles. Modifying the hash value for each point would be
much more efficient and simpler to implement. Something like point value
= <hash value> + (<point number> * 2654435761)

Alex,

I don't think that works well.  Consider:

     item A with weight 1.
     item B with weight 1.

When hashing item A onto the 32-bit continuum you end up with, say, 10.
When hashing item B onto the 32-bit continuum you end up with, say, 11.

Now, because of this one unlucky distribution where A only has 1 unit of
space between it and B, you're not saved by the multiple points.

Now item A's second point is at 2654435761+10, and item B is at
2654435761+11, etc...


Yes thats a good point. I have a method that doesn't have that issue however I can't really think of an implementation that doesn't rely on integer overflow semantics which could be a problem if trying to do implementations in some languages, e.g. Perl, Python, PHP.

If we want to make life easy for people implementing clients in languages which don't have a simple way to use C style integers then perhaps doing what you initially suggested is the best approach after all my whining? :)

Reply via email to