Load balancing
--------------

                 Key: CASSANDRA-192
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
             Project: Cassandra
          Issue Type: New Feature
            Reporter: Jonathan Ellis
             Fix For: 0.4


We need to be able to spread load evenly across a cluster to mitigate keys not 
being uniformly distributed as well as heterogeneous nodes in a cluster.  The 
former is particularly likely to be a problem when using the 
OrderPreservingPartitioner, since the keys are not randomized by a hash 
function.

Avinash suggested three papers on load balancing in this thread: 
http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41

Of these, the useful ones are
 http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient 
Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and 
Matthias Ruhl)
 http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing 
in Structured P2P Systems by Ananth Rao et al)

The third, 
http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple 
Load Balancing for Distributed Hash Tables by John Byers et al) is not 
applicable to Cassandra's design.  ("First, we suggest the direct application 
of the lsquolsquopower of two choicesrsquorsquo paradigm, whereby an item is 
stored at the less loaded of two (or more) random alternatives. We then 
consider how associating a small constant number of hash values with a key can 
naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to