Load balancing
--------------
Key: CASSANDRA-192
URL: https://issues.apache.org/jira/browse/CASSANDRA-192
Project: Cassandra
Issue Type: New Feature
Reporter: Jonathan Ellis
Fix For: 0.4
We need to be able to spread load evenly across a cluster to mitigate keys not
being uniformly distributed as well as heterogeneous nodes in a cluster. The
former is particularly likely to be a problem when using the
OrderPreservingPartitioner, since the keys are not randomized by a hash
function.
Avinash suggested three papers on load balancing in this thread:
http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
Of these, the useful ones are
http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient
Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and
Matthias Ruhl)
http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing
in Structured P2P Systems by Ananth Rao et al)
The third,
http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple
Load Balancing for Distributed Hash Tables by John Byers et al) is not
applicable to Cassandra's design. ("First, we suggest the direct application
of the lsquolsquopower of two choicesrsquorsquo paradigm, whereby an item is
stored at the less loaded of two (or more) random alternatives. We then
consider how associating a small constant number of hash values with a key can
naturally be extended to support other load balancing strategies.")
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.