[ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713079#action_12713079 ]

Jonathan Ellis edited comment on CASSANDRA-192 at 5/26/09 12:56 PM:
--------------------------------------------------------------------

The Karger/Ruhl paper (really Ruhl -- based on his thesis) gives two load 
balancing algorithms.  One is again based on each machine having several 
virtual nodes, but load balancing is done by activating only one node per 
machine.  Each machine picks its node based on how evenly it partitions the 
address space.  This would be easy to implement in Cassandra for our random 
hash-based partitioner (since only one node is active at a time, changing nodes 
maps essentially to a token change in Cassandra, with no further changes 
necessary), but it does not help order-preserving partitioning, where we cannot 
tell how evenly the address space (which is the same as the key space) is 
partitioned.
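
For illustration, a minimal Java sketch of that first scheme follows.  
Everything in it is hypothetical -- the class, the method names, and the 
pick-the-widest-gap selection rule are my own stand-ins, not the paper's exact 
criterion or existing Cassandra code.  Each machine derives a handful of 
candidate tokens and activates whichever candidate would own the largest slice 
of the ring, i.e. the one splitting the widest existing gap:

---
import java.math.BigInteger;
import java.util.List;

// Hypothetical sketch only -- not Cassandra code.  The selection rule
// (take the candidate owning the widest slice of the ring) is a
// simplification of "picks its node based on how evenly it partitions
// the address space."
public class CandidateTokenPicker
{
    // Size of the token space, e.g. 2^127 for a random hash partitioner.
    private static final BigInteger RING_SIZE = BigInteger.ONE.shiftLeft(127);

    // Return the candidate token whose activation would give this machine
    // the largest range, i.e. the candidate that splits the widest gap
    // between currently active tokens.
    public static BigInteger pickToken(List<BigInteger> candidates,
                                       List<BigInteger> activeTokens)
    {
        BigInteger best = null;
        BigInteger bestRange = BigInteger.valueOf(-1);
        for (BigInteger candidate : candidates)
        {
            BigInteger range = rangeOwned(candidate, activeTokens);
            if (range.compareTo(bestRange) > 0)
            {
                best = candidate;
                bestRange = range;
            }
        }
        return best;
    }

    // The slice of the ring the candidate would own: the clockwise distance
    // from its predecessor token (wrapping around the ring) to itself.
    private static BigInteger rangeOwned(BigInteger candidate,
                                         List<BigInteger> activeTokens)
    {
        BigInteger maxBelow = null;  // largest active token < candidate
        BigInteger max = null;       // largest token overall, for wraparound
        for (BigInteger t : activeTokens)
        {
            if (max == null || t.compareTo(max) > 0)
                max = t;
            if (t.compareTo(candidate) < 0
                && (maxBelow == null || t.compareTo(maxBelow) > 0))
                maxBelow = t;
        }
        BigInteger predecessor = (maxBelow != null) ? maxBelow : max;
        if (predecessor == null)
            return RING_SIZE;  // only node in the ring: owns everything
        return candidate.subtract(predecessor).mod(RING_SIZE);
    }
}
---

Since only the chosen candidate is ever active, re-running the pick later maps 
to an ordinary token change, as noted above.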

The second Ruhl algorithm assumes neither the ability to measure the address 
space nor virtual nodes.  The summary is brief, so I will excerpt it here:

---
The protocol is the following (where ε is any constant, 0 < ε < 1). Recall that 
each node stores the items whose addresses fall between the node's address and 
its predecessor's address, and that ℓ_j denotes the load on node j.

Item balancing: Each node i occasionally contacts another node j at random. If 
ℓ_i ≤ ε·ℓ_j or ℓ_j ≤ ε·ℓ_i then the nodes perform a load balancing operation 
(assume that ℓ_i > ℓ_j), distinguishing two cases:

Case 1: i = j + 1: In this case, i is the successor of j and the two nodes 
handle address intervals next to each other. Node j increases its address so 
that the (ℓ_i − ℓ_j)/2 items with lowest addresses get reassigned from node i 
to node j. Both nodes end up with load (ℓ_i + ℓ_j)/2.

Case 2: i ≠ j + 1: If ℓ_{j+1} > ℓ_i, then we set i := j + 1 and go to case 1. 
Otherwise, node j moves between nodes i − 1 and i to capture half of node i's 
items. This means that node j's items are now handled by its former successor, 
node j + 1.
---
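
To make the two cases concrete, here is a minimal Java transcription of the 
excerpt.  The Node interface and every method on it are hypothetical stand-ins 
(nothing here is an existing Cassandra class); load() plays the role of ℓ and 
EPSILON the role of ε:

---
public class ItemBalancer
{
    static final double EPSILON = 0.25;  // any constant with 0 < ε < 1

    // Hypothetical ring-node abstraction; these methods just name the
    // primitives the excerpt assumes.
    interface Node
    {
        long load();                        // number of items stored here (ℓ)
        Node successor();                   // next node clockwise on the ring
        boolean isSuccessorOf(Node other);  // am I other's immediate successor?
        void takeLowestItemsFrom(Node donor, long count);  // advance my address
        void handOffItemsTo(Node heir);     // give all my items to heir
        void moveBefore(Node target);       // reinsert just before target
    }

    // One balancing attempt: node i has contacted a random node j.
    static void balance(Node i, Node j)
    {
        // Act only if one load is at most ε times the other.
        if (i.load() <= EPSILON * j.load())
        {
            Node tmp = i; i = j; j = tmp;   // swap so that load(i) > load(j)
        }
        else if (j.load() > EPSILON * i.load())
        {
            return;                         // loads are comparable; do nothing
        }

        if (i.isSuccessorOf(j))
        {
            // Case 1: adjacent intervals.  Node j advances its address so the
            // (ℓ_i − ℓ_j)/2 lowest-addressed items move from i to j, leaving
            // both nodes with load (ℓ_i + ℓ_j)/2.
            j.takeLowestItemsFrom(i, (i.load() - j.load()) / 2);
        }
        else if (j.successor().load() > i.load())
        {
            // Case 2, first branch: "set i := j + 1 and go to case 1."
            balance(j.successor(), j);
        }
        else
        {
            // Case 2, second branch: j's items go to its former successor,
            // then j reinserts itself between i − 1 and i and captures half
            // of node i's items.
            j.handOffItemsTo(j.successor());
            j.moveBefore(i);
            j.takeLowestItemsFrom(i, i.load() / 2);
        }
    }
}
---

In Cassandra terms, Case 1 would presumably be just a token change on node j, 
while Case 2 amounts to moving node j to a new position in the ring.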

This seems like the most promising option so far.  I will do a citation search 
on this paper to see if anything else interesting turns up.

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.4
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys 
> not being uniformly distributed as well as heterogeneous nodes in a cluster.  
> The former is particularly likely to be a problem when using the 
> OrderPreservingPartitioner, since the keys are not randomized by a hash 
> function.
> Avinash suggested three papers on load balancing in this thread: 
> http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient 
> Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and 
> Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load 
> Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple 
> Load Balancing for Distributed Hash Tables by John Byers et al) is not 
> applicable to Cassandra's design.  ("First, we suggest the direct application 
> of the 'power of two choices' paradigm, whereby an item is stored at the less 
> loaded of two (or more) random alternatives. We then consider how associating 
> a small constant number of hash values with a key can naturally be extended 
> to support other load balancing strategies.")
