[ 
https://issues.apache.org/jira/browse/CASSANDRA-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309190#comment-14309190
 ] 

Benedict commented on CASSANDRA-7032:
-------------------------------------

Before going ahead and integrating it, could we confirm that the algorithm 
performs well under heterogeneous data centre / rack configurations, by which I 
mean different numbers of nodes in each data centre / rack (not differing 
numbers of vnodes), for varying numbers of data centres and racks? And, most 
importantly, that it evenly distributes load across the disks within any single 
node?

It would be great to simulate a wide range of configurations, varying each of 
the following dimensions: 

# number of data centres
# size variability of data centres
# number of disks per node
# variability of number of disks (perhaps only varying between data centres)
# number of racks within a data centre

And graph the variability of ownership _by disk_. Getting balanced disks is the 
really hard thing to do, and is really essential for CASSANDRA-6696. Data 
centres may also behave unexpectedly for replication; you may own an even 
amount of the ring, but it may translate into a disproportionately contiguous 
range within the data centre, so you end up owning an uneven quantity within 
the data centre.
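
As a starting point, a minimal sketch of such a simulation might look like the following. It is not the attached TestVNodeAllocation.java and makes several simplifying assumptions: a single data centre, no racks, replication factor 1, random token allocation, and a node's ownership split evenly across its disks. The class and method names (DiskOwnershipSim, perDiskOwnership, maxOverMean) are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch: per-disk ownership spread under random vnode allocation. */
class DiskOwnershipSim {

    /**
     * Returns per-disk ownership (fraction of the ring) for a single data centre
     * with replication factor 1. disksPerNode[i] is the disk count of node i.
     * Assumption: a node's ownership is split evenly across its disks.
     */
    static double[] perDiskOwnership(int[] disksPerNode, int vnodesPerNode, long seed) {
        Random rnd = new Random(seed);
        int nodes = disksPerNode.length;
        int total = nodes * vnodesPerNode;
        // Each entry is {token, owner}; tokens are uniform in [0, 1).
        double[][] ring = new double[total][];
        for (int n = 0, i = 0; n < nodes; n++)
            for (int v = 0; v < vnodesPerNode; v++)
                ring[i++] = new double[] { rnd.nextDouble(), n };
        Arrays.sort(ring, (a, b) -> Double.compare(a[0], b[0]));

        // A token owns the range from the previous token (wrapping) up to itself.
        double[] nodeOwnership = new double[nodes];
        for (int i = 0; i < total; i++) {
            double prev = i == 0 ? ring[total - 1][0] - 1.0 : ring[i - 1][0];
            nodeOwnership[(int) ring[i][1]] += ring[i][0] - prev;
        }

        // Spread each node's ownership evenly over its disks (simplification).
        int totalDisks = Arrays.stream(disksPerNode).sum();
        double[] diskOwnership = new double[totalDisks];
        for (int n = 0, d = 0; n < nodes; n++)
            for (int k = 0; k < disksPerNode[n]; k++)
                diskOwnership[d++] = nodeOwnership[n] / disksPerNode[n];
        return diskOwnership;
    }

    /** Ratio of the most loaded disk to the mean disk load; 1.0 is perfect balance. */
    static double maxOverMean(double[] ownership) {
        double max = Arrays.stream(ownership).max().orElse(0);
        double mean = Arrays.stream(ownership).average().orElse(1);
        return max / mean;
    }

    public static void main(String[] args) {
        int[] disks = { 2, 2, 4, 4, 8, 8 };  // heterogeneous disk counts
        for (int vnodes : new int[] { 1, 16, 256 })
            System.out.printf("vnodes=%d max/mean=%.3f%n",
                    vnodes, maxOverMean(perDiskOwnership(disks, vnodes, 42L)));
    }
}
```

Extending this to multiple data centres, racks and replication would mean computing ownership per replica set rather than per token, which is where I would expect the interesting imbalances to show up.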

> Improve vnode allocation
> ------------------------
>
>                 Key: CASSANDRA-7032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7032
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>              Labels: performance, vnodes
>             Fix For: 3.0
>
>         Attachments: TestVNodeAllocation.java, TestVNodeAllocation.java, 
> TestVNodeAllocation.java, TestVNodeAllocation.java, TestVNodeAllocation.java, 
> TestVNodeAllocation.java
>
>
> It's been known for a little while that random vnode allocation causes 
> hotspots of ownership. It should be possible to improve dramatically on this 
> with deterministic allocation. I have quickly thrown together a simple greedy 
> algorithm that allocates vnodes efficiently, and will repair hotspots in a 
> randomly allocated cluster gradually as more nodes are added, and also 
> ensures that token ranges are fairly evenly spread between nodes (somewhat 
> tunably so). The allocation still permits slight discrepancies in ownership, 
> but it is bounded by the inverse of the size of the cluster (as opposed to 
> random allocation, which strangely gets worse as the cluster size increases). 
> I'm sure there is a decent dynamic programming solution to this that would be 
> even better.
> If on joining the ring a new node were to CAS a shared table where a 
> canonical allocation of token ranges lives after running this (or a similar) 
> algorithm, we could then get guaranteed bounds on the ownership distribution 
> in a cluster. This will also help for CASSANDRA-6696.
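
To illustrate the general idea of greedy, deterministic token placement, here is a minimal sketch. It is a stand-in and not the algorithm from the attached TestVNodeAllocation.java: it assumes a ring normalised to [0, 1), ignores node ownership and replication entirely, and simply places each new token at the midpoint of the currently widest gap. The names GreedyTokenSketch, addTokenInWidestGap and widestGap are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: greedy token placement by splitting the widest gap. */
class GreedyTokenSketch {

    /** Inserts a token at the midpoint of the widest gap; the list stays sorted. */
    static double addTokenInWidestGap(List<Double> sortedTokens) {
        if (sortedTokens.isEmpty()) {
            sortedTokens.add(0.0);
            return 0.0;
        }
        int n = sortedTokens.size();
        double bestStart = 0, bestWidth = -1;
        for (int i = 0; i < n; i++) {
            double start = sortedTokens.get(i);
            // The last gap wraps around the ring back to the first token.
            double end = i + 1 < n ? sortedTokens.get(i + 1) : sortedTokens.get(0) + 1.0;
            if (end - start > bestWidth) {
                bestWidth = end - start;
                bestStart = start;
            }
        }
        double token = (bestStart + bestWidth / 2) % 1.0;
        int pos = Collections.binarySearch(sortedTokens, token);
        sortedTokens.add(pos < 0 ? -pos - 1 : pos, token);
        return token;
    }

    /** Width of the widest gap between adjacent tokens, including the wrap-around. */
    static double widestGap(List<Double> sortedTokens) {
        int n = sortedTokens.size();
        double widest = 0;
        for (int i = 0; i < n; i++) {
            double end = i + 1 < n ? sortedTokens.get(i + 1) : sortedTokens.get(0) + 1.0;
            widest = Math.max(widest, end - sortedTokens.get(i));
        }
        return widest;
    }

    public static void main(String[] args) {
        List<Double> tokens = new ArrayList<>();
        for (int i = 0; i < 8; i++)
            addTokenInWidestGap(tokens);
        System.out.println("tokens=" + tokens + " widestGap=" + widestGap(tokens));
    }
}
```

A real allocator would also have to weigh which node each neighbouring range belongs to, so that ownership per node (and per replica set) is balanced rather than just the gap widths.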



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
