[ 
https://issues.apache.org/jira/browse/CASSANDRA-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309403#comment-14309403
 ] 

Benedict commented on CASSANDRA-7032:
-------------------------------------

Sounds good. I just wanted clarification that these things were being 
explicitly addressed. It would still be helpful to simulate these 
characteristics explicitly now, though, and see how well it actually turns 
out, don't you think? That way we can get numbers on how well it will actually 
perform at various ends of the spectrum.

I still have a nagging suspicion there may be subtleties in mapping DCs into 
the single address space without error, but that's a separate issue I guess, to 
be addressed as we tie it in.

I note I don't have my heart set on any mechanism, so long as we can achieve 
the balance, and it does sound like your approach will generalise well.

The thing I like about this is we can actually add extra "nodes" if we expand 
the number of disks on a node. We will need to think carefully about how we 
actually expose this node mapping to the user, in general, though, to ensure it 
is easy to reason about and robust.

> Improve vnode allocation
> ------------------------
>
>                 Key: CASSANDRA-7032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7032
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>              Labels: performance, vnodes
>             Fix For: 3.0
>
>         Attachments: TestVNodeAllocation.java, TestVNodeAllocation.java, 
> TestVNodeAllocation.java, TestVNodeAllocation.java, TestVNodeAllocation.java, 
> TestVNodeAllocation.java
>
>
> It's been known for a little while that random vnode allocation causes 
> hotspots of ownership. It should be possible to improve dramatically on this 
> with deterministic allocation. I have quickly thrown together a simple greedy 
> algorithm that allocates vnodes efficiently, and will repair hotspots in a 
> randomly allocated cluster gradually as more nodes are added, and also 
> ensures that token ranges are fairly evenly spread between nodes (somewhat 
> tunably so). The allocation still permits slight discrepancies in ownership, 
> but it is bounded by the inverse of the size of the cluster (as opposed to 
> random allocation, which strangely gets worse as the cluster size increases). 
> I'm sure there is a decent dynamic programming solution to this that would be 
> even better.
> If on joining the ring a new node were to CAS a shared table where a 
> canonical allocation of token ranges lives after running this (or a similar) 
> algorithm, we could then get guaranteed bounds on the ownership distribution 
> in a cluster. This will also help for CASSANDRA-6696.
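To make the comparison between random and deterministic allocation concrete, 
here is a minimal simulation sketch. It is not the algorithm from the attached 
TestVNodeAllocation.java, nor the approach discussed in the comments; it is a 
deliberately simple greedy stand-in that, for each new vnode, splits the widest 
range of the currently most-loaded other node at its midpoint. It assumes a 
unit ring [0, 1), a replication factor of 1, and a single DC, and all function 
names are hypothetical.

```python
import random
from bisect import insort


def ranges(tokens):
    """Yield (start, width, owner) for each range on the unit ring [0, 1).

    `tokens` is a sorted list of (token, node); each token owns the range
    from the previous token up to itself, wrapping around the ring.
    """
    for i, (token, node) in enumerate(tokens):
        start = tokens[i - 1][0]  # index -1 wraps to the last token
        width = 1.0 if len(tokens) == 1 else (token - start) % 1.0
        yield start, width, node


def ownership(tokens):
    """Return {node: fraction of the ring owned}."""
    owned = {}
    for _, width, node in ranges(tokens):
        owned[node] = owned.get(node, 0.0) + width
    return owned


def imbalance(tokens):
    """Largest ownership divided by mean ownership (1.0 is perfectly even)."""
    owned = ownership(tokens)
    mean = sum(owned.values()) / len(owned)
    return max(owned.values()) / mean


def allocate_random(nodes, vnodes, rng):
    """Every node picks `vnodes` tokens uniformly at random."""
    return sorted((rng.random(), n) for n in range(nodes) for _ in range(vnodes))


def allocate_greedy(nodes, vnodes):
    """Nodes join one at a time; each new vnode halves the widest range of
    the most-loaded node, preferring ranges not owned by the joining node."""
    tokens = []
    for joining in range(nodes):
        for _ in range(vnodes):
            if not tokens:
                tokens.append((0.0, joining))  # first token owns the whole ring
                continue
            owned = ownership(tokens)
            start, width, _ = max(
                ranges(tokens),
                key=lambda r: (r[2] != joining, owned[r[2]], r[1]),
            )
            insort(tokens, ((start + width / 2) % 1.0, joining))
    return tokens


if __name__ == "__main__":
    rng = random.Random(7032)  # fixed seed so runs are repeatable
    for nodes in (4, 16, 64):
        rand = imbalance(allocate_random(nodes, 8, rng))
        greedy = imbalance(allocate_greedy(nodes, 8))
        print(f"nodes={nodes:3d}  random max/mean={rand:.3f}  greedy max/mean={greedy:.3f}")
```

Varying the node count and the vnodes per node gives a rough picture of how 
each strategy behaves at different cluster sizes. A realistic evaluation would 
also need to account for replication, racks and multiple DCs, which this 
sketch ignores.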



