[
https://issues.apache.org/jira/browse/CASSANDRA-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sankalp kohli updated CASSANDRA-4650:
-------------------------------------
Status: Patch Available (was: Reopened)
> RangeStreamer should be smarter when picking endpoints for streaming in case
> of N >=3 in each DC.
> ---------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-4650
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4650
> Project: Cassandra
> Issue Type: Improvement
> Affects Versions: 1.1.5
> Reporter: sankalp kohli
> Assignee: sankalp kohli
> Priority: Minor
> Labels: streaming
> Attachments: CASSANDRA-4650_trunk.txt, photo-1.JPG
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> The getRangeFetchMap method in RangeStreamer should pick unique nodes to
> stream data from when the number of replicas in each DC is three or more.
> When N >= 3 in a DC, there are two options for streaming a range. Consider an
> example of 4 nodes in one datacenter and a replication factor of 3.
> If a node goes down, it needs to recover 3 ranges of data. With the current
> code, only two nodes could get selected, because it orders the nodes by
> proximity.
> Ideally, we want to select 3 nodes for streaming the data. We can do this
> by selecting a unique node for each range.
> Advantages:
> This will speed up bootstrapping a node and will also put less pressure on
> the nodes serving the data.
> Note: This change has no effect if N < 3 in each DC, since data is then
> streamed from only 2 nodes.
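
To make the 4-node example concrete, here is a minimal sketch of the selection idea. It is not the attached patch and not Cassandra code. The function name `pick_unique_sources`, the range labels, and the node names are all hypothetical. It assumes each range arrives with its candidate sources already ordered closest-first, and it uses a simple augmenting-path matching, which is one possible way to give every range its own source; a range that cannot get a unique source falls back to its closest candidate.

```python
def pick_unique_sources(candidates_by_range):
    """Assign one source node per range, using distinct nodes where possible.

    candidates_by_range maps a range label to its candidate source nodes,
    ordered closest-first (a stand-in for proximity ordering).
    """
    owner = {}  # source node -> range currently assigned to it

    def try_assign(rng, seen):
        for node in candidates_by_range[rng]:
            if node in seen:
                continue
            seen.add(node)
            # Take a free node, or move the range holding it to another node.
            if node not in owner or try_assign(owner[node], seen):
                owner[node] = rng
                return True
        return False

    for rng in candidates_by_range:
        try_assign(rng, set())

    chosen = {rng: node for node, rng in owner.items()}
    # Ranges that could not get a unique node use their closest candidate.
    for rng, candidates in candidates_by_range.items():
        chosen.setdefault(rng, candidates[0])
    return chosen


# Hypothetical layout: nodes A-D, RF=3, node D is being rebuilt.
# Each of D's three ranges has two surviving replicas, listed closest-first.
ranges = {
    "r1": ["B", "C"],
    "r2": ["A", "C"],
    "r3": ["A", "B"],
}

by_proximity = {rng: nodes[0] for rng, nodes in ranges.items()}
unique = pick_unique_sources(ranges)

print(sorted(set(by_proximity.values())))  # closest-first uses only A and B
print(sorted(set(unique.values())))        # unique selection uses A, B and C
```

In this layout, picking the closest replica per range streams two of the three ranges from the same node, while the unique selection spreads them over three nodes.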