[ 
https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334109#comment-15334109
 ] 

Paulo Motta commented on CASSANDRA-12015:
-----------------------------------------

{{getAllRangesWithSourcesFor}} is only used for bootstrap when 
{{cassandra.consistent.rangemovement=false}}, otherwise 
{{getAllRangesWithStrictSourcesFor}} is used (which tries to stream from 
sources which will lose ranges to the bootstrapping node).

When {{cassandra.consistent.rangemovement=false}} it doesn't really matter 
which replica you pick, so I guess we're safe to move away from 
latency-based proximity. This path is also used for replace, so I think the 
change can also distribute replace/non-consistent-bootstrap load more evenly 
in that case. Right now we prioritize replicas which have a better dynamic 
snitch score, which will probably overload them with streaming originating 
from rebuild/replace/non-consistent-bootstrap.
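
To make the load argument concrete, here is a minimal sketch of the two selection policies. It is not Cassandra code and not a proposed patch: the class {{SourceSpreadSketch}}, both methods, and the string-based range and node names are hypothetical, and the candidate lists are assumed to arrive already sorted by proximity (best snitch score first).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Cassandra.
public class SourceSpreadSketch {

    // Proximity-style policy: always take the first (closest) candidate.
    static Map<String, String> pickClosestSources(Map<String, List<String>> candidatesByRange) {
        Map<String, String> chosen = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : candidatesByRange.entrySet()) {
            if (e.getValue().isEmpty())
                throw new IllegalStateException("no source for range " + e.getKey());
            chosen.put(e.getKey(), e.getValue().get(0));
        }
        return chosen;
    }

    // Spreading policy: take the candidate with the fewest ranges assigned so far.
    // Ties go to the earlier candidate, so proximity still acts as a tie-breaker.
    static Map<String, String> pickLeastLoadedSources(Map<String, List<String>> candidatesByRange) {
        Map<String, Integer> assigned = new HashMap<>();
        Map<String, String> chosen = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : candidatesByRange.entrySet()) {
            String best = null;
            for (String replica : e.getValue()) {
                if (best == null
                        || assigned.getOrDefault(replica, 0) < assigned.getOrDefault(best, 0))
                    best = replica;
            }
            if (best == null)
                throw new IllegalStateException("no source for range " + e.getKey());
            chosen.put(e.getKey(), best);
            assigned.merge(best, 1, Integer::sum);
        }
        return chosen;
    }

    public static void main(String[] args) {
        // RF=3 on a 3-node DC: every node is a candidate for every range.
        Map<String, List<String>> candidates = new LinkedHashMap<>();
        candidates.put("range1", Arrays.asList("n1", "n2", "n3"));
        candidates.put("range2", Arrays.asList("n1", "n2", "n3"));
        candidates.put("range3", Arrays.asList("n1", "n2", "n3"));

        System.out.println("closest:      " + pickClosestSources(candidates));
        System.out.println("least loaded: " + pickLeastLoadedSources(candidates));
    }
}
```

With these inputs the proximity-style policy assigns all three ranges to n1, while the spreading policy assigns one range each to n1, n2 and n3. Counting assigned ranges is only one possible way to spread load; shuffling the candidates would be another, and the sketch does not claim either is what the ticket ends up doing.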

> Rebuilding from another DC should use different sources
> -------------------------------------------------------
>
>                 Key: CASSANDRA-12015
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12015
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Fabien Rousseau
>
> Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing 
> DC (ex: DC1), only the closest replica is used as a "source of data".
> It works but is not optimal, because in case of an RF=3 and 3 nodes cluster, 
> only one node in DC1 is streaming the data to DC2. 
> To build the new DC in a reasonable time, it would be better, in that case, 
> to stream from multiple sources, thus distributing more evenly the load.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
