[
https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334109#comment-15334109
]
Paulo Motta commented on CASSANDRA-12015:
-----------------------------------------
{{getAllRangesWithSourcesFor}} is only used for bootstrap when
{{cassandra.consistent.rangemovement=false}}; otherwise
{{getAllRangesWithStrictSourcesFor}} is used (which tries to stream from the
sources that will lose ranges to the bootstrapping node).
When {{cassandra.consistent.rangemovement=false}}, it doesn't really matter
which replica you pick, so I guess we're safe to move away from
latency-based proximity. This is also used for replace, so I think it can also
distribute replace/non-consistent-bootstrap load more evenly in that case,
because right now we are prioritizing replicas which have a better dynamic
snitch score, which will probably overload them with streaming originating from
rebuild/replace/non-consistent-bootstrap.
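To make the load argument concrete, here is a rough, self-contained sketch. It is not the actual {{RangeStreamer}} code: the class, method names, node names and scores are all hypothetical, and "least loaded" is just one possible replacement for proximity ordering. It models the RF=3, 3-node case from the description, where every node is a replica for every range.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only; all names are hypothetical and not part of Cassandra. */
final class SourceSelectionSketch {

    /** For each range, pick the replica with the lowest (best) score, as proximity sorting would. */
    static Map<String, String> pickByProximity(Map<String, List<String>> replicasByRange,
                                               Map<String, Double> score) {
        Map<String, String> chosen = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : replicasByRange.entrySet()) {
            String best = null;
            for (String replica : e.getValue()) {
                if (best == null || score.get(replica) < score.get(best)) {
                    best = replica;
                }
            }
            if (best == null) throw new IllegalStateException("no replica for " + e.getKey());
            chosen.put(e.getKey(), best);
        }
        return chosen;
    }

    /** For each range, pick the replica that has been assigned the fewest ranges so far. */
    static Map<String, String> pickLeastLoaded(Map<String, List<String>> replicasByRange) {
        Map<String, Integer> assigned = new HashMap<>();
        Map<String, String> chosen = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : replicasByRange.entrySet()) {
            String best = null;
            for (String replica : e.getValue()) {
                if (best == null
                        || assigned.getOrDefault(replica, 0) < assigned.getOrDefault(best, 0)) {
                    best = replica;
                }
            }
            if (best == null) throw new IllegalStateException("no replica for " + e.getKey());
            chosen.put(e.getKey(), best);
            assigned.merge(best, 1, Integer::sum);
        }
        return chosen;
    }

    /** Number of ranges each source ends up streaming, sorted by node name. */
    static Map<String, Integer> load(Map<String, String> sourceByRange) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String source : sourceByRange.values()) {
            counts.merge(source, 1, Integer::sum);
        }
        return counts;
    }

    /** RF=3 on a 3-node DC: every node is a replica for every range. */
    static Map<String, List<String>> exampleRanges() {
        List<String> all = Arrays.asList("dc1-a", "dc1-b", "dc1-c");
        Map<String, List<String>> ranges = new LinkedHashMap<>();
        ranges.put("range-1", all);
        ranges.put("range-2", all);
        ranges.put("range-3", all);
        return ranges;
    }

    /** Made-up scores; lower means "closer". */
    static Map<String, Double> exampleScores() {
        Map<String, Double> scores = new HashMap<>();
        scores.put("dc1-a", 0.1);
        scores.put("dc1-b", 0.2);
        scores.put("dc1-c", 0.3);
        return scores;
    }

    public static void main(String[] args) {
        Map<String, List<String>> ranges = exampleRanges();
        // prints {dc1-a=3}: one node streams everything
        System.out.println(load(pickByProximity(ranges, exampleScores())));
        // prints {dc1-a=1, dc1-b=1, dc1-c=1}: streaming is spread over all replicas
        System.out.println(load(pickLeastLoaded(ranges)));
    }
}
```

With score-based ordering the same best-scoring node wins every range, while any selection that takes already-assigned streams into account spreads the work. A real patch would still have to respect the source DC filter and skip replicas that are down.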
> Rebuilding from another DC should use different sources
> -------------------------------------------------------
>
> Key: CASSANDRA-12015
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12015
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Fabien Rousseau
>
> Currently, when adding a new DC (e.g. DC2) and rebuilding it from an existing
> DC (e.g. DC1), only the closest replica is used as a "source of data".
> It works but is not optimal, because with RF=3 and a 3-node cluster,
> only one node in DC1 streams the data to DC2.
> To build the new DC in a reasonable time, it would be better, in that case,
> to stream from multiple sources, thus distributing the load more evenly.