[
https://issues.apache.org/jira/browse/CASSANDRA-13841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
C. Scott Andreas updated CASSANDRA-13841:
-----------------------------------------
Labels: 4.0-feature-freeze-review-requested (was: )
> Allow specific sources during rebuild
> -------------------------------------
>
> Key: CASSANDRA-13841
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13841
> Project: Cassandra
> Issue Type: Bug
> Reporter: Kurt Greaves
> Assignee: Kurt Greaves
> Priority: Minor
> Labels: 4.0-feature-freeze-review-requested
>
> CASSANDRA-10406 introduced the ability to rebuild specific ranges, and
> CASSANDRA-9875 extended that to allow specifying a set of hosts to stream
> from. It is not obvious why you would want to stream only a subset of
> ranges, but one possible use case for this functionality is rebuilding a node
> from targeted replicas.
> When doing a DC migration with racks == RF, each rack in the source
> datacenter holds one full copy of the data, so specifying all the hosts from
> a single source rack rebuilds from exactly one copy. Repeating this for each
> rack in the new datacenter, with a different source rack each time, ensures
> you receive every copy from the source DC and thus maintains consistency
> through rebuilds.
> For example, with the following topology for DC A and B with an RF of A:3 and
> B:3
> ||Node (DC A)||Rack (DC A)||Node (DC B)||Rack (DC B)||
> |A1|rack1|B1|rack1|
> |A2|rack2|B2|rack2|
> |A3|rack3|B3|rack3|
> The following set of actions will result in B holding exactly one copy of
> each replica in A, and B will be _at least_ as consistent as A.
> {code:java}
> Rebuild B1 from only A1
> Rebuild B2 from only A2
> Rebuild B3 from only A3
> {code}
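
The rack-by-rack pairing above can be derived mechanically from the topology.
The following is only a minimal sketch, assuming racks carry the same names in
both datacenters and racks == RF; the function name rebuild_plan and the dict
layout are hypothetical and are not part of Cassandra or nodetool.

```python
from collections import defaultdict

# Topology from the example above: node name -> rack, per datacenter.
DC_A = {"A1": "rack1", "A2": "rack2", "A3": "rack3"}
DC_B = {"B1": "rack1", "B2": "rack2", "B3": "rack3"}


def rebuild_plan(source_dc, target_dc):
    """Map each target node to the source nodes in the same-named rack.

    Hypothetical helper for illustration only; assumes rack names match
    across datacenters.
    """
    by_rack = defaultdict(list)
    for node, rack in source_dc.items():
        by_rack[rack].append(node)

    plan = {}
    for node, rack in sorted(target_dc.items()):
        sources = sorted(by_rack.get(rack, []))
        if not sources:
            # Without a matching source rack there is no single copy to pull.
            raise ValueError(f"no source nodes in {rack} for {node}")
        plan[node] = sources
    return plan


if __name__ == "__main__":
    for target, sources in rebuild_plan(DC_A, DC_B).items():
        print(f"Rebuild {target} from only {', '.join(sources)}")
```

With more than one node per rack, each target node would simply get every
source node of its paired rack, which is the "all the hosts from a single
rack" case described earlier.
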
> Unfortunately, using this functionality is non-trivial at the moment, as you
> can only specify sources together with the node's set of tokens to rebuild.
> To perform the above with vnodes or a large cluster, you have to
> specify every token range in the -ts arg, which quickly becomes
> unwieldy or impossible.
> A solution to this is to simply filter on sources first, before processing
> ranges.
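
To illustrate the proposed ordering, here is a rough sketch of "filter on
sources first, then ranges". It is not the Cassandra implementation: the
function select_sources, its parameters, and the example data are all
hypothetical, and token ranges are modelled as plain (start, end) tuples.

```python
# Hypothetical candidate map: token range -> replicas that hold it.
# With three nodes and RF 3, every node in DC A holds every range.
EXAMPLE_CANDIDATES = {
    (0, 100): ["A1", "A2", "A3"],
    (100, 200): ["A1", "A2", "A3"],
}


def select_sources(candidates, allowed_sources=None, ranges=None):
    """Pick stream sources per range, filtering on sources before ranges.

    candidates: dict mapping a token range to the replicas holding it.
    allowed_sources: optional set of hosts to stream from; None means any.
    ranges: optional set of ranges to rebuild; None means all of them.
    """
    # Step 1: drop every replica that is not an allowed source.
    if allowed_sources is not None:
        candidates = {
            rng: [host for host in hosts if host in allowed_sources]
            for rng, hosts in candidates.items()
        }
    # Step 2: only then narrow to the requested ranges, if any were given.
    if ranges is not None:
        candidates = {
            rng: hosts for rng, hosts in candidates.items() if rng in ranges
        }
    # A range left without a source cannot be rebuilt; fail loudly.
    missing = sorted(rng for rng, hosts in candidates.items() if not hosts)
    if missing:
        raise ValueError(f"no allowed source for ranges: {missing}")
    return candidates


if __name__ == "__main__":
    # Sources only, no ranges: every range is streamed from A1.
    print(select_sources(EXAMPLE_CANDIDATES, allowed_sources={"A1"}))
```

Because the range argument is optional in this sketch, restricting a rebuild
to one source no longer requires listing every token range by hand.
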
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)