The "Operations" page has been changed by JonathanEllis.
http://wiki.apache.org/cassandra/Operations?action=diff&rev1=7&rev2=8

--------------------------------------------------

  
  Note that with !RackAwareStrategy, succeeding nodes along the ring should 
alternate data centers to avoid hot spots.  For instance, if you have nodes A, 
B, C, and D in increasing Token order, and instead of alternating you place A 
and B in DC1, and C and D in DC2, then nodes C and A will have 
disproportionately more data on them because they will be the replica 
destination for every Token range in the other data center.
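
A rough way to see the imbalance described above is to count replicas per node. The sketch below is only an illustration, not Cassandra's actual placement code: it assumes equal-sized Token ranges and a simplified rule in which each range gets one replica on its primary node and one on the next node along the ring that sits in a different data center. The function name `replica_counts` and the ring layouts are hypothetical.

```python
from collections import Counter


def replica_counts(ring):
    """Count replicas held by each node under a simplified rack-aware rule.

    `ring` is a list of (node, datacenter) pairs in increasing Token order.
    Assumption: every range is stored on its primary node plus the first
    node further along the ring that lives in a different data center.
    """
    counts = Counter()
    n = len(ring)
    for i, (node, dc) in enumerate(ring):
        counts[node] += 1  # replica on the primary node for this range
        # Walk the ring (wrapping around) to find the first node in another DC.
        for step in range(1, n):
            other, other_dc = ring[(i + step) % n]
            if other_dc != dc:
                counts[other] += 1
                break
    return dict(counts)


# A and B in DC1, C and D in DC2 (not alternating).
clustered = [("A", "DC1"), ("B", "DC1"), ("C", "DC2"), ("D", "DC2")]
# Data centers alternate along the ring.
alternating = [("A", "DC1"), ("B", "DC2"), ("C", "DC1"), ("D", "DC2")]

print(replica_counts(clustered))
print(replica_counts(alternating))
```

Under these assumptions the clustered layout leaves A and C each holding three ranges against one each for B and D, while the alternating layout gives every node two.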
  
- Replication strategy may not be changed without wiping your data and starting 
over.
+ Replication strategy is not intended to be changed after loading data, but it 
can be done if you really need to. The procedure would look something 
like:
+  1. Have each node do an anticompaction for its primary range.
+  1. Manually scp the resulting files to the new replica points.
+  1. Then switch the replication strategy.
+ 
+ This could be done offline, or online at the cost of introducing some 
temporary inconsistency that could be fixed by repair (see below).
  
  = Adding new nodes =
  Adding new nodes is called "bootstrapping."
@@ -65, +70 @@

   1. Remove the old node from the ring first, or bring up a replacement node 
with the same IP and Token as the old one; otherwise, the old node will stay 
part of the ring in a "down" state, which will degrade your replication factor 
for the affected Range.
    * If you don't know the Token of the old node, you can retrieve it from any 
of the other nodes' `system` keyspace, !ColumnFamily `LocationInfo`, key `L`.
    * You can also run `nodeprobe ring` to look up a node's token (unless there 
was some kind of outage and the other nodes came up but the down one did not).
-  1. Removing the old node, then bootstrapping the new one, may be more 
performant than using Anti-Entropy.  Testing needed.
+  1. Removing the old node, then bootstrapping the new one, may be more 
performant than using Anti-Entropy (testing needed), and will avoid the 
incorrect answers the replacement node would give while it does not yet have 
all the data for its Range.
-   * Even brute-force rsyncing of data from the relevant replicas and running 
cleanup on the replacement node may be more performant
+   * To test: even brute-force rsyncing of data from the relevant replicas and 
running cleanup on the replacement node may be more performant.
  
  = Backing up data =
  Cassandra can snapshot data while online using `nodeprobe snapshot`.  You can 
then back up those snapshots using any desired system, although leaving them 
where they are is probably the option that makes the most sense on large 
clusters.
