[Cassandra Wiki] Update of "Operations" by JonathanEllis

Apache Wiki Tue, 08 Dec 2009 19:42:16 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.


The "Operations" page has been changed by JonathanEllis.
http://wiki.apache.org/cassandra/Operations?action=diff&rev1=7&rev2=8

--------------------------------------------------

  
  Note that with !RackAwareStrategy, succeeding nodes along the ring should 
alternate data centers to avoid hot spots.  For instance, if you have nodes A, 
B, C, and D in increasing Token order, and instead of alternating you place A 
and B in DC1, and C and D in DC2, then nodes C and A will have 
disproportionately more data on them because they will be the replica 
destination for every Token range in the other data center.
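
The imbalance described above can be checked with a small simulation. This is a minimal sketch under stated assumptions, not Cassandra's actual placement code: it assumes a replication factor of 2, equally sized token ranges, and a simplified rule in which the second replica goes to the first succeeding node on the ring that sits in a different data center. The function name `replica_load` and the ring layouts are hypothetical, chosen only for illustration.

```python
from collections import Counter


def replica_load(ring):
    """Count how many token ranges each node stores.

    `ring` is a list of (node, datacenter) pairs in increasing token order.
    Simplified model (an assumption, not the real RackAwareStrategy code):
    each range is stored on its primary node plus the first succeeding
    node along the ring that is in a different data center.
    """
    load = Counter()
    n = len(ring)
    for i, (node, dc) in enumerate(ring):
        load[node] += 1  # primary replica for this node's own range
        # Walk clockwise around the ring, wrapping at the end.
        for step in range(1, n):
            other, other_dc = ring[(i + step) % n]
            if other_dc != dc:
                load[other] += 1  # replica placed in the other data center
                break
    return load


# A and B in DC1, C and D in DC2 (the layout the text warns against).
clustered = [("A", "DC1"), ("B", "DC1"), ("C", "DC2"), ("D", "DC2")]
# Data centers alternate along the ring (the recommended layout).
alternating = [("A", "DC1"), ("B", "DC2"), ("C", "DC1"), ("D", "DC2")]

if __name__ == "__main__":
    print("clustered:  ", dict(replica_load(clustered)))
    print("alternating:", dict(replica_load(alternating)))
```

Under these assumptions the clustered layout leaves A and C holding three ranges each while B and D hold one, whereas the alternating layout gives every node two ranges.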
  
- Replication strategy may not be changed without wiping your data and starting 
over.
+ Replication strategy is not intended to be changed after loading data, but it 
can be done if you need to badly enough. The procedure would look something 
like:
+  1. have each node do an anticompaction for its primary range
+  1. manually scp those to the new replica points
+  1. then switch the replication strategy
+ 
+ This could be done offline, or online at the cost of introducing some 
temporary inconsistency that could be fixed by repair (see below).
  
  = Adding new nodes =
  Adding new nodes is called "bootstrapping."
@@ -65, +70 @@

   1. Remove the old node from the ring first, or bring up a replacement node 
with the same IP and Token as the old; otherwise, the old node will stay part 
of the ring in a "down" state, which will degrade your replication factor for 
the affected Range
    * If you don't know the Token of the old node, you can retrieve it from any 
of the other nodes' `system` keyspace, !ColumnFamily `LocationInfo`, key `L`.
    * You can also run `nodeprobe ring` to look up a node's token (unless there 
was some kind of outage, and the others came up but not the down one).
-  1. Removing the old node, then bootstrapping the new one, may be more 
performant than using Anti-Entropy.  Testing needed.
+  1. Removing the old node, then bootstrapping the new one, may be more 
performant than using Anti-Entropy (testing needed), and will eliminate 
incorrect answers given by the replacement node while it does not yet have all 
the data for its Range.
-   * Even brute-force rsyncing of data from the relevant replicas and running 
cleanup on the replacement node may be more performant
+   * To test: even brute-force rsyncing of data from the relevant replicas and 
running cleanup on the replacement node may be more performant.
  
  = Backing up data =
  Cassandra can snapshot data while online using `nodeprobe snapshot`.  You can 
then back up those snapshots using any desired system, although leaving them 
where they are is probably the option that makes the most sense on large 
clusters.
