Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.
The "Operations" page has been changed by JonathanEllis. http://wiki.apache.org/cassandra/Operations?action=diff&rev1=7&rev2=8 -------------------------------------------------- Replication strategy may not be changed without wiping your data and starting over. - = Adding new nodes = + == Network topology == + + Besides datacenters, you can also tell Cassandra which nodes are in the same rack within a datacenter. Cassandra will use this to route both reads and data movement for Range changes to the nearest replicas. This is configured by a user-pluggable !EndpointSnitch class in the configuration file. + + !EndpointSnitch is related to, but distinct from, replication strategy itself: !RackAwareStrategy needs a properly configured Snitch to places replicas correctly, but even absent a Strategy that cares about datacenters, the rest of Cassandra will still be location-sensitive. + + There is an example of a custom Snitch implementation in https://svn.apache.org/repos/asf/incubator/cassandra/trunk/contrib/property_snitch/. + + = Range changes = + + == Bootstrap == Adding new nodes is called "bootstrapping." To bootstrap a node, turn !AutoBootstrap on in the configuration file, and start it. - If you explicitly specify an !InitialToken in the configuration, the new node will bootstrap to that position on the ring. Otherwise, it will pick a Token that will give it half the keys from the node with the most disk space used, that does not already have another node boostrapping into its Range. + If you explicitly specify an !InitialToken in the configuration, the new node will bootstrap to that position on the ring. Otherwise, it will pick a Token that will give it half the keys from the node with the most disk space used, that does not already have another node bootstrapping into its Range. Important things to note: @@ -39, +49 @@ 1. Automatically picking a Token only allows doubling your cluster size at once; for more than that, let the first group finish before starting another. 1. As a safety measure, Cassandra does not automatically remove data from nodes that "lose" part of their Token Range to a newly added node. Run "nodeprobe cleanup" on the source node(s) when you are satisfied the new node is up and working. If you do not do this the old data will still be counted against the load on that node and future bootstrap attempts at choosing a location will be thrown off. + Cassandra is smart enough to transfer data from the nearest source node(s), if your !EndpointSnitch is configured correctly. So, the new node doesn't need to be in the same datacenter as the primary replica for the Range it is bootstrapping into, as long as another replica is in the datacenter with the new one. + - = Removing nodes entirely = + == Removing nodes entirely == You can take a node out of the cluster with `nodeprobe decommission.` The node must be live at decommission time (until CASSANDRA-564 is done). Again, no data is removed automatically, so if you want to put the node back into service and you don't need the data on it anymore, it should be removed manually. - = Moving nodes = + == Moving nodes == - Moving is essentially a convenience over decommission + bootstrap. + `nodeprobe move`: move the target node to to a given Token. Moving is essentially a convenience over decommission + bootstrap. 
  == Load balancing ==
- Also essentially a convenience over decommission + bootstrap, only instead of telling the node where to move on the ring it will choose its location based on the same heuristic as Token selection on bootstrap.
+ `nodeprobe loadbalance`: also essentially a convenience over decommission + bootstrap, only instead of telling the target node where to move on the ring it will choose its location based on the same heuristic as Token selection on bootstrap.
  
  = Consistency =
  Cassandra allows clients to specify the desired consistency level on reads and writes. (See [[API]].) If R + W > N, where R, W, and N are respectively the read replica count, the write replica count, and the replication factor, all client reads will see the most recent write. Otherwise, readers '''may''' see older versions, for periods of typically a few ms; this is called "eventual consistency." See http://www.allthingsdistributed.com/2008/12/eventually_consistent.html and http://queue.acm.org/detail.cfm?id=1466448 for more.
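+ 
+ For example, with a replication factor of N = 3, quorum reads and writes (R = 2, W = 2) satisfy 2 + 2 > 3: the replicas consulted on any read must overlap the replicas that acknowledged the latest write on at least one node, so the read is guaranteed to see that write. With R = 1 and W = 1, 1 + 1 <= 3, so a read may land on a replica the latest write has not reached yet, and only eventual consistency is guaranteed.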
