Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.
The "Operations" page has been changed by JonathanEllis. http://wiki.apache.org/cassandra/Operations?action=diff&rev1=2&rev2=3 -------------------------------------------------- Moving is essentially a convenience over decommission + bootstrap. - = Load balancing = + == Load balancing == Also essentially a convenience over decommission + bootstrap, only instead of telling the node where to move on the ring it will choose its location based on the same heuristic as Token selection on bootstrap. = Consistency = - = Repairing missing or inconsistent data = + Cassandra allows clients to specify the desired consistency level on reads and writes. (See [[API]].) If R + W > N, where R, W, and N are respectively the read replica count, the write replica count, and the replication factor, all client reads will see the most recent write. Otherwise, readers '''may''' see older versions, for periods of typically a few ms; this is called "eventual consistency." See [[http://www.allthingsdistributed.com/2008/12/eventually_consistent.html]] and [[http://queue.acm.org/detail.cfm?id=1466448]] for more. + == Repairing missing or inconsistent data == + + Cassandra repairs data in two ways: + + 1. Read Repair: every time a read is performed, Cassandra compares the versions at each replica (in the background, if a low consistency was requested by the reader to minimize latency), and the newest version is sent to any out-of-date replicas. + 1. Anti-Entropy: when `nodeprobe repair` is run, Cassandra performs a major compaction, computes a Merkle Tree of the data on that node, and compares it with the versions on other replicas, to catch any out of sync data that hasn't been read recently. This is intended to be run infrequently (e.g., weekly) since major compaction is relatively expensive. + + == Handling failure == + + If a node goes down and comes back up, the ordinary repair mechanisms will be adequate to deal with any inconsistent data. If a node goes down entirely, you should be aware of the following as well: + 1. Remove the old node from the ring first, or bring up a replacement node with the same IP and Token as the old; otherwise, the old node will stay part of the ring in a "down" state, which will degrade your replication factor for the affected Range + * If you don't know the Token of the old node, you can retrieve it from any of the other nodes' `system` keyspace, ColumnFamily `LocationInfo`, key `L`. + 1. Removing the old node, then bootstrapping the new one, may be more performant than using Anti-Entropy. Testing needed. + * Even brute-force rsyncing of data from the relevant replicas and running cleanup on the replacement node may be more performant + + = Backing up data = + + Cassandra can snapshot data while online using `nodeprobe snapshot`. You can then back up those snapshots using any desired system, although leaving them where they are is probably the option that makes the most sense on large clusters. + + Currently, only flushed data is snapshotted (not data that only exists in the commitlog). Run `nodeprobe flush` first and wait for that to complete, to make sure you get '''all''' data in the snapshot. + + To revert to a snapshot, shut down the node, clear out the old commitlog and sstables, and move the sstables from the snapshot location to the live data directory. + + == Import / export == + + Cassandra can also export data as JSON with `bin/sstable2json`, and import it with `bin/json2sstable`. Eric to document. :) + + = Monitoring = + + Cassandra exposes internal metrics as JMX data. 
+ 
+ == Import / export ==
+ 
+ Cassandra can also export data as JSON with `bin/sstable2json`, and import it with `bin/json2sstable`. Eric to document. :)
+ 
+ = Monitoring =
+ 
+ Cassandra exposes internal metrics as JMX data. This is a common standard in the JVM world; OpenNMS, Nagios, and Munin all offer at least some level of JMX support.
+ 
+ Chris to describe some important metrics to watch.
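+ 
+ As a starting point, the MBeans a node exposes can be browsed interactively with `jconsole`, which ships with the JDK. The host name and port below are placeholders; use the JMX port configured in your Cassandra startup script.
+ 
+ {{{
+ # Attach jconsole to a running node's JMX port (example host and port).
+ jconsole cassandra01:8080
+ }}}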
