[Cassandra Wiki] Update of "Operations" by JonathanEllis

Apache Wiki Sat, 05 Mar 2011 16:51:04 -0800

The "Operations" page has been changed by JonathanEllis.
The comment on this change is: update repair section for 0.7.
http://wiki.apache.org/cassandra/Operations?action=diff&rev1=82&rev2=83

--------------------------------------------------

  Cassandra repairs data in two ways:
  
   1. Read Repair: every time a read is performed, Cassandra compares the 
versions at each replica (in the background, if a low consistency was requested 
by the reader to minimize latency), and the newest version is sent to any 
out-of-date replicas.
-  1. Anti-Entropy: when `nodetool repair` is run, Cassandra computes a Merkle 
tree of the data on that node, and compares it with the versions on other 
replicas, to catch any out of sync data that hasn't been read recently.  This 
is intended to be run infrequently (e.g., weekly) since computing the Merkle 
tree is relatively expensive in disk i/o and CPU, since it scans ALL the data 
on the machine (but it is very network efficient).  
+  1. Anti-Entropy: when `nodetool repair` is run, Cassandra computes a Merkle 
tree for each range of data on that node, and compares it with the versions on 
other replicas, to catch any out of sync data that hasn't been read recently.  
This is intended to be run infrequently (e.g., weekly) since computing the 
Merkle tree is relatively expensive in disk i/o and CPU, since it scans ALL the 
data on the machine (but it is very network efficient).  
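
To illustrate why the Merkle tree comparison is cheap on the network but expensive locally, here is a minimal sketch in Python. It is not Cassandra's implementation: the names `bucket_of`, `build_merkle_tree` and `differing_buckets`, the SHA-256 hash, and the fixed number of buckets are all assumptions made for the example. Each replica has to hash every row it holds to build its tree, but two replicas only need to exchange hashes, and they descend only into subtrees whose hashes disagree.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def bucket_of(key: str, num_buckets: int) -> int:
    # Hypothetical stand-in for a partitioner: map a key to a leaf bucket.
    return int.from_bytes(_h(key.encode()), "big") % num_buckets


def build_merkle_tree(rows: dict, num_buckets: int = 8) -> list:
    """Return the tree as a list of levels, leaves first and root last.

    num_buckets must be a power of two. Every row is read and hashed,
    which is the expensive (disk and CPU) part of the process.
    """
    buckets = [[] for _ in range(num_buckets)]
    for key in sorted(rows):
        buckets[bucket_of(key, num_buckets)].append(f"{key}={rows[key]}")
    level = [_h("\n".join(b).encode()) for b in buckets]
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of its two children.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_buckets(tree_a: list, tree_b: list) -> list:
    """Walk both trees from the root, descending only into mismatched subtrees."""
    stack = [(len(tree_a) - 1, 0)]  # (level index, position), starting at the root
    diffs = []
    while stack:
        depth, idx = stack.pop()
        if tree_a[depth][idx] == tree_b[depth][idx]:
            continue  # identical subtree: nothing below it needs to be exchanged
        if depth == 0:
            diffs.append(idx)
        else:
            stack.append((depth - 1, 2 * idx))
            stack.append((depth - 1, 2 * idx + 1))
    return sorted(diffs)
```

Given two replicas whose data differs in a single row, `differing_buckets` returns only the bucket that row falls into, so only that bucket's data would need to be streamed between them.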
  
  Running `nodetool repair`:
- Like all nodetool operations, repair is non-blocking; it sends the command to 
the given node, but does not wait for the repair to actually finish.  You can 
tell that repair is finished when (a) there are no active or pending tasks in 
the CompactionManager, and after that when (b) there are no active or pending 
tasks on o.a.c.concurrent.AE-SERVICE-STAGE, or o.a.c.service.StreamingService.
+ Like all nodetool operations in 0.7, repair is blocking: it will wait for the 
repair to finish and then exit.  This may take a long time on large data sets.
  
- Repair should be run against one machine at a time.  (This limitation will be 
fixed in 0.7.)
+ It is safe to run repair against multiple machines at the same time, but to 
minimize the impact on your application workload it is recommended to wait for 
it to complete on one node before invoking it against the next.
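
Because repair blocks until it finishes in 0.7, repairing one node at a time can be scripted as a plain loop. The following Python sketch is one way to do it, under some assumptions: `nodetool` is on the PATH, the target node is selected with the `-h` option, and a non-zero exit status means the repair failed. The function name `repair_sequentially` and the injectable `run` parameter are hypothetical and exist only for this example.

```python
import subprocess


def repair_sequentially(hosts, run=subprocess.run):
    """Run `nodetool -h <host> repair` on each host in turn.

    Stops at the first failure so that a problem on one node is noticed
    before the next repair starts. Returns the hosts that completed.
    """
    completed = []
    for host in hosts:
        # Blocks until nodetool exits; in 0.7 that means the repair has finished.
        result = run(["nodetool", "-h", host, "repair"])
        # Assumption: a non-zero exit status indicates a failed repair.
        if result.returncode != 0:
            raise RuntimeError(
                f"repair failed on {host} (exit code {result.returncode})"
            )
        completed.append(host)
    return completed
```

Calling `repair_sequentially(["node1", "node2", "node3"])` with your own host names would repair the nodes one after another. On large data sets each call may take a long time.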
  
  === Frequency of nodetool repair ===
