> Running nodetool repair causes Cassandra to execute a major compaction
This is not what I would call factually accurate. Repair does not run a major 
compaction. Major compaction is when all SSTables for a CF are compacted down 
to one SSTable. 
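
To make the distinction concrete, here is a toy Python sketch. It is not Cassandra code: every function name, the bucket count, and the dict-based "SSTables" are made up for illustration. It only assumes what the quoted paragraph says about repair (replicas compare hashes of their data and reconcile the ranges that differ) and what major compaction means (all SSTables merged into one).

```python
import hashlib

def leaf_hashes(rows, num_ranges=4):
    """Hash each row into one of num_ranges buckets (toy stand-in for token ranges)."""
    buckets = [hashlib.sha256() for _ in range(num_ranges)]
    for key in sorted(rows):
        idx = int(hashlib.md5(key.encode()).hexdigest(), 16) % num_ranges
        buckets[idx].update(f"{key}={rows[key]}".encode())
    return [b.hexdigest() for b in buckets]

def merkle_root(hashes):
    """Combine leaf hashes pairwise up to a single root (leaf count must be a power of two)."""
    level = list(hashes)
    while len(level) > 1:
        level = [hashlib.sha256((level[i] + level[i + 1]).encode()).hexdigest()
                 for i in range(0, len(level), 2)]
    return level[0]

def differing_ranges(replica_a, replica_b):
    """Repair-style check: return the bucket indexes whose hashes disagree."""
    pairs = zip(leaf_hashes(replica_a), leaf_hashes(replica_b))
    return [i for i, (x, y) in enumerate(pairs) if x != y]

def major_compaction(sstables):
    """Merge every SSTable into one, keeping the newest (timestamp, value) per key."""
    merged = {}
    for table in sstables:
        for key, (ts, value) in table.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    return [merged]
```

The comparison only reports which ranges disagree between two replicas, while the compaction function rewrites one node's local files and never looks at another replica. They are different operations.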

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12 Jul 2011, at 10:09, cbert...@libero.it wrote:

>> The book is wrong, at least by current versions of Cassandra (I'm
>> basing that on the quote you pasted, I don't know the context).
> 
> To be sure that I didn't misunderstand (English is not my mother tongue) here 
> is what the entire "repair paragraph" says ...
> 
> Basic Maintenance
> There are a few tasks that you’ll need to perform before or after more
> impactful tasks. For example, it makes sense to take a snapshot only after
> you’ve performed a flush. So in this section we look at some of these basic
> maintenance tasks: repair, snapshot, and cleanup.
> 
> Repair
> Running nodetool repair causes Cassandra to execute a major compaction. A
> Merkle tree of the data on the target node is computed, and the Merkle tree
> is compared with those of other replicas. This step makes sure that any data
> that might be out of sync with other nodes isn’t forgotten.
> During a major compaction (see “Compaction” in the Glossary), the server
> initiates a TreeRequest/TreeResponse conversation to exchange Merkle trees
> with neighboring nodes. The Merkle tree is a hash representing the data in
> that column family. If the trees from the different nodes don’t match, they
> have to be reconciled (or “repaired”) in order to determine the latest data
> values they should all be set to. This tree comparison validation is the
> responsibility of the org.apache.cassandra.service.AntiEntropyService class.
> AntiEntropyService implements the Singleton pattern and defines the static
> Differencer class as well, which is used to compare two trees. If it finds
> any differences, it launches a repair for the ranges that don’t agree.
> So although Cassandra takes care of such matters automatically on occasion,
> you can run it yourself as well.
> 
> 
> 
>> 
>> nodetool repair must be scheduled by the operator to run regularly.
>> The name "repair" is a bit unfortunate; it is not meant to imply that
>> it only needs to run when something is "wrong".
>> 
>> -- 
>> / Peter Schuller
>> 
> 
> 
