Node repair questions
Hello,

I have the following questions related to nodetool repair:

1. I know that the Nodetool Repair Interval has to be less than GCGraceSeconds. How do I come up with an exact value for GCGraceSeconds and the 'Nodetool Repair Interval'? What factors would make me want to change GCGraceSeconds from its default of 10 days? Similarly, what factors would make me want to keep the Nodetool Repair Interval just slightly less than GCGraceSeconds (say, a day less)?

2. Does a Nodetool Repair block any reads and writes on the node while the repair is going on? During repair, if I try to do an insert, will the insert wait for the repair to complete first?

3. I read that repair can impact your workload as it causes additional disk and CPU activity. Are there any details on the impact mechanism, and any ballpark figure for how much read/write performance deteriorates?

Thanks.
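For reference, the constraint in question 1 is plain arithmetic: one full repair cycle has to fit inside GCGraceSeconds. Below is a minimal Python sketch of that check. The function names are made up for illustration, and the idea of also counting the time a repair takes to run is an assumption, not something stated in this thread.

```python
from datetime import timedelta

# Default GCGraceSeconds mentioned in the question: 10 days (864000 seconds).
DEFAULT_GC_GRACE = timedelta(days=10)


def repair_margin(repair_interval, gc_grace=DEFAULT_GC_GRACE,
                  repair_duration=timedelta(0)):
    """Slack left between one repair cycle and gc_grace.

    repair_duration is an assumed allowance for how long a repair run
    takes; leave it at zero to compare the bare interval only.
    """
    return gc_grace - (repair_interval + repair_duration)


def schedule_is_safe(repair_interval, gc_grace=DEFAULT_GC_GRACE,
                     repair_duration=timedelta(0)):
    """True when a full repair cycle fits strictly inside gc_grace."""
    return repair_margin(repair_interval, gc_grace, repair_duration) > timedelta(0)


if __name__ == "__main__":
    weekly = timedelta(days=7)
    print(schedule_is_safe(weekly))
    print(repair_margin(weekly, repair_duration=timedelta(hours=12)))
```

A larger margin leaves room for a repair that fails or overruns; a margin of "a day less" leaves very little.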
RE: Node repair questions
The more often you repair, the quicker it will be. The more often your nodes go down, the longer it will be.

Repair streams data that is missing between nodes, so the more data that differs, the longer it will take.

Your workload is impacted because the node has to scan its own data in order to compare it with other nodes, and if there are differences, it has to send data to and receive data from those nodes.

-Original Message-
From: A J [mailto:s5a...@gmail.com]
Sent: Monday, July 11, 2011 2:43 PM
To: user@cassandra.apache.org
Subject: Node repair questions
Re: Node repair questions
(Not answering (1) right now, because it's more involved.)

> 2. Does a Nodetool Repair block any reads and writes on the node, while the repair is going on ? During repair, if I try to do an insert, will the insert wait for repair to complete first ?

It doesn't imply any blocking. It's roughly similar to compaction in its impact on nodes; in addition, when data is streamed (if any), the impact should be similar to node bootstrapping.

> 3. I read that repair can impact your workload as it causes additional disk and cpu activity. But any details of the impact mechanism and any ballpark on how much the read/write performance deteriorates ?

The compaction part will have an impact similar to regular compaction, except it's read-only (no writing of new sstables). It is subject to compaction throttling if you run a version of Cassandra with compaction throttling. Streaming causes disk/networking load and is not yet rate limited like compaction.

In addition, be aware that repair can cause disk space usage to temporarily increase if there are significant differences to be repaired.

--
/ Peter Schuller
Re: Node repair questions
> The more often you repair, the quicker it will be. The more often your nodes go down the longer it will be.

Going to have to disagree a bit here. In most cases the cost of running through the data and calculating the merkle tree should be quite significant, and hopefully the differences should be fairly limited.

The actual data being streamed can be a problem, but unless you have a situation where you are consistently going significantly out of sync and there is no read-repair, I wouldn't recommend more frequent repairs if your aim is to minimize the impact on the cluster. (That is the general case; there will be exceptions.)

Also to OP: in general, expect repairs to be more impactful on your cluster the bigger your data is in comparison to the memory available for caching. Basically, the more cache-reliant you are, the greater the impact of repairs (and compaction) will tend to be.

--
/ Peter Schuller
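To illustrate the point about where the cost goes, here is a toy Merkle-tree comparison in Python. It is only a sketch under stated assumptions, not Cassandra's implementation: the bucketing by MD5 of the key, the choice of SHA-256, the four ranges, and all function names are invented for the example. Note that every row is hashed on both sides regardless of how few rows differ, while only the ranges whose hashes disagree would need streaming.

```python
import hashlib


def bucket_of(key, num_ranges):
    """Assign a key to a range (hypothetical partitioning for this sketch)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_ranges


def leaf_hashes(rows, num_ranges=4):
    """Scan ALL rows and fold each into the hash of its range."""
    if num_ranges < 1 or num_ranges & (num_ranges - 1):
        raise ValueError("num_ranges must be a power of two")
    buckets = [hashlib.sha256() for _ in range(num_ranges)]
    for key in sorted(rows):
        data = key.encode() + b"\x00" + rows[key].encode() + b"\x00"
        buckets[bucket_of(key, num_ranges)].update(data)
    return [b.hexdigest() for b in buckets]


def build_tree(leaves):
    """Return tree levels: levels[0] is the leaves, levels[-1] is [root]."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([
            hashlib.sha256((prev[i] + prev[i + 1]).encode()).hexdigest()
            for i in range(0, len(prev), 2)
        ])
    return levels


def differing_leaves(a_levels, b_levels):
    """Walk down from the root, following only subtrees whose hashes differ."""
    top = len(a_levels) - 1
    suspects = [0] if a_levels[top][0] != b_levels[top][0] else []
    for level in range(top - 1, -1, -1):
        suspects = [
            child
            for parent in suspects
            for child in (2 * parent, 2 * parent + 1)
            if a_levels[level][child] != b_levels[level][child]
        ]
    return suspects


if __name__ == "__main__":
    local = {f"k{i}": f"v{i}" for i in range(20)}
    remote = dict(local)
    remote["k7"] = "stale"  # one out-of-sync row
    diff = differing_leaves(build_tree(leaf_hashes(local)),
                            build_tree(leaf_hashes(remote)))
    print("ranges that would need streaming:", diff)
```

With a single stale row, the full scan still touches all 20 rows on each node, but only one of the four ranges comes out as differing.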