Node repair questions

2011-07-11 Thread A J
Hello,
Have the following questions related to nodetool repair:
1. I know that the nodetool repair interval has to be less than
GCGraceSeconds. How do I come up with exact values for GCGraceSeconds
and the nodetool repair interval? What factors would make me want to
change the 10-day default of GCGraceSeconds? Similarly, what factors
would make me want to keep the nodetool repair interval just slightly
less than GCGraceSeconds (say, a day less)?

2. Does a Nodetool Repair block any reads and writes on the node,
while the repair is going on? During repair, if I try to do an
insert, will the insert wait for repair to complete first?

3. I read that repair can impact your workload as it causes additional
disk and CPU activity. But any details of the impact mechanism and any
ballpark on how much the read/write performance deteriorates?

Thanks.


RE: Node repair questions

2011-07-11 Thread Jeremiah Jordan
The more often you repair, the quicker it will be.  The more often your
nodes go down the longer it will be.

Repair streams data that is missing between nodes.  So the more data
that is different the longer it will take.  Your workload is impacted
because the node has to scan the data it has to be able to compare with
other nodes, and if there are differences, it has to send/receive data
from other nodes.


Re: Node repair questions

2011-07-11 Thread Peter Schuller
(not answering (1) right now, because it's more involved)

> 2. Does a Nodetool Repair block any reads and writes on the node,
> while the repair is going on? During repair, if I try to do an
> insert, will the insert wait for repair to complete first?

It doesn't imply any blocking. It's roughly similar to compaction in
its impact on nodes; in addition, when data is streamed (if any), the
impact should be similar to node bootstrapping.

> 3. I read that repair can impact your workload as it causes additional
> disk and CPU activity. But any details of the impact mechanism and any
> ballpark on how much the read/write performance deteriorates?

The compaction part will have an impact similar to regular compaction
except it's read-only (no writing of new sstables). It is subject to
compaction throttling if you run a version of Cassandra with
compaction throttling.

Streaming causes disk/networking load and is not yet rate limited like
compaction.

In addition, be aware that repair can cause disk space usage to
temporarily increase if there are significant differences to be
repaired.

-- 
/ Peter Schuller


Re: Node repair questions

2011-07-11 Thread Peter Schuller
> The more often you repair, the quicker it will be.  The more often your
> nodes go down the longer it will be.

Going to have to disagree a bit here. In most cases the cost of
running through the data and calculating the Merkle tree should be
quite significant, while the differences themselves should hopefully
be fairly limited.

The actual data being streamed can be a problem, but unless you have a
situation where you are consistently going significantly out of sync
and there is no read repair, I wouldn't recommend more frequent
repairs if your aim is to minimize the impact on the cluster. (That is
the general case; there will be exceptions.)
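
To make the cost argument concrete, here is a minimal Python sketch of a
Merkle-tree comparison between two replicas. It is an illustration only,
not Cassandra's implementation: the hash-modulo bucketing, the fixed leaf
count, and the names leaf_index, build_leaves, build_tree and
differing_leaves are all assumptions made for this example. What it shows
is that every row gets hashed no matter how little differs, while only the
leaf ranges whose hashes disagree would need to be streamed.

```python
import hashlib

def leaf_index(key: str, num_leaves: int) -> int:
    """Assign a key to a leaf range (simplified stand-in for token ranges)."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_leaves

def build_leaves(rows: dict, num_leaves: int) -> list:
    """Hash every row into its leaf. This full scan is the fixed cost of a repair."""
    hashers = [hashlib.sha256() for _ in range(num_leaves)]
    for key in sorted(rows):
        h = hashers[leaf_index(key, num_leaves)]
        h.update(key.encode() + b"\x00" + rows[key].encode() + b"\x00")
    return [h.digest() for h in hashers]

def build_tree(leaves: list) -> list:
    """Return tree levels, leaves first, root last. Leaf count must be a power of two."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([hashlib.sha256(prev[i] + prev[i + 1]).digest()
                       for i in range(0, len(prev), 2)])
    return levels

def differing_leaves(tree_a: list, tree_b: list) -> list:
    """Walk down from the root, descending only into subtrees whose hashes differ."""
    stack = [(len(tree_a) - 1, 0)]
    out = []
    while stack:
        level, idx = stack.pop()
        if tree_a[level][idx] == tree_b[level][idx]:
            continue  # identical subtree: nothing to stream here
        if level == 0:
            out.append(idx)
        else:
            stack.append((level - 1, 2 * idx))
            stack.append((level - 1, 2 * idx + 1))
    return sorted(out)

# Two replicas with 1000 rows each that disagree on a single row.
node_a = {f"key{i}": f"value{i}" for i in range(1000)}
node_b = dict(node_a)
node_b["key42"] = "stale"

tree_a = build_tree(build_leaves(node_a, 16))
tree_b = build_tree(build_leaves(node_b, 16))
diff = differing_leaves(tree_a, tree_b)
```

In this sketch both replicas hash all 1000 rows, yet diff contains only
the one leaf range holding the stale row. Repairing more often would
shrink the streamed part but not the scan.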

Also to OP: In general, expect repairs to be more impactful on your
cluster the bigger your data is in comparison to available memory used
for caching. Basically the more cache-reliant you are, the greater the
impact of repairs (and compaction) will tend to be.

-- 
/ Peter Schuller