Re: nodetool repair caused high disk space usage

Philippe Sat, 20 Aug 2011 23:26:16 -0700

>
> Do you have an indication that at least the disk space is in fact
> consistent with the amount of data being streamed between the nodes? I
> think you had 90 -> ~ 450 gig with RF=3, right? Still sounds like a
> lot assuming repairs are not running concurrently (and compactions are
> able to run after a repair before the next repair of a neighbor
> starts).
>
Hi Peter,
When a repair was running on the 40GB keyspace, I'd usually see range repairs
for up to a couple thousand ranges per CF. If each range corresponds to one
key, then that's a very small amount of data being moved around.
However, at the time I hadn't noticed that multiple repairs were running
concurrently on the same nodes and on their neighbors, so I suppose my
experience isn't much use for tracking down a bug. Still, I suspect it will
help someone along the way, since they may have multiple repairs going on too,
and I now have a much better understanding of what's going on myself.


I've now reloaded all the data in my cluster; the load is 140GB on each node,
and I've been able to run a repair on each CF that comes out almost 100%
consistent. I'm now restarting the daily repair crons to see whether they go
out of whack or not.
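
For anyone who wants to keep the daily repair crons from overlapping, here is
a minimal sketch of one way to stagger them, not what I actually run. It
gives each node its own start hour and prints one crontab line per node. The
host names, the keyspace name, and the 4-hour slot length are all made up for
illustration, and nothing here checks that a repair really finishes inside
its slot, so the slot has to be sized from observed repair times.

```python
def staggered_repair_crontab(hosts, keyspace, first_hour=0, slot_hours=4):
    """Return one crontab line per host, each with its own daily start hour.

    Assumption: a repair finishes within slot_hours. Cron does not enforce
    this, so overlapping repairs are still possible if a run takes longer.
    """
    if slot_hours <= 0:
        raise ValueError("slot_hours must be positive")
    if len(hosts) * slot_hours > 24:
        raise ValueError("slots do not fit in one day; use fewer hosts per day or shorter slots")

    lines = []
    for i, host in enumerate(hosts):
        # Each host starts slot_hours after the previous one.
        hour = (first_hour + i * slot_hours) % 24
        lines.append(f"0 {hour} * * * nodetool -h {host} repair {keyspace}")
    return lines


if __name__ == "__main__":
    # Hypothetical host and keyspace names, for illustration only.
    for line in staggered_repair_crontab(["node1", "node2", "node3"], "my_keyspace", first_hour=1):
        print(line)
```

With three nodes and 4-hour slots starting at 01:00, the nodes would start
their repairs at 01:00, 05:00 and 09:00, which should also leave compactions
some time to catch up before a neighbor starts.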
