Re: Recovery from botched compaction

Jonathan Ellis Tue, 13 Apr 2010 08:55:39 -0700

On Sat, Apr 10, 2010 at 2:24 PM, Anthony Molinaro
<antho...@alumni.caltech.edu> wrote:
>  This is sort of a pre-emptive question as the compaction I'm doing hasn't
> failed yet but I expect it to any time now.  I have a cluster which has been
> storing user profile data for a client.  Recently I've had to go back and
> reload all the data again.  I wasn't watching diskspace, and on one of the
> nodes it went above 50% (which I recall was bad), to somewhere around 70%.
> I expect to get most of it back with a compaction (as most of the data was the same
> so a compaction should remove old copies), and went ahead and started one
> with nodeprobe compact (using 0.5.0 on this cluster).  However, I do see
> that the disk usage is growing (it's at 91% now).


Right, it can't remove any old data until the compacted version is written.

(This is where the 50% recommendation comes from: worst-case, the
compacted version will take up exactly as much space as it did before,
if there were no deletes or overwrites.)
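
As a rough illustration of that worst case, here is a minimal sketch. It is
not Cassandra code; the function names are hypothetical, and it assumes the
only significant disk consumer is the data being compacted.

```python
def worst_case_peak_usage(data_fraction):
    """Peak fraction of the disk in use during a full compaction, worst case.

    Assumes the compacted output is as large as the input (no deletes or
    overwrites) and that the old files are removed only after the new
    file has been completely written.
    """
    return 2 * data_fraction


def compaction_is_safe(data_fraction):
    """True if the worst-case compaction still fits on the disk."""
    return worst_case_peak_usage(data_fraction) <= 1.0


# A node at 50% is right at the limit; a node at 70% can run out of space
# if the compaction reclaims too little.
print(compaction_is_safe(0.5))
print(compaction_is_safe(0.7))
```

In practice a compaction that drops many overwritten rows peaks well below
this bound, but there is no way to know that in advance.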

> So when the disk fills up and this compaction crashes what can I do?
> I assume get a bigger disk, shut down the node, move the data and
> restart will work, but do I have other options?
> Which files can I ignore (ie, can I not move any of the *-tmp-* files)?
> Will my system be in a corrupt state?

It won't corrupt itself, and it will automatically remove the tmp files
when it starts up.

If the disk fills up entirely, the node will become unresponsive even
for reads, which is something we plan to fix
(https://issues.apache.org/jira/browse/CASSANDRA-809).

Otherwise there isn't a whole lot you can do with the "I need to put
more data on my machine than I have room for" scenario.

> This machine is one in a set of 6, and since I didn't choose tokens
> initially, they are very lopsided (ie, some use 20% of their disk, others
> 60-70%).  If I were to start moving tokens around would the machines short
> of space be able to anti-compact without filling up?  or does anti-compaction
> like compaction require 2x disk space?

Anticompaction requires as much space as the data being transferred,
so the worst case, transferring 100% of the data off the node, would
require 2x.

https://issues.apache.org/jira/browse/CASSANDRA-579 will fix this for
the anticompaction case.
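
The anticompaction space requirement described above can be sketched the
same way. Again, this is only an illustration under stated assumptions: the
function name and parameters are hypothetical, and it assumes the temporary
copy is exactly the size of the data being transferred.

```python
def anticompaction_peak_usage(data_fraction, fraction_transferred):
    """Peak fraction of the disk in use while anticompacting.

    data_fraction: share of the disk currently holding data (0.0 to 1.0).
    fraction_transferred: share of that data being moved off the node.
    The peak is the existing data plus a temporary copy of the part
    being transferred.
    """
    if not 0.0 <= fraction_transferred <= 1.0:
        raise ValueError("fraction_transferred must be between 0 and 1")
    return data_fraction + data_fraction * fraction_transferred


# Moving everything off a half-full node needs the whole disk (the 2x case);
# moving half of it needs three quarters of the disk.
print(anticompaction_peak_usage(0.5, 1.0))
print(anticompaction_peak_usage(0.5, 0.5))
```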

-Jonathan
