If you are running a sequential repair (or started one earlier that is
still running), Cassandra will still hold open file descriptors for the
files in the snapshot it is using for the repair operation.
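
If you want to confirm that this is what is going on, you can look for
deleted files that the Cassandra process still holds open. Below is a minimal
sketch, assuming Linux and its /proc/<pid>/fd layout; the function name and
the proc_root parameter are my own and not anything Cassandra ships:

```python
import os


def deleted_open_files(pid, proc_root="/proc"):
    """Return (path, size_in_bytes_or_None) for deleted files a process still holds open.

    Assumes Linux: the kernel appends " (deleted)" to the link target of a
    file descriptor whose file has been unlinked.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    suffix = " (deleted)"
    found = []
    for name in sorted(os.listdir(fd_dir)):
        fd_path = os.path.join(fd_dir, name)
        try:
            target = os.readlink(fd_path)
        except OSError:
            continue  # descriptor was closed while we were scanning
        if not target.endswith(suffix):
            continue
        try:
            size = os.stat(fd_path).st_size  # follows the link to the open file
        except OSError:
            size = None  # size not available; still report the path
        found.append((target[: -len(suffix)], size))
    return found
```

Call it with the Cassandra PID (you will normally need to be root or the
user Cassandra runs as) and add up the sizes to see how much disk space is
pinned. That space is only returned to the filesystem once the descriptors
are closed.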

From http://www.datastax.com/dev/blog/repair-in-cassandra:

*Cassandra 1.2 introduced a new option to repair to help manage the
problems caused by the nodes all repairing with each other at the same
time, it is called a snapshot repair, or sequential repair. As of Cassandra
2.1, sequential repair is the default, and the old parallel repair is an
option. Sequential repair has all of the nodes involved take a snapshot,
the snapshot lives until the repair finishes, and then is removed. By
taking a snapshot, repair can proceed in a serial fashion, such that only
two nodes are ever comparing with each other at a time. This makes the
overall repair process slower, but decreases the burden placed on the
nodes, and means you have less impact on reads/writes to the system.*

On 16 March 2015 at 16:33, David Wahler <dwah...@indeed.com> wrote:

> On Mon, Mar 16, 2015 at 6:12 PM, Ben Bromhead <b...@instaclustr.com> wrote:
> > Cassandra will by default snapshot your data directory on the following
> > events:
> >
> > TRUNCATE and DROP schema events
> > when you run nodetool repair
> > when you run nodetool snapshot
> >
> > Snapshots are just hardlinks to existing SSTables so the only disk space
> > they take up is for files that have since been compacted away. Disk space
> > for snapshots will be freed when the last link to the files is removed.
> > You can remove all snapshots in a cluster using nodetool clearsnapshot
> >
> > Snapshots will fail if you are out of disk space (this is
> > counterintuitive to the above, but it is true), if you have not increased
> > the number of available file descriptors or if there are permissions
> > issues.
> >
> > Out of curiosity, how often are you running repair?
>
> Thanks for the information. We're running repair once per week, as
> recommended by the Datastax documentation. The repair is staggered to
> run on one machine at a time with the --partitioner-range option in
> order to spread out the load.
>
> Running "nodetool clearsnapshot" doesn't free up any space. I'm
> guessing that because the snapshot files have been deleted from the
> filesystem, Cassandra thinks the snapshots are already gone. But
> because it still has the file descriptors open, the disk space hasn't
> actually been reclaimed.
>
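
For what it's worth, the hardlink behaviour described in the quoted text is
easy to see outside Cassandra. Here is a small sketch using a temporary
directory; the file names are made up for illustration and are not real
SSTable or snapshot names:

```python
import os
import tempfile


def hardlink_demo():
    """Show that data stays on disk until the last hard link is removed."""
    with tempfile.TemporaryDirectory() as d:
        sstable = os.path.join(d, "example-Data.db")    # hypothetical name
        snapshot = os.path.join(d, "snapshot-Data.db")  # hypothetical name
        with open(sstable, "wb") as f:
            f.write(b"x" * 4096)

        # Taking the "snapshot": a second name for the same data, no copy made.
        os.link(sstable, snapshot)
        links_before = os.stat(sstable).st_nlink

        # "Compaction" removes the original name; the data is still reachable.
        os.remove(sstable)
        links_after = os.stat(snapshot).st_nlink
        size_after = os.stat(snapshot).st_size
        return links_before, links_after, size_after
```

The link count drops from 2 to 1 but the 4096 bytes are still there. The
filesystem frees the blocks only when the last link is removed and no
process has the file open any more.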



-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692
