Re: Deleted snapshot files filling up /var/lib/cassandra
Sorry for the late reply. To immediately solve the problem, you can restart Cassandra; all the open file descriptors to the deleted snapshots should disappear.

As for why it happened: I would first address the disk space issue and see whether the snapshot errors and open file descriptor problem still occur (I am unclear as to whether you got the snapshot exception after the disk filled up or before). If repair still fails to let go of snapshotted files even with free disk space, I would raise a ticket in Jira.

On 17 March 2015 at 12:46, David Wahler dwah...@indeed.com wrote:

> On Mon, Mar 16, 2015 at 6:51 PM, Ben Bromhead b...@instaclustr.com wrote:
>> If you are running a sequential repair (or have previously run a sequential repair that is still running), Cassandra will still have the file descriptors open for files in the snapshot it is using for the repair operation.
>
> Yeah, that aligns with my understanding of how the repair process works. But the cluster has no repair sessions active (I think; when I run nodetool tpstats, the AntiEntropyStage and AntiEntropySessions values are zero on all nodes) and the space still hasn't been freed.

--
Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr http://twitter.com/instaclustr | (650) 284 9692
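[Editor's note] David's tpstats check can be scripted across the cluster before deciding to restart. A minimal sketch; the `anti_entropy_idle` helper name is made up here, and the tpstats column layout (Active in column 2, Pending in column 3) is an assumption to verify against your Cassandra version's output:

```shell
# Read `nodetool tpstats` output on stdin and sum the Active and
# Pending columns of the AntiEntropy* thread pools. Exits 0 (success)
# only when both pools are completely idle, i.e. no repair session
# appears to be running on that node.
anti_entropy_idle() {
    awk '/^AntiEntropy/ { busy += $2 + $3 } END { exit (busy > 0) }'
}

# Typical use, run against every node in the cluster (hostnames are
# placeholders):
#   for host in node1 node2 node3; do
#       if nodetool -h "$host" tpstats | anti_entropy_idle; then
#           echo "$host: no active repair sessions"
#       fi
#   done
```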
Re: Deleted snapshot files filling up /var/lib/cassandra
On Mon, Mar 16, 2015 at 6:51 PM, Ben Bromhead b...@instaclustr.com wrote:
> If you are running a sequential repair (or have previously run a sequential repair that is still running), Cassandra will still have the file descriptors open for files in the snapshot it is using for the repair operation.

Yeah, that aligns with my understanding of how the repair process works. But the cluster has no repair sessions active (I think; when I run nodetool tpstats, the AntiEntropyStage and AntiEntropySessions values are zero on all nodes) and the space still hasn't been freed.
Re: Deleted snapshot files filling up /var/lib/cassandra
If you are running a sequential repair (or have previously run a sequential repair that is still running), Cassandra will still have the file descriptors open for files in the snapshot it is using for the repair operation. From http://www.datastax.com/dev/blog/repair-in-cassandra:

*Cassandra 1.2 introduced a new option to repair to help manage the problems caused by the nodes all repairing with each other at the same time; it is called a snapshot repair, or sequential repair. As of Cassandra 2.1, sequential repair is the default, and the old parallel repair an option. Sequential repair has all of the nodes involved take a snapshot; the snapshot lives until the repair finishes, and is then removed. By taking a snapshot, repair can proceed in a serial fashion, such that only two nodes are ever comparing with each other at a time. This makes the overall repair process slower, but decreases the burden placed on the nodes, and means you have less impact on reads/writes to the system.*

On 16 March 2015 at 16:33, David Wahler dwah...@indeed.com wrote:

> On Mon, Mar 16, 2015 at 6:12 PM, Ben Bromhead b...@instaclustr.com wrote:
>> Cassandra will by default snapshot your data directory on the following events:
>> - TRUNCATE and DROP schema events
>> - when you run nodetool repair
>> - when you run nodetool snapshot
>>
>> Snapshots are just hardlinks to existing SSTables, so the only disk space they take up is for files that have since been compacted away. Disk space for snapshots will be freed when the last link to the files is removed. You can remove all snapshots in a cluster using nodetool clearsnapshot.
>>
>> Snapshots will fail if you are out of disk space (this is counterintuitive to the above, but it is true), if you have not increased the number of available file descriptors, or if there are permissions issues.
>>
>> Out of curiosity, how often are you running repair?
>
> Thanks for the information. We're running repair once per week, as recommended by the Datastax documentation. The repair is staggered to run on one machine at a time with the --partitioner-range option in order to spread out the load.
>
> Running nodetool clearsnapshot doesn't free up any space. I'm guessing that because the snapshot files have been deleted from the filesystem, Cassandra thinks the snapshots are already gone. But because it still has the file descriptors open, the disk space hasn't actually been reclaimed.
Deleted snapshot files filling up /var/lib/cassandra
We have a 16-node, globally-distributed cluster running Cassandra 2.0.12. We're using the Datastax packages on CentOS 6.5. Even though the total amount of data on each server is only a few hundred MB (as measured by both du and the load metric), we're seeing a problem where the disk usage is steadily increasing and eventually filling up the 10GB /var/lib/cassandra partition.

Running lsof on the Cassandra process shows that it has open file handles for thousands of deleted snapshot files:

    $ sudo lsof -p 4753 | grep DEL -c
    13314
    $ sudo lsof -p 4753 | grep DEL | head
    java 4753 cassandra DEL REG 253,6 538873 /var/lib/cassandra/data/keyspace/cf/snapshots/65bc8170-cc20-11e4-a355-0d37e54cc22e/keyspace-cf-jb-3979-Index.db
    java 4753 cassandra DEL REG 253,6 538899 /var/lib/cassandra/data/keyspace/cf/snapshots/8cb41770-cc20-11e4-a355-0d37e54cc22e/keyspace-cf-jb-3983-Index.db
    ...etc...

We're not manually creating these snapshots; they're being generated by periodic runs of nodetool repair -pr. There are some errors in system.log that seem to be related:

    ERROR [RepairJobTask:10] 2015-03-16 02:02:12,485 RepairJob.java (line 143) Error occurred during snapshot phase
    java.lang.RuntimeException: Could not create snapshot at /10.1.1.188
        at org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:81)
        at org.apache.cassandra.net.MessagingService$5$1.run(MessagingService.java:344)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
    ERROR [AntiEntropySessions:4] 2015-03-16 02:02:12,486 RepairSession.java (line 288) [repair #55a8eb50-cbaa-11e4-9af9-27d7677e5965] session completed with the following error
    java.io.IOException: Failed during snapshot creation.
        at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:323)
        at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:144)
        at com.google.common.util.concurrent.Futures$4.run(Futures.java:1160)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
    ERROR [AntiEntropySessions:4] 2015-03-16 02:02:12,488 CassandraDaemon.java (line 199) Exception in thread Thread[AntiEntropySessions:4,5,RMI Runtime]
    java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation.
        at com.google.common.base.Throwables.propagate(Throwables.java:160)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: java.io.IOException: Failed during snapshot creation.
        at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:323)
        at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:144)
        at com.google.common.util.concurrent.Futures$4.run(Futures.java:1160)
        ... 3 more

Has anyone encountered this problem before? The same stack trace shows up in CASSANDRA-8020, but that bug was supposedly introduced in 2.1.0 and fixed in 2.1.1. In any case, we don't want to upgrade to 2.1.x, since the consensus on this list seems to be that it's not yet production-ready.

I'm fairly new to Cassandra, so general troubleshooting tips would also be much appreciated.

Thanks,
-- David
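[Editor's note] A rough way to quantify how much disk the deleted-but-open files are pinning. This is a sketch, assuming your lsof supports `+L1` (list open files whose link count is below 1, i.e. unlinked files) and prints SIZE/OFF as the seventh column; field positions vary between lsof versions, so check your own output before trusting the total:

```shell
# Read `lsof +L1` output on stdin, skip the header row, and sum the
# SIZE/OFF column (assumed to be field 7) to estimate bytes still held
# by deleted-but-open files.
deleted_open_bytes() {
    awk 'NR > 1 { total += $7 } END { printf "%d\n", total }'
}

# Typical use against the Cassandra process from the post above:
#   sudo lsof -p 4753 +L1 | deleted_open_bytes
```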
Re: Deleted snapshot files filling up /var/lib/cassandra
On Mon, Mar 16, 2015 at 5:28 PM, Jan cne...@yahoo.com wrote:
> David; all the packaged installations use the /var/lib/cassandra directory. Could you check your yaml config files and see if you are using this default directory for backups? You may want to change it to a location with more disk space.

We're using the default /var/lib/cassandra as our data directory, mounted as its own LVM volume. I don't see anything in cassandra.yaml about a backup directory. There is an incremental_backups option, which is set to false.

Increasing the available disk space doesn't really seem like a solution. We have only about 450MB of live data on the most heavily-loaded server, and the space taken up by these deleted files is growing by several GB per day. For now we can work around the problem by periodically restarting servers to close the file handles, but that hurts our availability and seems like a hack.
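[Editor's note] If the restart workaround has to be used, it can at least be made orderly. A sketch of a rolling restart; SSH access, a service-managed install, the fixed sleep, and the hostnames are all assumptions to adapt to your environment:

```shell
# Rolling restart: drain each node (flush memtables and stop accepting
# writes) before restarting its service, one node at a time, so the
# rest of the cluster keeps serving requests.
rolling_restart() {
    for host in "$@"; do
        nodetool -h "$host" drain
        ssh "$host" 'sudo service cassandra restart'
        # Crude pause before moving to the next node; in practice you
        # would poll `nodetool status` until the node reports UN again.
        sleep 60
    done
}

# Typical use (placeholder hostnames):
#   rolling_restart node1 node2 node3
```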
Re: Deleted snapshot files filling up /var/lib/cassandra
Cassandra will by default snapshot your data directory on the following events:
- TRUNCATE and DROP schema events
- when you run nodetool repair
- when you run nodetool snapshot

Snapshots are just hardlinks to existing SSTables, so the only disk space they take up is for files that have since been compacted away. Disk space for snapshots will be freed when the last link to the files is removed. You can remove all snapshots in a cluster using nodetool clearsnapshot.

Snapshots will fail if you are out of disk space (this is counterintuitive to the above, but it is true), if you have not increased the number of available file descriptors, or if there are permissions issues.

Out of curiosity, how often are you running repair?

On 16 March 2015 at 15:52, David Wahler dwah...@indeed.com wrote:

> On Mon, Mar 16, 2015 at 5:28 PM, Jan cne...@yahoo.com wrote:
>> David; all the packaged installations use the /var/lib/cassandra directory. Could you check your yaml config files and see if you are using this default directory for backups? You may want to change it to a location with more disk space.
>
> We're using the default /var/lib/cassandra as our data directory, mounted as its own LVM volume. I don't see anything in cassandra.yaml about a backup directory. There is an incremental_backups option, which is set to false.
>
> Increasing the available disk space doesn't really seem like a solution. We have only about 450MB of live data on the most heavily-loaded server, and the space taken up by these deleted files is growing by several GB per day. For now we can work around the problem by periodically restarting servers to close the file handles, but that hurts our availability and seems like a hack.
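[Editor's note] The hardlink behaviour described above can be demonstrated with plain files, no Cassandra involved. A small sketch using a temporary directory (GNU coreutils `stat` assumed; the `-c %h` format prints the link count):

```shell
# Demonstrate hardlink semantics: a second link to the same inode adds
# no data, and removing the original leaves the content reachable
# through the remaining link. This mirrors how a Cassandra snapshot
# keeps a compacted-away SSTable's data on disk.
demo_dir=$(mktemp -d)
printf 'sstable contents' > "$demo_dir/data.db"
ln "$demo_dir/data.db" "$demo_dir/snapshot.db"   # hardlink, like a snapshot

stat -c %h "$demo_dir/data.db"   # link count is now 2
rm "$demo_dir/data.db"           # "compact away" the original
cat "$demo_dir/snapshot.db"      # contents still readable via the link

rm -r "$demo_dir"                # removing the last link frees the space
```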
Re: Deleted snapshot files filling up /var/lib/cassandra
On Mon, Mar 16, 2015 at 6:12 PM, Ben Bromhead b...@instaclustr.com wrote:
> Cassandra will by default snapshot your data directory on the following events:
> - TRUNCATE and DROP schema events
> - when you run nodetool repair
> - when you run nodetool snapshot
>
> Snapshots are just hardlinks to existing SSTables, so the only disk space they take up is for files that have since been compacted away. Disk space for snapshots will be freed when the last link to the files is removed. You can remove all snapshots in a cluster using nodetool clearsnapshot.
>
> Snapshots will fail if you are out of disk space (this is counterintuitive to the above, but it is true), if you have not increased the number of available file descriptors, or if there are permissions issues.
>
> Out of curiosity, how often are you running repair?

Thanks for the information. We're running repair once per week, as recommended by the Datastax documentation. The repair is staggered to run on one machine at a time with the --partitioner-range option in order to spread out the load.

Running nodetool clearsnapshot doesn't free up any space. I'm guessing that because the snapshot files have been deleted from the filesystem, Cassandra thinks the snapshots are already gone. But because it still has the file descriptors open, the disk space hasn't actually been reclaimed.
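[Editor's note] The staggered schedule David describes can be sketched as a simple loop. Hostnames are placeholders, and in a real deployment you would run this from cron once a week rather than by hand:

```shell
# Run a primary-range repair on one node at a time, so only one node
# carries the repair load at any moment. Because -pr repairs only the
# token ranges a node is primary for, running it once on every node
# covers the whole ring exactly once.
staggered_repair() {
    for host in "$@"; do
        nodetool -h "$host" repair -pr
    done
}

# Typical use (placeholder hostnames):
#   staggered_repair node1 node2 node3
```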