jackcasey-visier opened a new pull request #1148:
URL: https://github.com/apache/cassandra/pull/1148


   *These changes are a current WIP*
   
   # Summary
   
   This functionality allows users of Cassandra to remove snapshots on demand, 
based on a TTL. This addresses the problem of snapshots accumulating on disk. 
For example, an organization I work for aims to keep snapshots for 30 days, but 
we have no easy way to clean them up once those 30 days have passed. 
   
   The goals are similar to those of CASSANDRA-16451 
(https://issues.apache.org/jira/browse/CASSANDRA-16451), but this change would 
be available for Cassandra 3.x. 
   
   # Functionality 
   
   This adds a new command to NodeTool, called `expiresnapshots`, with the 
following options: 
   
   ```
   NAME
           nodetool expiresnapshots - Removes snapshots that are older than a TTL
           in days
   
   SYNOPSIS
           nodetool [(-h <host> | --host <host>)] [(-p <port> | --port <port>)]
                   [(-pw <password> | --password <password>)]
                   [(-pwf <passwordFilePath> | --password-file <passwordFilePath>)]
                   [(-u <username> | --username <username>)] expiresnapshots [--dry-run]
                   (-t <ttl> | --ttl <ttl>)
   
   OPTIONS
           --dry-run
               Run without actually clearing snapshots
   
           -h <host>, --host <host>
               Node hostname or ip address
   
           -p <port>, --port <port>
               Remote jmx agent port number
   
           -pw <password>, --password <password>
               Remote jmx agent password
   
           -pwf <passwordFilePath>, --password-file <passwordFilePath>
               Path to the JMX password file
   
           -t <ttl>, --ttl <ttl>
               TTL (in days) to expire snapshots
   
           -u <username>, --username <username>
               Remote jmx agent username
   ```
   
   The snapshot date is derived by parsing the default snapshot name, which is 
a timestamp (epoch time in milliseconds). For this reason, snapshots whose names 
don't contain a timestamp in this format will not be cleared. 
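
The expiry check described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: the class name `SnapshotExpirySketch` and the methods `parseSnapshotTime` and `expired` are hypothetical, and it assumes the whole snapshot name must be a plain epoch-millisecond number for the snapshot to be eligible.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; names and structure are assumptions, not the PR's code.
public class SnapshotExpirySketch {

    // Interprets a snapshot name as epoch milliseconds.
    // Returns empty when the name is not a plain number (e.g. a custom tag).
    public static Optional<Instant> parseSnapshotTime(String name) {
        try {
            return Optional.of(Instant.ofEpochMilli(Long.parseLong(name)));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Returns the names whose timestamp is strictly older than now minus ttlDays.
    // Names without a parseable timestamp are skipped, so they are never cleared.
    public static List<String> expired(List<String> names, int ttlDays, Instant now) {
        Instant cutoff = now.minus(Duration.ofDays(ttlDays));
        List<String> result = new ArrayList<>();
        for (String name : names) {
            Optional<Instant> taken = parseSnapshotTime(name);
            if (taken.isPresent() && taken.get().isBefore(cutoff)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Snapshot names from the example below, plus one hypothetical custom tag.
        List<String> names = Arrays.asList(
                "1529173922063", "1629173909461", "1629173922063",
                "1599173922063", "1629173916816", "pre-upgrade");
        // A fixed "now" shortly after the newest snapshot, chosen for illustration.
        Instant now = Instant.ofEpochMilli(1629173930000L);
        System.out.println(expired(names, 30, now));
        // prints [1529173922063, 1599173922063]
    }
}
```

Passing `now` in as a parameter, instead of reading the clock inside the method, keeps the check easy to test with fixed timestamps.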
   
   # Example Use 
   
   This Cassandra environment has a number of snapshots, a few recent and a 
few outdated:
   
   ```
   root@cassandra001:/cassandra# nodetool listsnapshots
   Snapshot Details:
   Snapshot name Keyspace name  Column family name True size  Size on disk
   1529173922063 users_keyspace users              362.03 KiB 362.89 KiB
   1629173909461 users_keyspace users              362.03 KiB 362.89 KiB
   1629173922063 users_keyspace users              362.03 KiB 362.89 KiB
   1599173922063 users_keyspace users              362.03 KiB 362.89 KiB
   1629173916816 users_keyspace users              362.03 KiB 362.89 KiB
   
   Total TrueDiskSpaceUsed: 1.77 MiB
   ```
   
   To validate that the removal will run as expected, we can use the `--dry-run` option: 
   
   ```
   root@cassandra001:/cassandra# nodetool expiresnapshots --ttl 30 --dry-run
   Starting simulated cleanup of snapshots older than 30 days
   Clearing (dry run): 1529173922063
   Clearing (dry run): 1599173922063
   Cleared (dry run): 2 snapshots
   ```
   
   Now that we are confident the correct snapshots will be removed, we can omit 
the `--dry-run` flag: 
   
   ```
   root@cassandra001:/cassandra# nodetool expiresnapshots --ttl 30
   Starting cleanup of snapshots older than 30 days
   Clearing: 1529173922063
   Clearing: 1599173922063
   Cleared: 2 snapshots
   ```
   
   To confirm that the cleanup succeeded, we list the snapshots that remain: 
   
   ```
   root@cassandra001:/cassandra# nodetool listsnapshots
   Snapshot Details:
   Snapshot name Keyspace name  Column family name True size  Size on disk
   1629173909461 users_keyspace users              362.03 KiB 362.89 KiB
   1629173922063 users_keyspace users              362.03 KiB 362.89 KiB
   1629173916816 users_keyspace users              362.03 KiB 362.89 KiB
   
   Total TrueDiskSpaceUsed: 1.06 MiB
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


