Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "Operations" page has been changed by RobertColi.
The comment on this change is: removed the last reference to flushing before doing a snapshot.. also added a line clarifying that snapshot triggers the flush, and what this means about what data is or isn't in the snapshot....

http://wiki.apache.org/cassandra/Operations?action=diff&rev1=62&rev2=63

--------------------------------------------------

  The reason why you run `nodetool cleanup` on all live nodes is to remove old Hinted Handoff writes stored for the dead node.

  == Backing up data ==
- Cassandra can snapshot data while online using `nodetool snapshot`. You can then back up those snapshots using any desired system, although leaving them where they are is probably the option that makes the most sense on large clusters.
+ Cassandra can snapshot data while online using `nodetool snapshot`. You can then back up those snapshots using any desired system, although leaving them where they are is probably the option that makes the most sense on large clusters. `nodetool snapshot` triggers a node-wide flush, so all data written before the execution of the snapshot command is contained within the snapshot.

  With some combinations of operating system/jvm you may receive an error related to the inability to create a process during the snapshotting, such as this on Linux

@@ -128, +128 @@

  To revert to a snapshot, shut down the node, clear out the old commitlog and sstables, and move the sstables from the snapshot location to the live data directory.

  === Consistent backups ===
- You can get an eventually consistent backup by flushing all nodes and snapshotting; no individual node's backup is guaranteed to be consistent but if you restore from that snapshot then clients will get eventually consistent behavior as usual.
+ You can get an eventually consistent backup by snapshotting all nodes; no individual node's backup is guaranteed to be consistent but if you restore from that snapshot then clients will get eventually consistent behavior as usual. There is no such thing as a consistent view of the data in the strict sense, except in the trivial case of writes with consistency level = ALL.
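
The revert procedure quoted in the diff above (shut down the node, clear out the old commitlog and sstables, move the sstables from the snapshot location to the live data directory) can be sketched roughly as follows. This is only an illustration, not part of the wiki page: the function name `restore_snapshot` and the three directory arguments are hypothetical, the actual paths depend on your configuration, and the sketch assumes the node has already been stopped by other means.

```python
import shutil
from pathlib import Path
from typing import List


def restore_snapshot(snapshot_dir: Path, data_dir: Path, commitlog_dir: Path) -> List[str]:
    """Replace live sstable files with the files from one snapshot.

    Assumes the node is already shut down. All three paths are
    hypothetical arguments supplied by the caller, not Cassandra defaults.
    """
    for d in (snapshot_dir, data_dir, commitlog_dir):
        if not d.is_dir():
            raise FileNotFoundError(f"not a directory: {d}")

    # Clear out the old commitlog files.
    for entry in commitlog_dir.iterdir():
        if entry.is_file():
            entry.unlink()

    # Clear out the old sstable files. Subdirectories (for example a
    # nested snapshots directory) are left untouched.
    for entry in data_dir.iterdir():
        if entry.is_file():
            entry.unlink()

    # Move the snapshot's files into the live data directory.
    restored = []
    for entry in sorted(snapshot_dir.iterdir()):
        if entry.is_file():
            shutil.move(str(entry), str(data_dir / entry.name))
            restored.append(entry.name)
    return restored
```

The function returns the names of the files it moved, so a caller can log or verify what was restored before starting the node again. Because it deletes files, it is worth trying against a scratch copy of the directories first.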
