Robert Coli created CASSANDRA-6245:
--------------------------------------

             Summary: "nodetool refresh" design is unsafe
                 Key: CASSANDRA-6245
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6245
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Robert Coli
            Priority: Minor


CASSANDRA-2991 added a "nodetool refresh" feature by which Cassandra is able to 
discover non-live SSTables in the datadir and make them live.

It does this by:

1) looking for SSTable files in the data dir
2) renaming SSTables it finds into the current SSTable id sequence

This implementation is exposed to a race that can cause silent data loss:

1) Node's SSTable id sequence is on SSTable #2; the next table to flush will 
get "2" as its numeric part
2) Copy SSTable with "2" as its numeric part into data dir
3) nodetool flush
4) notice that your "2" SSTable has been silently overwritten by a just-flushed 
"2" SSTable
5) nodetool refresh would still succeed, but would now be a no-op
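
The sequence above can be reproduced with a toy model. This is only a sketch, not Cassandra code: the file name pattern "ks-cf-<generation>-Data.db" and the flush() helper are hypothetical stand-ins, and flush() is assumed to write the next generation number without checking whether a file with that name already exists.

```python
import os
import tempfile


def data_file(datadir, generation):
    # Hypothetical naming scheme; real SSTable names carry more components.
    return os.path.join(datadir, f"ks-cf-{generation}-Data.db")


def flush(datadir, next_generation, payload):
    """Toy flush: writes the next generation without checking for an existing file."""
    with open(data_file(datadir, next_generation), "w") as f:
        f.write(payload)
    return next_generation + 1


def demo_race():
    with tempfile.TemporaryDirectory() as datadir:
        next_generation = 2  # step 1: the sequence is on "2"
        with open(data_file(datadir, 2), "w") as f:  # step 2: operator copies a "2" file in
            f.write("restored data")
        next_generation = flush(datadir, next_generation, "memtable data")  # step 3
        with open(data_file(datadir, 2)) as f:  # step 4: inspect what survived
            return f.read()
```

In this model demo_race() returns the flushed contents, not the copied-in contents, which is the silent overwrite described in step 4.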

A simple solution would be to create a subdirectory of the datadir called 
"refresh/" to serve as the location to refresh from.

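A minimal sketch of the "refresh/" idea follows, under the same assumptions as before: the name pattern and the refresh_from_staging() helper are hypothetical, and the caller is assumed to hand in the node's next free generation number. Files are only ever read from the staging subdirectory, so a concurrent flush into the data dir cannot overwrite them before they are renamed.

```python
import os
import re

# Hypothetical name pattern: <prefix>-<generation>-<Component>.db
NAME = re.compile(r"^(?P<prefix>.+)-(?P<gen>\d+)-(?P<component>[A-Za-z]+\.db)$")


def refresh_from_staging(datadir, next_generation):
    """Move SSTable files from datadir/refresh/ into datadir under fresh generations."""
    staging = os.path.join(datadir, "refresh")
    by_gen = {}
    for name in sorted(os.listdir(staging)):
        m = NAME.match(name)
        if m:
            # Group components that belong to the same staged SSTable.
            by_gen.setdefault(int(m.group("gen")), []).append(m)
    for gen in sorted(by_gen):
        for m in by_gen[gen]:
            target = f"{m.group('prefix')}-{next_generation}-{m.group('component')}"
            os.rename(os.path.join(staging, m.group(0)),
                      os.path.join(datadir, target))
        next_generation += 1
    return next_generation
```

For this to be safe in practice, the generation numbers would have to come from the same counter the flush path uses; the sketch takes that as given.
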
Alternatively or additionally, there is probably no compelling reason for 
Cassandra to completely ignore existing files at write time. A check for 
existing files at a given index, inflating the index to avoid overwriting 
them, seems trivial and inexpensive. I will gladly file a JIRA for this 
change in isolation if there is interest.
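
The index check could look roughly like the sketch below. It is illustrative only: next_free_generation() and the file name pattern are hypothetical, and it assumes the generation is the number immediately before the component name.

```python
import os
import re

# Hypothetical pattern: the generation precedes the component name.
GEN = re.compile(r"-(\d+)-[A-Za-z]+\.db$")


def next_free_generation(datadir, candidate):
    """Return the first generation >= candidate with no existing files in datadir."""
    used = set()
    for name in os.listdir(datadir):
        m = GEN.search(name)
        if m:
            used.add(int(m.group(1)))
    # Inflate the index past any generation that already has files on disk.
    while candidate in used:
        candidate += 1
    return candidate
```

A check followed by a separate write is still a small race window on its own; creating the file exclusively (for example Python's open(path, "x")) would close it.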



--
This message was sent by Atlassian JIRA
(v6.1#6144)
