[ 
https://issues.apache.org/jira/browse/CASSANDRA-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809493#comment-13809493
 ] 

Robert Coli commented on CASSANDRA-6245:
----------------------------------------

I'm not sure if it's explained in more detail somewhere else, but I think the 
"nodetool refresh" documentation currently reads:

{quote}Loads newly placed SSTables on to the system without restart.{quote}

While I agree that an operator might be able to infer that copying in an 
SSTable whose name a future flush will use *might* be dangerous, just as 
copying over an existing SSTable is, I assert that the former case is 
significantly less obvious. Copying over an existing SSTable obviously and 
explicitly nukes it, because that's what overwriting a file means. Copying in 
a file under a not-yet-used name exposes one to implicit overwriting only 
because Cassandra doesn't check for file existence while flushing, and instead 
blindly overwrites. 

I feel that this blind overwrite is not inferable (or typical?), and that the 
proposed change will therefore reduce the chance of people assuming (as I did) 
that it is safe to copy files of whatever name into the datadir for "refresh" 
as long as they do not overwrite existing files.

Is there some reason why flush cannot just inflate its sequence by one if it 
does a simple file existence test and notices a non-live file at the filename 
about to be flushed? That would be an even simpler solution to my silent 
overwriting concern than a refresh staging directory, and would handle more 
cases. I would be surprised if we have an actual requirement for flush to 
blindly overwrite.
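
To make the suggestion concrete, here is a minimal sketch of the existence check in Python. It is not Cassandra code: the file name pattern and the function names (sstable_name, next_safe_generation, flush) are hypothetical and only illustrate the idea of inflating the sequence until an unused name is found.

```python
import os
import tempfile


def sstable_name(generation):
    # Hypothetical, simplified name; real SSTable file names differ.
    return f"ks-cf-{generation}-Data.db"


def next_safe_generation(data_dir, candidate):
    """Return the first generation >= candidate whose file name is unused."""
    while os.path.exists(os.path.join(data_dir, sstable_name(candidate))):
        candidate += 1
    return candidate


def flush(data_dir, generation, payload):
    """Write a new file without clobbering one that is already on disk."""
    generation = next_safe_generation(data_dir, generation)
    path = os.path.join(data_dir, sstable_name(generation))
    # Mode "x" raises FileExistsError if the file appeared after the check,
    # so a late copy is reported instead of being silently overwritten.
    with open(path, "x") as f:
        f.write(payload)
    return generation


def demo():
    with tempfile.TemporaryDirectory() as data_dir:
        # The operator copies in a non-live file numbered 2 ...
        with open(os.path.join(data_dir, sstable_name(2)), "w") as f:
            f.write("copied-in")
        # ... and the node's next flush would also have used 2.
        used = flush(data_dir, 2, "flushed")
        with open(os.path.join(data_dir, sstable_name(2))) as f:
            survived = f.read()
        return used, survived
```

In this sketch demo() shows the flush landing on generation 3 while the copied-in file numbered 2 keeps its contents. Opening with exclusive-create mode is a design choice of the sketch, not something proposed in the ticket: it covers the window between the existence test and the write.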

> "nodetool refresh" design is unsafe
> -----------------------------------
>
>                 Key: CASSANDRA-6245
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6245
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Robert Coli
>            Priority: Minor
>
> CASSANDRA-2991 added a "nodetool refresh" feature by which Cassandra is able 
> to discover non-live SSTables in the datadir and make them live.
> It does this by:
> 1) looking for SSTable files in the data dir
> 2) renaming SSTables it finds into the current SSTable id sequence
> This implementation is exposed to a race with a chance of silent data loss.
> 1) Node's SSTable id sequence is on sstable #2, the next table to flush will 
> get "2" as its numeric part
> 2) Copy SSTable with "2" as its numeric part into data dir
> 3) nodetool flush
> 4) notice that your "2" SSTable has been silently overwritten by a 
> just-flushed "2" SSTable
> 5) nodetool refresh would still succeed, but would now be a no-op
> A simple solution would be to create a subdirectory of the datadir called 
> "refresh/" to serve as the location to refresh from.
> Alternately/additionally, there is probably not really a compelling reason 
> for Cassandra to completely ignore existing files at write time. A check for 
> existing files at a given index and inflating the index to avoid overwriting 
> them seems trivial and inexpensive. I will gladly file a JIRA for this 
> change in isolation if there is interest.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
