[ https://issues.apache.org/jira/browse/CASSANDRA-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867995#comment-17867995 ]

Stefan Miklosovic commented on CASSANDRA-18111:
-----------------------------------------------

_If one snapshot directory was accidentally removed, we don't want to propagate 
this error to the remaining snapshot directories._

But on the contrary ... how are you going to fix it? You end up with two 
dirs out of three containing SSTables. So now what? How are you going to 
restore from that snapshot? If you restore it, you just don't get all the data. 
What good is a snapshot which does not restore the data as it was? The results 
would be misleading, you would probably need to run a repair after the restore, 
etc ... 

I am not completely persuaded by your point that we should not delete 
snapshots when manual removal is detected. Because if you do that (or 
rather, don't delete them), then a user does not have any visibility into what 
broken snapshots they have. These snapshots, when removed from memory, would be 
"orphaned": we cache it all in memory, so there would be a disconnect 
between what is in memory and what is on disk, and that is the very reason I 
introduced this. How would you make sure a user is informed about corrupted 
snapshots which are missing one of their data dirs?

Also, if you have some files missing, the next step we want to take here (I was 
discussing this with [~frankgh] / [~yifanc]) is to extend the manifest to 
contain all the files etc., and then we can do a check (via an extension of 
nodetool listsnapshots or similar) that what is in the manifest is indeed 
located on disk in a non-corrupted state (checksums match etc.). So you can 
verify the consistency of a snapshot. How would you detect that you are 
completely missing one of the directories? What if we removed the dir with the 
manifest? Then the other two would not contain any ... That is probably an 
additional argument for including the manifest in every data dir. 
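To illustrate the kind of check I mean, here is a minimal sketch. The manifest format (a plain file-name-to-CRC32 map) and the class/method names are assumptions for illustration only, not the actual Cassandra manifest.json schema or implementation:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.zip.CRC32;

// Hypothetical sketch: verify that every file listed in a snapshot manifest
// is present on disk and that its CRC32 checksum matches the recorded one.
public class SnapshotVerifier {
    static long crc32(Path file) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(Files.readAllBytes(file));
        return crc.getValue();
    }

    /** Returns true only if all manifest entries exist and checksums match. */
    static boolean verify(Path snapshotDir, Map<String, Long> manifest) throws IOException {
        for (Map.Entry<String, Long> e : manifest.entrySet()) {
            Path f = snapshotDir.resolve(e.getKey());
            if (!Files.exists(f) || crc32(f) != e.getValue())
                return false; // missing or corrupted file
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("snapshot1");
        Path sstable = dir.resolve("nb-1-big-Data.db");
        Files.write(sstable, "sstable-bytes".getBytes());
        Map<String, Long> manifest = Map.of("nb-1-big-Data.db", crc32(sstable));
        System.out.println("intact: " + verify(dir, manifest));
        Files.delete(sstable); // simulate a manual removal of a snapshot file
        System.out.println("after deletion: " + verify(dir, manifest));
    }
}
```

With a manifest in every data dir, this check could run per directory, so losing one dir (even the one holding the manifest today) would still be detectable from the others.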

_Wouldn't we just need to watch 3 parent snapshot directories with the current 
implementation versus 10k files if we were to monitor snapshot manifests 
instead?_
{code:java}
data1/ks1/tb1/snapshots/snapshot1/_files_
data2/ks1/tb1/snapshots/snapshot1/_files_
data3/ks1/tb1/snapshots/snapshot1/_files_  {code}
Now, we watch 3 dirs. 
{code:java}
data1/ks1/tb1/snapshots
data2/ks1/tb1/snapshots
data3/ks1/tb1/snapshots{code}
And in each of those dirs, we react to the deletion of a snapshot dir.
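A minimal sketch of that reaction, using the JDK WatchService (this is an illustration of the approach, not the actual Cassandra implementation; the class name and temp-dir layout are made up):

```java
import java.io.IOException;
import java.nio.file.*;

// Sketch: register a per-data-dir "snapshots" root with a WatchService and
// react when a snapshot directory disappears, so the in-memory snapshot
// cache can be invalidated instead of going stale.
public class SnapshotDirWatcher {
    public static void main(String[] args) throws IOException, InterruptedException {
        Path root = Files.createTempDirectory("snapshots-root");
        Path snapshot1 = Files.createDirectory(root.resolve("snapshot1"));

        WatchService watcher = FileSystems.getDefault().newWatchService();
        root.register(watcher, StandardWatchEventKinds.ENTRY_DELETE);

        Files.delete(snapshot1); // simulate a manual removal of the snapshot dir

        WatchKey key = watcher.take(); // blocks until the deletion event arrives
        for (WatchEvent<?> event : key.pollEvents()) {
            // here the real code would evict the snapshot from the cache
            System.out.println("deleted: " + event.context());
        }
        watcher.close();
    }
}
```

One watcher per snapshots root keeps the watched-path count at the number of data dirs, independent of how many snapshots exist.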

So yes, you are actually right: it would be 3 root snapshot dirs, with deletion 
detection in each, vs. 30k snapshot manifests when we have 10k snapshots 
and want to introduce a manifest in each data dir.

> Centralize all snapshot operations to SnapshotManager and cache snapshots
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18111
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18111
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Snapshots
>            Reporter: Paulo Motta
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Every time {{nodetool listsnapshots}} is called, all data directories are 
> scanned to find snapshots, which is inefficient.
> For example, fetching the 
> {{org.apache.cassandra.metrics:type=ColumnFamily,name=SnapshotsSize}} metric 
> can take half a second (CASSANDRA-13338).
> This improvement will also allow snapshots to be efficiently queried via 
> virtual tables (CASSANDRA-18102).
> In order to do this, we should:
> a) load all snapshots from disk during initialization
> b) keep a collection of snapshots on {{SnapshotManager}}
> c) update the snapshots collection anytime a new snapshot is taken or cleared
> d) detect when a snapshot is manually removed from disk.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
