Matt Byrd created CASSANDRA-21173:
-------------------------------------

             Summary: Snapshots from tables without table-id embedded in their 
folder name are not loaded by SnapshotLoader
                 Key: CASSANDRA-21173
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21173
             Project: Apache Cassandra
          Issue Type: Bug
          Components: Local/Snapshots, Local/Startup and Shutdown
            Reporter: Matt Byrd
            Assignee: Matt Byrd


Tables created prior to 2.1 do not have a table-id embedded in their table 
folder name.
This is handled correctly in Directories.java (see constructor) unfortunately 
in SnapshotLoader, we use a regex which attempts to extract the table-id and 
hence skips over any tables created prior to 2.1.
The end result is that these tables are not visible in list snapshot and more 
importantly cannot be cleared via nodetool clearsnapshot. This was noticed upon 
major upgrade to 5.0.

I've observed this on 5.0, from reading the code it appears likely improved in 
5.1, in that it now requires a restart in addition to trigger.

Some related tickets:

Introduction of table-id and backwards compatible handling of old folders 
originally here:
https://issues.apache.org/jira/browse/CASSANDRA-5202

Machinery to list snapshots which doesn’t handle old format was added here:
https://issues.apache.org/jira/browse/CASSANDRA-16843
https://github.com/apache/cassandra/commit/31aa17a2a3b18bdda723123cad811f075287807d
There was some discussion at the time of not handling pre 2.1 tables here:
https://issues.apache.org/jira/browse/CASSANDRA-16843?focusedCommentId=17440088&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17440088

Then nodetool clearsnapshot stopped working here with:
https://issues.apache.org/jira/browse/CASSANDRA-17757

Things improve a bit in 5.1 with 
https://issues.apache.org/jira/browse/CASSANDRA-18111
Now we no longer try and load the snapshots via SnapshotLoader in entirety 
before deciding if we can clear them, but instead make use of SnapshotManager. 
Whilst snapshots taken while the jvm is running are now visible and clearable, 
from reading upon restart we lose that information and cannot view/clear 
snapshots created before the restart.

One solution to handle these pre 2.1 tables, is to include the table-id in the 
manifest.json, then we'll be able to read this information if not available 
from folder name upon restart.

Another possibility which doesn't fix as many problems, is just to expose via 
jmx/nodetool
something to allow operators to bypass the snapshot loading mechanism and 
directly clear the old pre-2.1 snapshots.

A more involved and risky change would be to somehow think about how we migrate 
all this existing data in different folder structures to new consistent folder 
structure, but this seems quite involved and would likely deserve it's own JIRA 
at least.

I have a patch locally against trunk for the first approach, just storing the 
tableId in the manifest.json, which does this and will run it against CI.

I'll have a further think about if there are any other approaches, if anyone 
has any ideas let me know.
Another thing to consider is where we should apply this change.
Probably at a minimum 5.0, since that's where one can no longer nodetool 
clearsnapshot on certain tables and the effect is a bit worse there than in 5.1.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to